7 Application Scaling
Introduction
In this exercise we will learn how to scale our application. OpenShift can scale your application based on the workload.
First we will deploy an application and scale it up and down manually. We will also learn about idling, a concept unique to OpenShift.
Then we will learn to set up a horizontal pod autoscaler that automatically scales your application up as the workload grows and back down when the workload subsides.
Deploy an Application to Scale
Create a new Project
Replace YourName with your name before creating the project.
oc new-project workshop-scale-up-down-YourName
Deploy an application
We will deploy a simple PHP application whose index.php
performs some computation. You can deploy it using the web console or the CLI. The steps below use the CLI:
oc new-app --image-stream=php --code=https://github.com/RedHatWorkshops/hpademo
--> Found image 5303cd8 (3 months old) in image stream "openshift/php" under tag "7.3" for "php"
Apache 2.4 with PHP 7.3
-----------------------
PHP 7.3 available as container is a base platform for building and running various PHP 7.3 applications and frameworks. PHP is an HTML-embedded scripting language. PHP attempts to make it easy for developers to write dynamically generated web pages. PHP also offers built-in database integration for several commercial and non-commercial database management systems, so writing a database-enabled webpage with PHP is fairly simple. The most common use of PHP coding is probably as a replacement for CGI scripts.
Tags: builder, php, php73, rh-php73
* The source repository appears to match: php
* A source build using source code from https://github.com/RedHatWorkshops/hpademo will be created
* The resulting image will be pushed to image stream tag "hpademo:latest"
* Use 'oc start-build' to trigger a new build
--> Creating resources ...
imagestream.image.openshift.io "hpademo" created
buildconfig.build.openshift.io "hpademo" created
deployment.apps "hpademo" created
service "hpademo" created
--> Success
Build scheduled, use 'oc logs -f bc/hpademo' to track its progress.
Application is not exposed. You can expose services to the outside world by executing one or more of the commands below:
'oc expose svc/hpademo'
Run 'oc status' to view your app.
Running into Resource Limits
The build will fail with an Out of Memory (OOM) error because of our default memory limits.
Make sure you are logged in to the OpenShift console, the instructions can be found in Chapter 2.
Run the following commands to fix the problem; note that your results may differ.

## Select the project
$ oc project <Your Project>

## Now we need to find the builds
$ oc get builds
NAME        TYPE     FROM   STATUS    STARTED   DURATION
hpademo-1   Source   Git    Pending

## If the build is still running, cancel it first:
$ oc cancel-build hpademo-1

## Afterwards, we need to patch the BuildConfig, whose name is the build name without the "-1"
$ oc patch bc/hpademo --patch '{"spec":{"resources":{"limits":{"memory":"1Gi","cpu":"1000m"}}}}'

## Now, start a new build
$ oc start-build hpademo

## You can check its status again by running oc get builds
$ oc get builds
NAME        TYPE     FROM          STATUS                       STARTED          DURATION
hpademo-1   Source   Git           Cancelled (CancelledBuild)   51 seconds ago   11s
hpademo-2   Source   Git@122d440   Running                      19 seconds ago
This will start an S2I build for the PHP application. You can run oc get builds
and also watch the logs by running oc logs -f hpademo-1-build
. You know the drill by now!
Create a route by exposing the service
$ oc expose svc hpademo
route.route.openshift.io/hpademo exposed
Run oc get route
to get the URL for your application. Eventually your application gets built and deployed.
$ oc get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
hpademo hpademo-$yourProject.apps.cluster.chp4.io hpademo 8080-tcp None
If you curl the URL you will see that the index.php
page does some computation and displays OK!
curl hpademo-$yourProject.apps.cluster.chp4.io
OK!
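On a live cluster you can pull just the host name out of the route with a jsonpath query, which is handy for scripting the curl call. The sketch below is a local stand-in: it cuts the same HOST/PORT column from sample oc get route output (the sample host name is made up):

```shell
# On a cluster, the direct way (assumes the hpademo route created above):
#   URL=$(oc get route hpademo -o jsonpath='{.spec.host}')
# Locally, the same column can be cut from saved `oc get route` output:
sample='hpademo   hpademo-myproject.apps.cluster.chp4.io             hpademo   8080-tcp   None'
echo "$sample" | awk '{print $2}'   # -> hpademo-myproject.apps.cluster.chp4.io
```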
Scaling
Understanding the Replicas Setting in the Deployment
Check the deployment for this application by running oc get deployment/hpademo -o yaml
and focus on spec
...
spec:
...
replicas: 1
...
...
You’ll notice that the replicas:
is set to 1
. This tells OpenShift that when this application deploys, make sure that there is 1 instance running.
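If you only want the replica count rather than the whole YAML, you can query just that field. On a cluster this would be oc get deployment hpademo -o jsonpath='{.spec.replicas}'; the sketch below does the same extraction locally against a saved copy of the manifest:

```shell
# On a cluster (assumes the hpademo deployment created above):
#   oc get deployment hpademo -o jsonpath='{.spec.replicas}'
# Local stand-in: pull the field out of a saved deployment manifest.
cat > /tmp/hpademo-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpademo
spec:
  replicas: 1
EOF
awk '$1 == "replicas:" {print $2}' /tmp/hpademo-deploy.yaml   # -> 1
```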
Manual Scaling
To scale your application we will set the replica count on the deployment
to 3.
Open your browser to the Topology page and note you only have one instance running. It shows when you hover over the deployment.
Now scale your application using the oc scale
command (remembering to specify the resource type, deployment
)
$ oc scale --replicas=3 deployment/hpademo
deployment.apps/hpademo scaled
Look at the web console and you will see that there are 3 instances running now.
Note: You can also scale up and down from the web console by navigating to the overview page and clicking twice on the up arrow next to the pod count circle to change the replica count.
On the command line, see how many pods you are running now:
$ oc get pods
NAME               READY   STATUS      RESTARTS   AGE
hpademo-1-2cz8m    1/1     Running     0          8m24s
hpademo-1-7tcz6    1/1     Running     0          8m24s
hpademo-1-build    0/1     Completed   0          29m
hpademo-1-deploy   0/1     Completed   0          27m
hpademo-1-zl2ht    1/1     Running     0          27m
You now have 3 instances of hpademo-1
running (each with a different pod-id).
Idling
A related concept is application idling. OpenShift allows you to conserve resources by sleeping the application when not in use. When you try to use the application it will spin up the container automagically.
Idling the application
Run the following command to find the available endpoints
$ oc get endpoints
NAME      ENDPOINTS                                                         AGE
hpademo   10.128.2.37:8443,10.129.2.29:8443,10.130.2.28:8443 + 3 more...   37m
Note that the name of the endpoints object is hpademo
and it lists an IP address for each of the three pods.
Run the oc idle endpoints/hpademo
command to idle the application
$ oc idle endpoints/hpademo
The service "scaling-user1/hpademo" has been marked as idled
The service will unidle Deployment "scaling-user1/hpademo" to 3 replicas once it receives traffic
Deployment "scaling-user1/hpademo" has been idled
Go back to the web console. You will notice that the pods show up as idled.
At this point the application is idled: the pods are not running and the application consumes no resources. This doesn’t mean the application is deleted; its current state has simply been saved.
Reactivate your application
Now click on the application route URL or access the application via curl.
Note that it takes a little while for the application to respond. This is because the pods are spinning up again. You can watch this happen in the web console.
In a little while the output comes up and your application is back with 3 pods (based on your replica count).
So, as soon as a user accesses the application, it comes back up!
Scaling Down
Scaling down is the same procedure as scaling up. Use the oc scale
command on the hpademo
deployment.
$ oc scale --replicas=1 deployment/hpademo
deployment.apps/hpademo scaled
Alternately, you can go to the project overview page and click the down arrow twice to remove the 2 extra running pods.
Auto Scaling
Horizontal Pod Autoscaler (HPA) allows you to automatically scale your application based on the workload. It updates the replica count
by watching the workload.
Set Resource Limits on your application
HPA requires your pods to have requests and limits set so that it knows when to scale the application based on the consumption of resources.
Let us update the deployment to set the resources by running oc set resources
$ oc set resources deployment hpademo --requests=cpu=200m --limits=cpu=500m
deployment.apps/hpademo resource requirements updated
We have set the CPU request (initial allocation) to 200
millicores and the limit (maximum allocation) to 500
millicores. So when we ask HPA to scale based on percentage workload, it measures usage against these numbers.
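After oc set resources, the container spec in the deployment should contain a resources stanza along these lines (a sketch; the container name is assumed to be hpademo, and field order and other fields will vary):

```yaml
# Fragment of `oc get deployment hpademo -o yaml` after setting resources
spec:
  template:
    spec:
      containers:
      - name: hpademo
        resources:
          requests:
            cpu: 200m   # initial allocation
          limits:
            cpu: 500m   # maximum allocation
```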
Set up HPA
Now we will create HPA by running oc autoscale
command
$ oc autoscale deployment hpademo --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/hpademo autoscaled
Here we did two things:

- cpu-percent=50
indicates that when the average CPU usage of the pods (measured against the CPU request) reaches 50%, HPA should spin up additional pods
- --min=1 --max=10
sets lower and upper limits for the number of pods. We want to run at least 1 pod, and it can scale up to at most 10 pods. Why a maximum? We cannot allow our application to consume all the resources on the cluster, right?
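Roughly, the HPA controller computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization); the real controller also applies a tolerance band and stabilization windows. A minimal sketch of that formula (the hpa_desired helper is our own, not part of oc):

```shell
# Simplified HPA sizing: desired = ceil(replicas * current% / target%).
# (The real controller also applies a tolerance and stabilization windows.)
hpa_desired() {
  awk -v r="$1" -v cur="$2" -v tgt="$3" 'BEGIN {
    d = r * cur / tgt
    di = int(d)
    if (d > di) di = di + 1   # ceiling
    print di
  }'
}

hpa_desired 1 124 50   # one pod at 124% of a 50% target -> 3
hpa_desired 3 10  50   # load gone: 3 pods at 10% -> 1
```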
Generate Load
Now it is time to generate load and test
Open another terminal and log in to the cluster. Make sure you are in the same project, then run the load generator pod from that terminal.
$ oc run --generator=run-pod/v1 -it --rm load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
~ $
This spins up a busybox
image from where we will generate the load.
Get the URL for your application with oc get route hpademo --template='{{.spec.host}}'
, and use it in the following command at the load generator prompt
while true; do wget -q -O- URL; done
You will start seeing a bunch of OK!
s as the load generator continuously hits the application.
Watch Scaling
In the other terminal, run oc get hpa hpademo -w
to watch the load go up. After a while, once the application has scaled up to a few pods, stop the load by pressing ^C
. You can then watch the application scale back down.
You can also see the number of pods go up in the web console.
NOTE: Scaling up takes a few minutes, and so does scaling down. So be patient.
$ oc get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpademo Deployment/hpademo <unknown>/50% 1 10 1 72s
hpademo Deployment/hpademo 0%/50% 1 10 1 75s
hpademo Deployment/hpademo 13%/50% 1 10 1 90s
hpademo Deployment/hpademo 36%/50% 1 10 1 2m15s
hpademo Deployment/hpademo 69%/50% 1 10 1 2m30s
hpademo Deployment/hpademo 68%/50% 1 10 2 2m45s
hpademo Deployment/hpademo 95%/50% 1 10 2 3m
hpademo Deployment/hpademo 94%/50% 1 10 2 3m15s
hpademo Deployment/hpademo 117%/50% 1 10 2 3m31s
hpademo Deployment/hpademo 124%/50% 1 10 3 3m46s
hpademo Deployment/hpademo 137%/50% 1 10 3 4m1s
hpademo Deployment/hpademo 145%/50% 1 10 3 4m16s
hpademo Deployment/hpademo 150%/50% 1 10 3 4m31s
hpademo Deployment/hpademo 143%/50% 1 10 3 4m46s
hpademo Deployment/hpademo 144%/50% 1 10 3 5m1s
hpademo Deployment/hpademo 143%/50% 1 10 3 5m16s
hpademo Deployment/hpademo 143%/50% 1 10 3 5m31s
hpademo Deployment/hpademo 149%/50% 1 10 3 5m46s
hpademo Deployment/hpademo 132%/50% 1 10 3 6m1s
hpademo Deployment/hpademo 120%/50% 1 10 3 6m16s
hpademo Deployment/hpademo 107%/50% 1 10 3 6m31s
hpademo Deployment/hpademo 87%/50% 1 10 3 6m47s
hpademo Deployment/hpademo 82%/50% 1 10 3 7m2s
hpademo Deployment/hpademo 53%/50% 1 10 3 7m17s
hpademo Deployment/hpademo 51%/50% 1 10 3 7m32s
hpademo Deployment/hpademo 29%/50% 1 10 3 7m47s
hpademo Deployment/hpademo 27%/50% 1 10 3 8m2s
hpademo Deployment/hpademo 10%/50% 1 10 3 8m17s
hpademo Deployment/hpademo 2%/50% 1 10 3 8m32s
hpademo Deployment/hpademo 1%/50% 1 10 3 8m47s
hpademo Deployment/hpademo 0%/50% 1 10 3 9m2s
hpademo Deployment/hpademo 0%/50% 1 10 3 12m
hpademo Deployment/hpademo 0%/50% 1 10 2 12m
hpademo Deployment/hpademo 0%/50% 1 10 2 13m
hpademo Deployment/hpademo 0%/50% 1 10 1 13m
Clean up
Once you are done with your testing, run
oc delete all --all
to clean up all the artifacts, and oc delete project workshop-scale-up-down-YourName
to delete the project.
Summary
In this lab we learned how to manually scale an application up and down, and how to idle it. We also learned how to use the horizontal pod autoscaler to automatically scale the application based on the workload.