7 Application Scaling

Introduction

In this exercise we will learn how to scale our application. OpenShift has the capability to scale your application based on the workload.

First we will deploy an application and scale it up and down manually. We will also learn about idling, which is a concept unique to OpenShift.

Then we will learn how to set up a horizontal pod autoscaler that automatically scales your application up when the workload increases and scales it back down when the workload decreases.

Deploy an Application to Scale

Create a new Project

oc new-project workshop-scale-up-down-YourName

Replace YourName with your name before creating the project.

Deploy an application

We will deploy a simple PHP application whose index.php does some computation. You can deploy it using the web console or the CLI. The steps below use the CLI, running oc new-app as shown:

oc new-app --image-stream=php --code=https://github.com/RedHatWorkshops/hpademo

--> Found image 5303cd8 (3 months old) in image stream "openshift/php" under tag "7.3" for "php"

    Apache 2.4 with PHP 7.3
    -----------------------
    PHP 7.3 available as container is a base platform for building and running various PHP 7.3 applications and frameworks. PHP is an HTML-embedded scripting language. PHP attempts to make it easy for developers to write dynamically generated web pages. PHP also offers built-in database integration for several commercial and non-commercial database management systems, so writing a database-enabled webpage with PHP is fairly simple. The most common use of PHP coding is probably as a replacement for CGI scripts.

    Tags: builder, php, php73, rh-php73

    * The source repository appears to match: php
    * A source build using source code from https://github.com/RedHatWorkshops/hpademo will be created
      * The resulting image will be pushed to image stream tag "hpademo:latest"
      * Use 'oc start-build' to trigger a new build

--> Creating resources ...
    imagestream.image.openshift.io "hpademo" created
    buildconfig.build.openshift.io "hpademo" created
    deployment.apps "hpademo" created
    service "hpademo" created
--> Success
    Build scheduled, use 'oc logs -f bc/hpademo' to track its progress.
    Application is not exposed. You can expose services to the outside world by executing one or more of the commands below:
     'oc expose svc/hpademo'
    Run 'oc status' to view your app.

Running into Resource Limits

With the default memory limits in place, the build will fail with an Out of Memory (OOM) error.
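
If you are curious where those defaults come from, they are typically enforced by a LimitRange object in the project. You can inspect it (if your cluster administrator has defined one; names and values are cluster-specific):

## List any LimitRange objects in the current project
$ oc get limitrange

## Show the default requests and limits they impose on containers
$ oc describe limitrange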

Make sure you are logged in to OpenShift; the instructions can be found in Chapter 2.

Run the following commands to fix the problem. Please note that your output may differ.

## Select the project
$ oc project <Your Project>

## Now we need to find the builds
$ oc get builds
NAME        TYPE     FROM   STATUS    STARTED   DURATION
hpademo-1   Source   Git    Pending

## If the build is still running, cancel it first:
$ oc cancel-build hpademo-1

## Afterwards, we need to patch the BuildConfig; its name is the build name without the "-1" suffix
$ oc patch bc/hpademo --patch '{"spec":{"resources":{"limits":{"memory":"1Gi","cpu":"1000m"}}}}'

## Now, start a new build
$ oc start-build hpademo

## You can check its status again by running oc get builds
$ oc get builds
NAME        TYPE     FROM          STATUS                       STARTED          DURATION
hpademo-1   Source   Git           Cancelled (CancelledBuild)   51 seconds ago   11s
hpademo-2   Source   Git@122d440   Running                      19 seconds ago
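
If you want to double-check that the patch actually landed on the BuildConfig, a quick sanity check is:

## Print the memory limit now set on the BuildConfig (should show 1Gi)
$ oc get bc/hpademo -o jsonpath='{.spec.resources.limits.memory}{"\n"}'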

This will start an S2I build for the PHP application. You can run oc get builds and also watch the logs by running oc logs -f bc/hpademo. You know the drill by now!

Create a route by exposing the service

$ oc expose svc hpademo
route.route.openshift.io/hpademo exposed

Once your application has been built and deployed, run oc get route to get its URL.

$ oc get route
NAME      HOST/PORT                                   PATH   SERVICES   PORT       TERMINATION   WILDCARD
hpademo   hpademo-$yourProject.apps.cluster.chp4.io          hpademo    8080-tcp                 None

If you curl the URL you will see that the index.php page does some computation and displays OK!

curl hpademo-$yourProject.apps.cluster.chp4.io
OK!
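
Rather than copying the hostname out of the route table by hand, you can also capture it in a shell variable and curl that; this is just a convenience sketch:

## Store the route hostname in a variable and call the application
$ APP_HOST=$(oc get route hpademo -o jsonpath='{.spec.host}')
$ curl http://$APP_HOST
OK!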

Scaling

Understanding the Replicas Setting in the Deployment

Check the deployment for this application by running oc get deployment/hpademo -o yaml and focus on the spec section:

...
spec:
  ...
  replicas: 1
  ...
...

You’ll notice that replicas: is set to 1. This tells OpenShift to make sure that one instance of the application is running whenever it is deployed.
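
If you only want the replica count without scrolling through the full YAML, the same information can be pulled out directly:

## Print only the desired replica count from the Deployment spec
$ oc get deployment hpademo -o jsonpath='{.spec.replicas}{"\n"}'
1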

Manual Scaling

To scale the application manually, we will change the deployment's replica count to 3.

Open the Topology page in the web console and note that you only have one instance running; the pod count shows when you hover over the deployment.

image

Now scale your application using the oc scale command (remembering to specify the deployment):

$ oc scale --replicas=3 deployment/hpademo
deployment.apps/hpademo scaled

If you look at the web console, you will see that there are now 3 instances running.

image

Note: You can also scale up and down from the web console by navigating to the overview page and clicking the up arrow next to the pod count circle twice to change the replica count.

image

On the command line, see how many pods you are running now:

$ oc get pods
NAME               READY   STATUS      RESTARTS   AGE
hpademo-1-2cz8m    1/1     Running     0          8m24s
hpademo-1-7tcz6    1/1     Running     0          8m24s
hpademo-1-build    0/1     Completed   0          29m
hpademo-1-deploy   0/1     Completed   0          27m
hpademo-1-zl2ht    1/1     Running     0          27m

You now have 3 instances of hpademo running (each with a different pod ID).
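
If you want to list only the application pods and leave out the completed build and deploy pods, you can filter by label. Assuming oc new-app labeled the pods with deployment=hpademo (verify with oc get pods --show-labels):

## Show only the running application pods
$ oc get pods -l deployment=hpademo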

Idling

A related concept is application idling. OpenShift allows you to conserve resources by putting an application to sleep when it is not in use. When you try to access the application again, it spins the containers back up automagically.

Idling the application

Run the following command to find the available endpoints

$ oc get endpoints
NAME      ENDPOINTS                                                        AGE
hpademo   10.128.2.37:8443,10.129.2.29:8443,10.130.2.28:8443 + 3 more...   37m

Note that the endpoints object is named hpademo and lists three IP addresses, one for each of the three pods.

Run the oc idle endpoints/hpademo command to idle the application

$ oc idle endpoints/hpademo
The service "scaling-user1/hpademo" has been marked as idled
The service will unidle Deployment "scaling-user1/hpademo" to 3 replicas once it receives traffic
Deployment "scaling-user1/hpademo" has been idled

Go back to the web console. You will notice that the pods show up as idled.

image

At this point the application is idled: the pods are not running and no resources are being used by the application. This doesn’t mean that the application is deleted; its current state is simply saved.
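
Behind the scenes, the idled state is recorded as annotations on the service so that OpenShift knows what to scale back up when traffic arrives. Assuming the current annotation names (they may vary between OpenShift versions), you can see them with:

## Show the idling annotations placed on the service
$ oc get svc hpademo -o yaml | grep idling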

Reactivate your application

Now click on the application route URL or access the application via curl.

Note that it takes a little while for the application to respond. This is because the pods are spinning up again; you can watch this happen in the web console.

In a little while the output appears and your application is back up with 3 pods (based on your replica count).

So, as soon as a user accesses the application, it comes back up!

Scaling Down

Scaling down follows the same procedure as scaling up. Use the oc scale command on the hpademo deployment.

$ oc scale --replicas=1 deployment/hpademo
deployment.apps/hpademo scaled

Alternatively, you can go to the project overview page and click the down arrow twice to remove 2 running pods.

Auto Scaling

The Horizontal Pod Autoscaler (HPA) allows you to automatically scale your application based on the workload. It adjusts the replica count by watching the workload.

Set Resource Limits on your application

HPA requires your pods to have requests and limits set so that it knows when to scale the application based on the consumption of resources.

Let us update the deployment to set the resources by running oc set resources

$ oc set resources deployment hpademo --requests=cpu=200m --limits=cpu=500m
deployment.apps/hpademo resource requirements updated

We have set the CPU request (initial allocation) to 200 millicores and the limit (maximum allocation) to 500 millicores. When we ask HPA to scale based on a CPU percentage, it measures utilization relative to the requested value.
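
You can read the values back from the deployment to verify what was applied (the jsonpath below assumes the application container is the first container in the pod spec):

## Print the CPU request and limit on the application container
$ oc get deployment hpademo -o jsonpath='requests={.spec.template.spec.containers[0].resources.requests.cpu} limits={.spec.template.spec.containers[0].resources.limits.cpu}{"\n"}'
requests=200m limits=500m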

Set up HPA

Now we will create the HPA by running the oc autoscale command:

$ oc autoscale deployment hpademo --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/hpademo autoscaled

Here we did two things:

  • --cpu-percent=50 indicates that when the average CPU usage (measured against the request we set above) reaches 50%, HPA should spin up additional pods

  • --min=1 --max=10 sets the lower and upper limits for the number of pods. We want to run at least 1 pod, and we allow the application to scale up to at most 10 pods. Why the maximum? We cannot let our application consume all the resources on the cluster, right? A declarative equivalent of the command is sketched after this list.
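
For reference, the oc autoscale command is roughly equivalent to creating the following HPA object yourself. This is a minimal sketch using the autoscaling/v1 API (your cluster may also serve newer autoscaling API versions); only apply it if you skipped the oc autoscale command above:

## Declarative alternative to 'oc autoscale deployment hpademo --cpu-percent=50 --min=1 --max=10'
$ oc apply -f - <<EOF
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpademo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpademo
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF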

Generate Load

Now it is time to generate load and test the autoscaler.

Open another terminal, log in to the cluster, and make sure you are in the same project. Then run the load generator pod from that terminal.

$ oc run --generator=run-pod/v1 -it --rm load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
~ $

This spins up a busybox pod from which we will generate the load.

Get the hostname for your application by running oc get route hpademo --template='{{.spec.host}}', and use it in place of URL in the following command at the load generator prompt:

while true; do wget -q -O- URL; done

You will start seeing a stream of OK! responses as the load generator continuously hits the application.
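
If you prefer not to type the loop interactively, a rough alternative is to bake it into the pod command instead (replace URL with your route hostname first; delete the pod to stop the load):

## Run the load loop non-interactively in a throwaway pod
$ oc run load-generator --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://URL; done"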

Watch Scaling

In the other terminal, run oc get hpa hpademo -w to watch the load go up. After a little while, once the application has scaled up to a few pods, stop the load by pressing Ctrl+C in the load generator terminal. You can then watch the application scale back down.

You can also see the number of pods go up in the web console.

image

NOTE: Scaling up takes a few minutes and so does scaling down, so be patient.

$ oc get hpa -w
NAME      REFERENCE            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpademo   Deployment/hpademo   <unknown>/50%   1         10        1          72s
hpademo   Deployment/hpademo   0%/50%          1         10        1          75s
hpademo   Deployment/hpademo   13%/50%         1         10        1          90s
hpademo   Deployment/hpademo   36%/50%         1         10        1          2m15s
hpademo   Deployment/hpademo   69%/50%         1         10        1          2m30s
hpademo   Deployment/hpademo   68%/50%         1         10        2          2m45s
hpademo   Deployment/hpademo   95%/50%         1         10        2          3m
hpademo   Deployment/hpademo   94%/50%         1         10        2          3m15s
hpademo   Deployment/hpademo   117%/50%        1         10        2          3m31s
hpademo   Deployment/hpademo   124%/50%        1         10        3          3m46s
hpademo   Deployment/hpademo   137%/50%        1         10        3          4m1s
hpademo   Deployment/hpademo   145%/50%        1         10        3          4m16s
hpademo   Deployment/hpademo   150%/50%        1         10        3          4m31s
hpademo   Deployment/hpademo   143%/50%        1         10        3          4m46s
hpademo   Deployment/hpademo   144%/50%        1         10        3          5m1s
hpademo   Deployment/hpademo   143%/50%        1         10        3          5m16s
hpademo   Deployment/hpademo   143%/50%        1         10        3          5m31s
hpademo   Deployment/hpademo   149%/50%        1         10        3          5m46s
hpademo   Deployment/hpademo   132%/50%        1         10        3          6m1s
hpademo   Deployment/hpademo   120%/50%        1         10        3          6m16s
hpademo   Deployment/hpademo   107%/50%        1         10        3          6m31s
hpademo   Deployment/hpademo   87%/50%         1         10        3          6m47s
hpademo   Deployment/hpademo   82%/50%         1         10        3          7m2s
hpademo   Deployment/hpademo   53%/50%         1         10        3          7m17s
hpademo   Deployment/hpademo   51%/50%         1         10        3          7m32s
hpademo   Deployment/hpademo   29%/50%         1         10        3          7m47s
hpademo   Deployment/hpademo   27%/50%         1         10        3          8m2s
hpademo   Deployment/hpademo   10%/50%         1         10        3          8m17s
hpademo   Deployment/hpademo   2%/50%          1         10        3          8m32s
hpademo   Deployment/hpademo   1%/50%          1         10        3          8m47s
hpademo   Deployment/hpademo   0%/50%          1         10        3          9m2s
hpademo   Deployment/hpademo   0%/50%          1         10        3          12m
hpademo   Deployment/hpademo   0%/50%          1         10        2          12m
hpademo   Deployment/hpademo   0%/50%          1         10        2          13m
hpademo   Deployment/hpademo   0%/50%          1         10        1          13m
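
If you want more detail than the watch output provides, oc describe shows the HPA's current metrics, conditions and the scaling events it has recorded:

## Show HPA conditions and recent scaling events
$ oc describe hpa hpademo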

Clean up

Once you are done with your testing, run oc delete all --all to clean up all the artifacts, and run oc delete project workshop-scale-up-down-YourName to delete the project.

Summary

In this lab we learned how to manually scale an application up and down, and how to idle it. We also learned how to use the Horizontal Pod Autoscaler to scale the application automatically based on the workload.