You will demonstrate how to scale your pal-tracker
application
running on Tanzu Application Service.
After completing the lab, you will be able to:
Review the Scaling slides.
You must have completed (or fast-forwarded to) the
Availability lab.
You must have your pal-tracker
application associated with the
scaling-availability-solution
codebase deployed and running on
Tanzu Application Service.
In a terminal window,
make sure you start in the ~/workspace/pal-tracker
directory.
In this lab you will exercise your pal-tracker
application under load,
monitor it,
and tune it.
You can monitor the pal-tracker
application through the following:
Command line via the following cf
commands:
cf app pal-tracker
cf events pal-tracker
Apps Manager user interface.
If you choose to monitor via the command line you will need a minimum of four terminal windows open.
If you choose to monitor with Apps Manager you will need only one.
pal-tracker
Tanzu Application Service supports scaling the number of application instances in 3 ways:
Manually, with the cf scale -i <number of instances> command.
Manually, by setting the instances parameter in the application manifest and pushing it.
Automatically, with the autoscaler.
You have already used the second option to achieve better availability characteristics of your application.
Scaling up the number of pal-tracker instances requires two characteristics of the application:
A pal-tracker application instance cannot persist state within its container.
The pal-tracker application supports a concurrent scale out model, meaning that it can run multiple application instances concurrently.
Now you will use the autoscaler to accommodate increased workload
for the pal-tracker
application.
Pretend that you have been running your pal-tracker
application in
production for a while,
and you have good insights into the runtime characteristics:
You know from experience you can run 10 requests-per-second (rps)
comfortably on a given pal-tracker
application instance.
You have stress tested your application, and you know the maximum work rate per instance when it may become unstable is 20 rps.
The current pal-tracker
application has a relatively consistent
workload of 10 rps throughout the day.
You forecast in the next release you will have occasional daily
peaks where the pal-tracker
application may have to handle between
40 and 50 rps.
How many instances will you need to run at peak periods,
without factoring in availability?
(target rps) / (rps/instance) = number of instances
Or
(50 rps) / (10 rps/instance) = 5 instances
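The sizing formula above can be sketched as a small shell function. This is only an illustration: the function name instances_needed is hypothetical and not part of the lab scripts, and it rounds up so that a fractional result still gets a whole instance.

```shell
#!/usr/bin/env bash
# instances_needed TARGET_RPS RPS_PER_INSTANCE
# Hypothetical helper (not part of the lab): prints the smallest whole number
# of instances that can serve TARGET_RPS when each instance comfortably
# handles RPS_PER_INSTANCE.
instances_needed() {
  local target_rps=$1 rps_per_instance=$2
  # Integer ceiling division: a partial instance counts as a whole one.
  echo $(( (target_rps + rps_per_instance - 1) / rps_per_instance ))
}

instances_needed 50 10   # upper end of the forecast peak, prints 5
instances_needed 40 10   # lower end of the forecast peak, prints 4
```

Rounding up matters for targets that are not a multiple of the per-instance rate: 45 rps at 10 rps per instance also needs 5 instances.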
Factoring in the need for availability, you have learned in production that under normal conditions you have sufficient redundancy with 2 extra instances. The steady 10 rps workload needs only 1 instance, so you never want to run fewer than 3 instances.
You now know based on your stress testing that planned and/or unplanned outage of individual instances will be sufficiently tolerated with a total of 5 instances at 50 rps. You can see this by considering that even if 2 of the 5 instances become unavailable, the overall throughput would be:
(maximum rps/instance) * (number of instances) = max throughput
or
(20 rps) * (3 instances) = 60 rps
This is still greater than the maximum required throughput of 50 rps.
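The same outage check can be written as a short shell sketch. The function names degraded_throughput and tolerates_outage are hypothetical, and the numbers are the ones assumed in this scenario (20 rps maximum per instance, 5 instances, 50 rps required).

```shell
#!/usr/bin/env bash
# degraded_throughput MAX_RPS_PER_INSTANCE TOTAL_INSTANCES FAILED_INSTANCES
# Hypothetical helper: prints the maximum throughput of the instances that
# are still available after FAILED_INSTANCES have gone away.
degraded_throughput() {
  local max_rps=$1 total=$2 failed=$3
  echo $(( max_rps * (total - failed) ))
}

# tolerates_outage REQUIRED_RPS MAX_RPS_PER_INSTANCE TOTAL_INSTANCES FAILED_INSTANCES
# Succeeds (exit status 0) when the surviving instances can still carry
# REQUIRED_RPS without exceeding their maximum work rate.
tolerates_outage() {
  local required_rps=$1
  shift
  [ "$(degraded_throughput "$@")" -ge "$required_rps" ]
}

degraded_throughput 20 5 2   # prints 60
if tolerates_outage 50 20 5 2; then echo "ok"; else echo "overloaded"; fi   # prints ok
```

Losing a third instance would leave only 40 rps of capacity, so the check fails at that point. Note that this runs the surviving instances at their stress-tested limit rather than their comfortable rate.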
Tanzu Application Service supports automatic horizontal scaling based on either pre-defined or custom rules.
For request/response (blocking) web applications, HTTP throughput is a good choice of scaling metric, assuming you have a solid grasp of the performance, stability, and scaling characteristics of your app.
If you are running the labs on your own development machine, you will need to install the Tanzu Application Service Autoscaler CLI plugin.
You have been supplied with a set up script that will configure an autoscaling rule for you with the following characteristics:
You can review the set up script to see how the autoscaler CLI works:
git show scaling-availability-start:scripts/setup-auto-scaling.sh
Run the setup to enable the autoscaler:
./scripts/setup-auto-scaling.sh
Run the following autoscaler watch command:
watch cf autoscaling-events pal-tracker
From a separate terminal window, run a load test:
Note that NUM_USERS is now set to 100 users,
and REQUESTS_PER_SECOND is set to 50 rps.
It is critical to run with these new settings instead of those from the previous load test runs. Otherwise the autoscaler will not behave in line with the expectations of this lab.
docker run -i -t --rm -e DURATION=300 -e NUM_USERS=100 -e REQUESTS_PER_SECOND=50 -e URL=http://pal-tracker-${UNIQUE_IDENTIFIER}.${DOMAIN} pivotaleducation/loadtest
Observe both the pal-tracker watch and autoscaler watch terminal windows.
How long does it take before the autoscaler scales up to the 5 instance limit?
Let your load test complete,
or terminate it with Ctrl+C.
Observe both the pal-tracker watch and autoscaler watch terminal windows.
How long does it take before the autoscaler scales down to the minimum 3 instances?
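To answer the two timing questions more precisely than by eye, you could log a timestamp each time the instance count changes. The following is a minimal sketch, not part of the lab scripts: the function names are hypothetical, and it assumes the cf app output contains a line of the form "instances: 3/5" whose first number is the count of running instances. Check the output of your cf CLI version before relying on it.

```shell
#!/usr/bin/env bash
# running_instances: reads `cf app` output on stdin and prints the first
# number from the first line that starts with "instances:" (for example,
# "instances: 3/5" prints 3). The output format is an assumption.
running_instances() {
  awk '$1 == "instances:" { split($2, parts, "/"); print parts[1]; exit }'
}

# log_instance_changes APP [INTERVAL_SECONDS]
# Hypothetical helper: polls `cf app APP` and prints a timestamped line
# whenever the reported instance count differs from the previous poll.
log_instance_changes() {
  local app=$1 interval=${2:-5} previous="" current
  while true; do
    current=$(cf app "$app" | running_instances)
    if [ "$current" != "$previous" ]; then
      echo "$(date +%T) running instances: $current"
      previous=$current
    fi
    sleep "$interval"
  done
}

# Usage (runs until interrupted with Ctrl+C):
#   log_instance_changes pal-tracker 5
```

The difference between the timestamps of consecutive lines gives the time between scaling steps, accurate to within the polling interval.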
Turn off the autoscaler:
./scripts/turn-off-autoscaling.sh
Terminate the cf app
and cf events
watch windows.
You saw that the autoscaling behavior is not instantaneous. It is designed conservatively around the concept of a Governor, an algorithm that limits the rate of change within the autoscaler to prevent potential outages if it is not tuned or used correctly.
The scenario assumed that you have significant knowledge about the performance, stability, scaling, and capacity usage of your application. The scenario in this lab is actually quite naive. It is up to you to gain familiarity with your production application's characteristics using your observability tools, and to do empirical testing to verify the behaviors that you anticipate encountering in production.
If you do not have this background and knowledge, do not use the autoscaler!
See the Reddit outage postmortem announcement related to autoscaling.
It is a sobering read that should give you pause when choosing whether to run an autoscaler, and when operating one.
Another good read on the subject is Release It! Second Edition: Chapter 4 (Stability Antipatterns → Force Multiplier) and Chapter 5 (Stability Patterns → Governor).
A specific limitation of the Tanzu Application Service autoscaler is that the CPU rules are not reliable. See this advisory for more information.
Now that you have completed the lab, you should be able to: