How AWS Auto Scaling Works

(It’s a Thermostat)

When heating prices rose by 28% we all felt the squeeze. We had to re-evaluate how much we were willing (or able) to spend to keep our homes at a set, comfortable temperature. It’s a balancing act between being warm and comfortable, and not spending a huge chunk of your wages to stay that way.

Autoscaling with AWS is similar: AWS can maintain a target level of application performance by automatically scaling the resources in your environment according to current or historical traffic and usage data.

It’s basically the thermostat of your AWS account.

That’s why we’re going to teach you all about it today! We’ve split this post up into the following sections to make it easier to read:

  • What is Auto Scaling in AWS?
  • What is AWS Application Auto Scaling?
  • Why AWS Auto Scaling is important
  • How to use AWS Auto Scaling
  • Common concerns with AWS Auto Scaling
  • AWS Auto Scaling; balancing availability, reliability, and cost

Let’s get started (and don’t touch the thermostat)!

What is Auto Scaling in AWS?

To scale in a traditional data center you’d need to literally expand your operations. You’d buy the servers and equipment required to meet your maximum possible capacity and performance, then expand further when you’d start to see performance degrade. This would leave you with a setup that could potentially handle your entire operations, but more often would be left sitting at lower capacity while you still pay for the full setup.

Let’s face it - nobody wants to pay for things that they don’t use. Cloud computing is no different in this respect, but because you can provision resources almost instantly without purchasing real hardware, you have a lot more options for handling sudden (sometimes massive) traffic spikes and inconsistent application loads.

In a data center there’s a natural delay between seeing a traffic spike and being able to do anything about it. In the cloud that delay is gone, which leaves you with a difficult question. Do you pay more to cover your maximum possible needs immediately, or do you operate at the average level of use and take the hit to performance when things get busy?

The tools built for AWS Auto Scaling fix that issue.

AWS Auto Scaling is programmatically driven horizontal scaling: adding or removing identical resources to handle a shared workload. Stated another way, it’s the act of opening up another register at the supermarket when the checkout lines get long, and closing it up again once one or more of the cashiers become idle, all without any managers having to make any decisions.
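To put rough numbers on the question above, here is a minimal sketch comparing three ways of provisioning for one day of traffic. The demand figures, the per-instance capacity, and the function name are all made up for illustration; real pricing and instance behavior will differ.

```python
import math

def instance_hours(hourly_demand, capacity_per_instance, mode):
    """Return total instance-hours for one day under a provisioning mode."""
    needed = [math.ceil(d / capacity_per_instance) for d in hourly_demand]
    if mode == "peak":        # provision for the busiest hour, all day long
        return max(needed) * len(needed)
    if mode == "average":     # provision for the mean hour, all day long
        avg = sum(hourly_demand) / len(hourly_demand)
        return math.ceil(avg / capacity_per_instance) * len(needed)
    if mode == "autoscaled":  # follow demand hour by hour
        return sum(needed)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical day: quiet overnight, busy afternoon (requests per second).
demand = [100] * 8 + [400] * 8 + [1000] * 4 + [300] * 4
CAPACITY = 100  # assumed requests per second that one instance can serve

for mode in ("peak", "average", "autoscaled"):
    print(mode, instance_hours(demand, CAPACITY, mode))
```

With these invented numbers, provisioning for the peak costs 240 instance-hours, while following demand costs 92. Provisioning for the average costs 96 but leaves you short during the four busiest hours, which is exactly the trade-off auto scaling is meant to remove.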

What is AWS Application Auto Scaling?

AWS Application Auto Scaling is the primary tool AWS offers that allows you to set parameters for your other AWS products to automatically scale their level of available resources. This can be a massive benefit in terms of removing the need to manually manage your resources to account for activity spikes while not wasting money by operating at maximum capacity at all times.

Whether you suddenly need to run more EC2 instances to balance the workload without slowing down your systems or provision extra capacity with Amazon DynamoDB, everything can be set to automatically scale based on your current needs. That way your systems operate only at the level they’re required to in order to maintain good performance, and you don’t have to pay extra for wasted resources.

Yet AWS Application Auto Scaling is not your only tool for programmatic horizontal scaling in an AWS configuration. Other products, such as Elastic Beanstalk, Batch, and others, offer auto scaling embedded within their service controls.

If you’re utilizing AWS products and aren’t using AWS Auto Scaling, you’re probably losing money.

Why AWS Auto Scaling is important

As stated above, AWS Auto Scaling is a vital element in your cloud computing toolbelt. Without it you’ll be left manually adjusting the resources of your various AWS products, which is almost a full-time job by itself. That’s to say nothing of the unnecessary expenditure you’ll incur if you don’t perfectly accommodate demand: either you’ll provision resources that aren’t currently needed to cover potential spikes, or your setup’s performance will suffer from being overly strained.

To put it simply, there are four main benefits to using AWS Auto Scaling:

  • Fast, centralized setup
  • Automate horizontal resource scaling
  • Maintain max performance
  • Only pay for what you need

The first advantage is something you might not associate with AWS as a whole, but AWS Auto Scaling is very easy to set up through the AWS Management Console. It provides a single central interface from which you can view the average utilization of all of your compatible AWS products, meaning that you don’t have to constantly flit back and forth between them in order to get a sense of what you need.

The act of automating your resource scaling (horizontally, at least) is the main reason that this service exists, and thank the stars above that it does. While you might be able to cobble together a workaround or utilize a third-party tool to do the same thing if AWS Auto Scaling wasn’t available, being able to set your parameters for the service to follow and then practically forget about having to scale your services aside from outright expansion and optimization is a huge relief.

Think of it like trying to balance your bills and comfort at home by regulating the temperature in your house. You don’t want to pay for the heat you don’t need, but you also don’t want to sit there shivering just to save a few dollars per month. The equivalent of Auto Scaling here would be your thermostat - it monitors the temperature of your home and turns your heating on or off to maintain a comfortable temperature without wasting money. You don’t even have to think about it!

AWS, however, is far more complicated than how hot your house is. That’s why it’s so useful that AWS Auto Scaling not only offloads the manual element of managing your resources, but also makes that management far more efficient at maintaining performance.

Simply put, you will never be able to scale your resources manually and hit the perfect balance between cost and performance at all times. By automating it you’re allowing a system that can monitor, act, and react to changes much faster than any human can, meaning that your AWS products’ performance will be as good as possible no matter what demands are put on your systems.

The final (main) benefit of AWS Auto Scaling is that of reducing costs. Much like your thermostat, having the system run beyond what you require (e.g., having the heating on in the summer) will become expensive very quickly. This tool prevents that by scaling your systems according to the actual demand placed on them, rather than a vague prediction of what your maximum load might be.

As long as you set the parameters carefully, you’ll never be spending money that you aren’t getting an equal or greater amount of value from in the form of your system’s performance and capacity to deal with traffic.

How to use AWS Application Auto Scaling

To learn how to use AWS Application Auto Scaling you need to know which AWS products are compatible with it, what it allows you to do in those products, and get to grips with two core concepts: scaling plans and scaling strategies. Once we’ve covered all of that, we’ll round up with a discussion of your options in terms of dynamic and predictive scaling.

AWS products compatible with AWS Application Auto Scaling

You’ll first need to know which AWS products are compatible with AWS Application Auto Scaling. The full list can be found in the AWS Application Auto Scaling documentation, but these are some of the products whose scaling you can automate:

  • Amazon EC2
  • Amazon EC2 Spot Fleets
  • Amazon ECS
  • Amazon DynamoDB
  • Amazon Aurora

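To expand on this slightly, you can launch or terminate EC2 instances that are part of an EC2 Auto Scaling group, and/or instances from an EC2 Spot Fleet. (Strictly speaking, EC2 Auto Scaling groups are handled by the Amazon EC2 Auto Scaling service and by AWS Auto Scaling scaling plans rather than by the Application Auto Scaling API itself.) You can also automatically replace instances in a Spot Fleet which are interrupted for price or capacity reasons.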

Amazon ECS’ service desired count can be automatically adjusted to respond directly to load variations, and you can enable a DynamoDB table or global secondary index to scale its read and write capacity to deal with traffic increases without throttling occurring.

Finally, AWS Application Auto Scaling enables you to automatically tweak the number of Aurora Read Replicas provisioned for an Aurora DB cluster to use. Thus, your setup will be able to handle any kind of sudden increases or decreases in active connections or workload without wasting a chunk of your money.
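As a rough illustration of how the services above are addressed, here is a sketch that builds the two request payloads the Application Auto Scaling API expects: one to register a scalable target and one to attach a target tracking policy. The namespaces, resource ID formats, dimensions, and metric names follow the public API as we understand it, while the helper name, policy name, and the "orders-db" cluster are hypothetical. Nothing here calls AWS.

```python
# Resource ID formats, scalable dimensions, and predefined metric types for
# three of the services discussed above. Resource names are placeholders.
SCALABLE_TARGETS = {
    "ecs": {
        "ServiceNamespace": "ecs",
        "ResourceId": "service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
    },
    "dynamodb_read": {
        "ServiceNamespace": "dynamodb",
        "ResourceId": "table/{table}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "PredefinedMetricType": "DynamoDBReadCapacityUtilization",
    },
    "aurora_replicas": {
        "ServiceNamespace": "rds",
        "ResourceId": "cluster:{cluster}",
        "ScalableDimension": "rds:cluster:ReadReplicaCount",
        "PredefinedMetricType": "RDSReaderAverageCPUUtilization",
    },
}

def build_requests(kind, min_capacity, max_capacity, target_value, **names):
    """Build (register_scalable_target, put_scaling_policy) keyword arguments."""
    spec = SCALABLE_TARGETS[kind]
    common = {
        "ServiceNamespace": spec["ServiceNamespace"],
        "ResourceId": spec["ResourceId"].format(**names),
        "ScalableDimension": spec["ScalableDimension"],
    }
    register = {**common, "MinCapacity": min_capacity, "MaxCapacity": max_capacity}
    policy = {
        **common,
        "PolicyName": f"{kind}-target-tracking",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_value),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": spec["PredefinedMetricType"],
            },
        },
    }
    return register, policy

register, policy = build_requests("aurora_replicas", 1, 5, 60, cluster="orders-db")
print(register["ResourceId"], policy["ScalableDimension"])
```

If you have boto3 installed and credentials configured, these dictionaries could be passed to `register_scalable_target(**register)` and `put_scaling_policy(**policy)` on a `boto3.client("application-autoscaling")` client. Try it against a test resource first.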

Other AWS products can’t currently be added to an AWS Auto Scaling scaling plan. However, you may still be able to do something similar from within the services themselves or with a third-party tool.

Scaling plans

A “scaling plan” is the name for the set of rules you lay out in Auto Scaling from the AWS Management Console. The plan is the container for those rules rather than the rules themselves, meaning that you can have separate scaling plans for different groups of resources. You’ll need to tag your resources (or group them in an AWS CloudFormation stack) so that each plan knows which resources it covers, but that’s the gist of it.
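For readers who prefer the API to the console, a scaling plan can be sketched as a single request: a name, a way of finding the resources (tags here), and a list of scaling instructions. The field names below follow the AWS Auto Scaling Plans API as we understand it. The plan name, tag, and group name are hypothetical, and the helper only builds the payload.

```python
def scaling_plan_request(plan_name, tag_key, tag_values, asg_name,
                         min_capacity, max_capacity, target_cpu):
    """Build keyword arguments for a create_scaling_plan call (not executed here)."""
    return {
        "ScalingPlanName": plan_name,
        # The plan discovers its resources through tags (a CloudFormation
        # stack ARN is the alternative application source).
        "ApplicationSource": {
            "TagFilters": [{"Key": tag_key, "Values": list(tag_values)}],
        },
        "ScalingInstructions": [{
            "ServiceNamespace": "autoscaling",
            "ResourceId": f"autoScalingGroup/{asg_name}",
            "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
            "TargetTrackingConfigurations": [{
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization",
                },
                "TargetValue": float(target_cpu),
            }],
        }],
    }

# Hypothetical plan for resources tagged app=storefront.
request = scaling_plan_request("storefront-plan", "app", ["storefront"],
                               "storefront-asg", 2, 10, 50)
print(request["ScalingInstructions"][0]["ResourceId"])
```

With boto3 available, this could be sent with `boto3.client("autoscaling-plans").create_scaling_plan(**request)`.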

Scaling strategies

Scaling strategies are the rules that tell Auto Scaling how to scale your resources, and what the parameters of that scaling should be. Think of them as the instructions that make up a manual - the scaling plan is the manual as a whole (and you can have separate manuals for different things), whereas the scaling strategies are the individual instructions within.

When choosing a scaling strategy, you’ll be given four options:

  • Optimize for availability
  • Balance availability and cost
  • Optimize for cost
  • Custom

The first three strategies are all based on keeping the average CPU utilization of your Auto Scaling groups at a certain threshold. Optimizing for availability will keep CPU utilization at an average of 40% to ensure high availability and capacity to meet spiking demand. Optimizing for cost will keep utilization at 70% on average to reduce costs while maintaining some wiggle room for spikes. Balancing availability and cost will have AWS Auto Scaling maintain an average 50% CPU utilization as a middle ground - you have better availability and performance than if you focused on cost, but equally you’ll spend less than if you went all-out on availability.
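To see what those targets mean in practice, here is a small sketch that estimates how many instances each strategy would ask for. It uses a simple proportional rule (capacity grows in line with how far utilization is above the target), which is an approximation of target tracking and not AWS's exact algorithm. The strategy keys, function name, and limits are our own.

```python
import math

# Target average CPU utilization for each built-in strategy, as described above.
STRATEGY_TARGET_CPU = {"availability": 40.0, "balanced": 50.0, "cost": 70.0}

def desired_capacity(current_instances, current_cpu, strategy,
                     min_size=1, max_size=20):
    """Estimate the instance count that brings average CPU back to the target."""
    target = STRATEGY_TARGET_CPU[strategy]
    raw = math.ceil(current_instances * current_cpu / target)
    return max(min_size, min(max_size, raw))  # clamp to the group's limits

# Four instances running at 80% average CPU.
for strategy in STRATEGY_TARGET_CPU:
    print(strategy, desired_capacity(4, 80.0, strategy))
```

For four instances at 80% CPU, this gives 8 instances when optimizing for availability, 7 when balancing, and 5 when optimizing for cost.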

There’s not much to complain about with these three strategies - they’re self-explanatory, easy to understand, do what they say they’ll do, and make it incredibly easy to set up a basic automatic scaling plan.

However, you can also decide to create your own strategy (the “Custom” option). This involves selecting your own scaling metric, target value, and so on. For example, you could set something other than CPU utilization as your metric to control when your system automatically scales, or simply tweak the percentage target that the system is trying to reach via scaling.
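As a hedged sketch of what a custom strategy boils down to, the helper below builds a target tracking configuration around a metric of your own choosing. The field names mirror the customized metric specification in the AWS Auto Scaling Plans API as we understand it. The metric "QueueDepthPerInstance", the "MyApp" namespace, and the helper itself are hypothetical.

```python
ALLOWED_STATISTICS = {"Average", "Minimum", "Maximum", "SampleCount", "Sum"}

def custom_target_tracking(metric_name, namespace, statistic, target_value,
                           dimensions=None):
    """Build a target tracking configuration for a custom CloudWatch metric."""
    if statistic not in ALLOWED_STATISTICS:
        raise ValueError(f"unsupported statistic: {statistic}")
    return {
        "CustomizedScalingMetricSpecification": {
            "MetricName": metric_name,
            "Namespace": namespace,
            "Statistic": statistic,
            "Dimensions": [
                {"Name": name, "Value": value}
                for name, value in (dimensions or {}).items()
            ],
        },
        "TargetValue": float(target_value),
    }

# Hypothetical: keep roughly 25 queued jobs per instance instead of tracking CPU.
config = custom_target_tracking("QueueDepthPerInstance", "MyApp", "Average", 25,
                                {"Service": "checkout"})
print(config["CustomizedScalingMetricSpecification"]["MetricName"])
```

A configuration like this would take the place of the predefined CPU metric inside a scaling instruction. The metric only works as a scaling signal if it rises with load and falls as capacity is added.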

Dynamic and predictive scaling

No matter what scaling strategy you choose (or create), you’ll also have to decide whether to use two features called dynamic scaling and predictive scaling.

Dynamic scaling is mostly what we’ve referred to throughout this article; it’s the act of scaling based on CloudWatch metrics, thus reacting to live information about how your resource utilization compares to your target. This will result in less predictable costs, since these will directly depend on and react to your utilization and demand, but it will generally be more reliable in providing for your actual needs without over- or under-provisioning.
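The reactive loop can be sketched in a few lines: at each reading, measure utilization against the current fleet, then resize toward the target. This is a simplified model with invented load figures. It ignores cooldowns, warm-up time, and metric delay, all of which matter in a real deployment.

```python
import math

def simulate_dynamic_scaling(loads, per_instance_capacity, target_utilization,
                             start_instances=2, min_size=1, max_size=10):
    """Return the instance count chosen after each load reading."""
    instances = start_instances
    history = []
    for load in loads:
        # Utilization (%) of the current fleet under this reading.
        utilization = 100.0 * load / (instances * per_instance_capacity)
        # Resize proportionally toward the target, within the group limits.
        wanted = math.ceil(instances * utilization / target_utilization)
        instances = max(min_size, min(max_size, wanted))
        history.append(instances)
    return history

# Hypothetical load readings (requests per second), 100 req/s per instance.
print(simulate_dynamic_scaling([100, 300, 600, 200, 50], 100, 50.0))
```

With these readings the fleet goes 2, 6, 10, 4, 1. It tracks demand closely, but the number of instances you pay for changes at every step, which is where the less predictable bill comes from. Note that the spike to 600 asks for 12 instances and is capped at the maximum of 10.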

Predictive scaling is the other route you can take, although note that it’s currently only available for Amazon EC2 Auto Scaling groups. Instead of reacting to live metrics, predictive scaling looks at the trends in your historical traffic: it needs at least 24 hours of data and draws on up to the last 14 days. From this data, AWS Auto Scaling will forecast what the upcoming 2 days of traffic will look like, then use that forecast to schedule capacity according to your scaling plan and strategy for that resource.

You’re basically choosing whether you want to prioritize consistent reliability or consistent costs. Dynamic scaling will more accurately take your current needs into account and will thus be able to scale and meet your needs more reliably. Predictive scaling may run into some issues if you experience a particularly high traffic spike, or waste some money if there is an unusual traffic dip. However, your bills will be much more predictable because you’ll know what resources you’ll be paying for at least two days in advance. More importantly, your bills won’t change wildly depending on your current needs - they will be based on a much more stable average figure.
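To make the idea concrete, here is a deliberately naive stand-in for the forecast: average each hour of the day across the available history (capped at 14 days) and repeat that daily profile for the next 48 hours. AWS uses its own forecasting models, so treat this purely as an illustration. The function name and sample data are invented, and the history is assumed to start at midnight.

```python
def forecast_next_48h(hourly_history):
    """Forecast 48 hourly loads from same-hour averages over up to 14 days."""
    if len(hourly_history) < 24:
        raise ValueError("need at least 24 hours of history")
    window = hourly_history[-14 * 24:]
    offset = len(hourly_history) - len(window)  # index of window[0] in the history
    by_hour = {}
    for i, load in enumerate(window):
        by_hour.setdefault((offset + i) % 24, []).append(load)
    start = len(hourly_history)  # first hour to forecast
    forecast = []
    for h in range(48):
        samples = by_hour[(start + h) % 24]
        forecast.append(sum(samples) / len(samples))
    return forecast

# Two hypothetical days: quiet mornings, busy afternoons.
history = [100] * 12 + [500] * 12 + [200] * 12 + [700] * 12
forecast = forecast_next_48h(history)
print(forecast[0], forecast[12], len(forecast))
```

Here the forecast is 150 for each morning hour and 600 for each afternoon hour, repeated over two days. Capacity scheduled from it is known in advance, but a one-off spike to 1,000 would not be covered.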

Common concerns with AWS Auto Scaling

Let’s run through some common concerns before we wrap things up.

The first is that using AWS Auto Scaling will cause issues in your architecture, as your application is built so that every server is unique and handles a specific job. In theory this means that, for auto scaling to work, you’d need to rewrite your application code so that it can work with any server and so that traffic can be diverted between servers in a group.

This is a real issue, and there’s no doubt that rewriting your application’s code is a massive undertaking that’s probably out of your scope. In this case it helps to limit the rewrite: package only the affected logic in containers and move it to ECS, where the number of running tasks can be scaled for you.

The second concern comes after you’ve tried auto scaling, but you’ve had performance issues because resources didn’t scale fast enough when there was high demand. This is because the default options for auto scaling are a little limited and based on your performance averages, so having sudden large spikes can mean the default settings don’t scale up fast enough.

You have two ways to deal with this. If your traffic fluctuations are cyclical and somewhat predictable, try using predictive auto scaling, as this will account for previous large spikes. If they’re sudden and unexpected, you’ll have to configure very specific CloudWatch metric triggers for scheduled ratio scaling based on immediate performance changes instead of using the average ones.
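One way to react to immediate changes, offered here as our own suggestion and not something the steps above prescribe, is a step-style rule: the further a metric overshoots its alarm threshold, the more capacity is added in one go. The sketch below models that idea in plain Python. The thresholds and step sizes are invented, and the bounds follow the usual convention of an inclusive lower bound and an exclusive upper bound.

```python
# (lower bound, upper bound, instances to add); bounds are relative to the
# alarm threshold. None means "no upper limit". All values are illustrative.
STEPS = [
    (0, 10, 1),
    (10, 25, 3),
    (25, None, 6),
]

def step_adjustment(metric_value, threshold, steps=STEPS):
    """Return how many instances to add for the current metric reading."""
    breach = metric_value - threshold
    if breach < 0:
        return 0  # below the threshold: no scale-out
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0

# CPU alarm threshold of 60%: small overshoots add one instance, large ones six.
for cpu in (50, 65, 75, 95):
    print(cpu, step_adjustment(cpu, 60))
```

Evaluating a rule like this against the maximum over a short period, not a long-running average, is what lets it respond to sudden spikes.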

Finally, another common concern springs up when you have a team monitoring performance. Even if they’re doing a great job, there’s a worry that auto scaling will take over much of their current hands-on duties and leave them without enough to fill their days (this is a common hangover from the data center days).

The solution here is to remember that there’s always something valuable that can be done instead. Not only can you move those team members to other tasks which can’t be handled by an automated process, but you’ll be saving money on the act of scaling operations by avoiding human error and delays.

We understand that there are always going to be worries about whether using AWS Auto Scaling is the right thing for your team. However, the truth is that most of those concerns can be easily solved or completely nullified with a little careful thinking and some preparation or experimentation.

AWS Auto Scaling; balancing availability, reliability, and cost


Now that you know the basics of AWS Auto Scaling, you know that it’s a powerful tool to help automate a frustrating element of resource provisioning, availability, and utilization. The main difficulty with using it (other than every use case being so unique that it’s hard to produce any general setup advice) is that you still have to decide what to prioritize: availability, reliability, or cost.

However, we here at Aimably don’t believe that cost should be a negative factor when it comes to your AWS account. That’s why we can help you to consolidate and reduce your AWS bills with next to no effort on your part - you’ll practically automate your account analysis!

Speaking of which, we even have an AWS Cost Reduction Assessment that can help you out if you’re struggling to justify your current spending. Are you considering implementing AWS Auto Scaling? We’ll analyze your account in the context of your business and make actionable suggestions for how you can immediately reduce your bill without affecting performance and will calculate the exact savings you’ll receive from doing so. There’s never been an easier way to trim the fat from your cloud computing setup.

What are you waiting for? Click here to get started today!