If your business is doing well, your applications need to be able to meet the demands of high traffic without falling apart. You’re going to need to scale your infrastructure.
In this article we’ll focus on AWS as our example, since it leads the public cloud market, but the principles here apply to almost any cloud platform. As with all cloud vendors, AWS can handle both vertical and horizontal scaling. While vertical scaling (simply expanding the capacity of your existing infrastructure) is pretty easy, horizontal scaling (adding more services or servers) unlocks more options. It also contains plenty of potholes that you need to be wary of.
That’s why today’s post will cover:
- What is horizontal scaling?
- Pros and cons of horizontal scaling
- What is auto-scaling?
- Scaling out vs scaling up
- How to horizontally scale your AWS-powered application
Let’s get started.
What is horizontal scaling?
There will come a time when your cloud architecture is no longer sufficient to deal with your demands. Whether this is due to a greater number of data requests as your product grows in popularity or as time goes by and your database expands in size, it’s a natural part of your architecture’s lifecycle.
When this happens you have two options. Either you reduce the demand so that your current setup is still sufficient, or you expand your setup. The first option is wildly expensive because it would require rewriting your source code fundamentally, so most companies focus on scaling infrastructure as demand increases.
Horizontal scaling describes purchasing more machines to expand your cloud setup, as opposed to vertical scaling which involves expanding the capacity of your current machines. Think of horizontal scaling as expanding out whereas vertical scaling expands up.
We’ll start off with a more traditional example to really make the difference in scaling types clear.
You’re a logistics company with 20 small trucks. You know that you’ll need higher capacity to take on bigger (and more profitable) orders, so you have a few options. You could upgrade your 20 trucks to larger models, letting each one carry bigger loads (vertical scaling). For roughly the same price, you could instead purchase another 20 trucks of the same size, doubling your storage capacity and your availability for different orders (horizontal scaling), while also reducing the impact of any single truck being out for repairs.
Now let’s say that you’re running an application server on an m6g.medium EC2 instance. It’s starting to run more slowly and you’re going to be out of processing capacity soon. One option is to vertically scale by upgrading your instance to the larger, more expensive m6g.large. The other is to horizontally scale by adding more m6g.medium instances, expanding your setup with multiple machines to handle the growing workload.
You might have reached the point where you have no option other than expanding to meet demand (you’re simply running out of storage or processing power), or you might be scaling pre-emptively so that the business plan can be executed more swiftly.
That’s all there is for the basics of horizontal scaling!
Every company’s situation is different, and so there are no hard-and-fast rules to follow that show that horizontal is better than vertical scaling or vice versa. We’ll dive into that more fully a little later but, for now, let’s look deeper into the pros and cons of horizontal scaling.
Pros and cons of horizontal scaling
To make this a little easier to understand we’ve separated this into two sections: first we’ll cover the advantages of horizontal scaling, and then the disadvantages.
Advantages of horizontal scaling
The primary advantages of horizontal scaling are that:
- It brings better performance
- It lessens the load on individual machines
- There’s a lower chance of downtime
- There’s no upper scaling cap
- You’ll be more fault resistant
First, you’ll benefit from better performance by horizontally scaling. Compared to your original setup you’ll have a greater capacity to deal with data and requests, and compared to vertical scaling you’re spreading the load over more machines, meaning each has to deal with fewer requests individually.
By lessening the load on individual machines you’re allowing your existing setup to work even better than before, as opposed to upgrading the existing machine and then forcing it to deal with even more requests.
In a similar way, having your workloads spread over numerous small machines rather than one extra large one means that there’s a much lower risk of downtime for your system. One machine needing to go down for maintenance in a vertical setup means that your entire system goes offline. Taking one machine down in a setup of 20 takes a mere 5% of your capacity offline for the duration of maintenance.
Plus, that’s not even mentioning the possibility of shifting the requests onto your other (much less crowded) machines in the meantime. By having multiple machines all containing copies of your source code, you have the flexibility to divert requests in the event that one of them goes offline. Your system won’t break due to an outage - it’ll get slower due to the increased load on your other machines, but it’s not the same catastrophic outcome as being unable to handle those requests in the meantime.
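To put numbers on this, here is a minimal Python sketch of what losing one machine does to a pool of identical machines. The function name, the result keys, and the assumption that load is spread perfectly evenly are ours, purely for illustration.

```python
def impact_of_one_failure(machines: int, utilization: float) -> dict:
    """Estimate the effect of losing one machine from an evenly loaded pool.

    machines: number of identical machines in the pool (hypothetical)
    utilization: average utilization per machine before the failure (0.0 to 1.0)
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    survivors = machines - 1
    if survivors == 0:
        # A single (vertically scaled) machine has nowhere to divert traffic.
        return {"capacity_lost": 1.0, "utilization_after": None, "within_capacity": False}
    # The same total work is now shared by one fewer machine.
    utilization_after = utilization * machines / survivors
    return {
        "capacity_lost": 1 / machines,
        "utilization_after": utilization_after,
        "within_capacity": utilization_after <= 1.0,
    }


pool = impact_of_one_failure(machines=20, utilization=0.6)
single = impact_of_one_failure(machines=1, utilization=0.6)
```

With 20 machines at 60% utilization, losing one removes 5% of capacity and pushes the survivors to roughly 63%, which is slower but still within capacity. The single-machine case has no survivors at all.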
Then there are the limits of vertical scaling.
Vertical scaling has a built-in hardware cap, as there’s only so much you can upgrade and expand your setup without bringing in new machines. Once you have the biggest instances in place you literally can’t upgrade further without buying new, completely separate instances.
Horizontal scaling is much more flexible in that the only reasonable limit is your own budget. You can keep purchasing new machines for as long as there’s the capacity to buy, and Amazon is not going to run out of machines available for purchase.
However, this doesn’t mean that horizontal scaling is the best solution for every situation…
Disadvantages of horizontal scaling
The main disadvantages of horizontal scaling are that:
- It can be more expensive than vertical scaling
- Multiple machines can be more difficult to manage
- It takes more time and complexity to set up
Typically, AWS pricing makes the same processing power cost roughly the same, whether it comes in the form of one very large server or two servers half the size. You might think that this means horizontal and vertical scaling cost the same, but you’re going to need more infrastructure to handle the side-by-side machines, and you’re going to need to factor in the time and money it costs to set them up and maintain them. As such, budget tends to be the biggest limitation on how much you can horizontally expand: you’ll run out of money before running out of potential machines to buy.
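As a rough illustration of that point, the sketch below compares one large server with two servers half the size plus the extra infrastructure they need. Every price here is a made-up placeholder, not real AWS pricing, and the function name is our own.

```python
HOURS_PER_MONTH = 730  # common approximation for monthly billing estimates


def monthly_cost(instance_count: int, hourly_price: float, fixed_overhead: float = 0.0) -> float:
    """Monthly cost of a pool of identical instances plus any fixed extras."""
    return instance_count * hourly_price * HOURS_PER_MONTH + fixed_overhead


# Hypothetical prices: one large server at $0.20/hour, versus two servers
# half the size at $0.10/hour each plus $25/month for a load balancer.
vertical = monthly_cost(instance_count=1, hourly_price=0.20)
horizontal = monthly_cost(instance_count=2, hourly_price=0.10, fixed_overhead=25.0)
extra = horizontal - vertical
```

Under these assumed numbers the compute cost is identical, so the whole difference comes from the overhead. Staff time for setup and maintenance would sit on top of that and is not modeled here.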
This means that horizontal scaling isn’t always a good idea for companies that are focusing more on their profitability, whether that’s due to needing to improve their performance versus the rule of 40 or just in an attempt to achieve profitable growth. If costs are a primary concern for you, it might be better to consider vertically scaling (if you have to scale your operation at all).
Speaking of maintenance, managing your setup after scaling horizontally is far more difficult than doing so with a vertical setup, unless you also implement automated tooling.
Think about it. With vertical scaling you can keep a fixed number of machines and only have to manage their system profiles when you need more capacity. You can easily handle this in the console yourself, without requiring additional staff.
Horizontal scaling means you need to track utilization on each of your machines in order to know when to add another to the mix. It requires more focus, and potentially an additional member of your staff. This will inflate your costs further, and it’s easy to forget that when assessing how you want to scale - it’s not a direct result of expansion, so it can take you by surprise.
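The kind of check this implies can be sketched in a few lines of Python. The machine names, the 75% threshold, and the idea of using average CPU as the only signal are all assumptions for illustration; a real setup would pull these figures from your monitoring system.

```python
from statistics import mean


def needs_another_machine(cpu_by_machine: dict[str, float], threshold: float = 0.75) -> bool:
    """Return True when average CPU utilization across the pool exceeds the threshold.

    cpu_by_machine maps a machine name to its utilization (0.0 to 1.0).
    """
    if not cpu_by_machine:
        raise ValueError("no machines to evaluate")
    return mean(cpu_by_machine.values()) > threshold


# Hypothetical readings from two pools
busy_pool = {"app-1": 0.9, "app-2": 0.8, "app-3": 0.7}
quiet_pool = {"app-1": 0.5, "app-2": 0.6}

add_to_busy = needs_another_machine(busy_pool)
add_to_quiet = needs_another_machine(quiet_pool)
```

Running this by hand on a schedule is exactly the extra workload described above, which is why it is usually handed off to automated tooling.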
Furthermore, horizontal scaling takes more time to set up than vertical scaling.
When you vertically scale your operation all you have to do is make a change in your management console.
With horizontal scaling, you need to do some heavy lifting. First, you need to ensure that your application can deliver traffic to more than one server, which likely means you’re going to need to add a load balancer in front of your new server pool, or some other mechanism of diverting workloads. Next, your application will need to be able to manage simultaneous workloads across multiple systems of record or processing entities. Further, when implementing horizontal scaling across databases, you may need to separate your reading functions from your writing functions, so that your system can scale up read replicas only. Finally, you’ll need to ensure that any new instance brought online in your horizontal pool is an exact replica of the others. This can involve a manual configuration checklist, or the configuration of a server image that can be copied to each new instance.
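To make the first of those steps concrete, here is a minimal sketch of what a load balancer does: hand each request to the next healthy server in the pool. This is a toy round-robin implementation with hypothetical server names, not how an AWS load balancer is built or configured.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer that rotates through servers, skipping unhealthy ones."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("need at least one server")
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)

    def mark_down(self, server):
        """Stop sending traffic to a server (for example, during maintenance)."""
        self._healthy.discard(server)

    def mark_up(self, server):
        """Resume sending traffic to a known server."""
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        """Return the next healthy server in rotation."""
        if not self._healthy:
            raise RuntimeError("no healthy servers available")
        # At most one full lap is needed to reach a healthy server.
        for _ in range(len(self._servers)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_four = [balancer.next_server() for _ in range(4)]
balancer.mark_down("app-2")
after_failure = [balancer.next_server() for _ in range(2)]
```

Note that this only works if every server can handle any request, which is why the new instances need to be exact replicas of one another.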
Even if you just copy your basic application server and then set it up to handle different types of requests from your other system components, it will still require much more time and testing before going live. This means that you’ll be left without the benefits of your new setup for longer, which can cost you even more money in the lost productivity it represents.
What is auto-scaling?
When it comes time to scale up to meet heavier demand, it’s worth noting that most applications don’t encounter consistently higher demand. There may be times of the day, week, or month when demand increases, followed by a period of lower usage. As a result, once you’ve chosen to scale up, either horizontally or vertically, there’s a significant amount of time where you’re paying for more processing capability than your users actually require.
Auto-scaling is a horizontal scaling technique that automatically expands and contracts the number of servers in a pool in response to the current demand on the system. Auto-scaling is a major benefit of operating on the virtualized infrastructure of a public cloud, like AWS, because you don’t need to purchase or implement actual physical servers. Instead, once you have implemented an auto-scaling technique, a service will add and remove virtual servers automatically.
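One simple way to picture the decision an auto-scaler makes is a proportional rule: size the pool so that the observed metric lands back on a target value. The Python sketch below is a simplified illustration under that assumption, with names of our own choosing. It is not the actual algorithm used by any AWS service.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Propose a pool size that would bring the average metric back to target.

    current: servers in the pool right now
    metric: observed average value (for example, CPU percent)
    target: the value you want the average to sit at
    min_size, max_size: hard limits on the pool size
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    proposed = math.ceil(current * metric / target)
    # Never scale below the floor or above the ceiling you have set.
    return max(min_size, min(max_size, proposed))


scale_out = desired_capacity(current=4, metric=90.0, target=60.0, min_size=2, max_size=10)
scale_in = desired_capacity(current=4, metric=15.0, target=60.0, min_size=2, max_size=10)
capped = desired_capacity(current=8, metric=120.0, target=60.0, min_size=2, max_size=10)
```

Here a pool of 4 servers running at 90% against a 60% target grows to 6, a pool running at 15% shrinks to the minimum of 2, and a heavily overloaded pool of 8 is held at the maximum of 10.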
Let’s go back to the logistics company example.
You currently have 20 trucks, but it’s October and you know the holiday season is coming. You can anticipate that the number of shipping jobs will increase dramatically, and if you purchased 20 more trucks, you’d be able to meet that demand. But, in January, you’d be left with 20 unused trucks sitting on your lot as demand decreases back to the normal amount, and you’ll still be making payments for their purchase.
This is a great way to lose a lot of money.
Instead, you can set up a contract with a truck rental agency based on your realtime truck capacity. When your current trucks exceed their capacity, the rental agency automatically sends new vehicles to your lot. Similarly, when these rental trucks sit unused for a period of time, the agency comes out to pick them up. All of a sudden, you can meet the holiday demand without having to pay for the cost of additional shipping capacity all year long. That’s auto-scaling.
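The savings in the truck example can be sketched with some simple arithmetic. The monthly demand figures below are hypothetical, chosen to match the story of 20 trucks most of the year and 40 during the holidays, and the unit is the truck-month (one truck paid for during one month).

```python
def fixed_fleet_truck_months(demand: list[int]) -> int:
    """Own enough trucks for the busiest month, all year long."""
    return max(demand) * len(demand)


def elastic_fleet_truck_months(demand: list[int], owned: int) -> int:
    """Own a baseline fleet and rent extra trucks only in months that need them."""
    return sum(max(owned, needed) for needed in demand)


# Hypothetical demand: 20 trucks from January to October, 40 in November and December
monthly_demand = [20] * 10 + [40] * 2

fixed = fixed_fleet_truck_months(monthly_demand)
elastic = elastic_fleet_truck_months(monthly_demand, owned=20)
```

Under these assumptions the fixed fleet pays for 480 truck-months while the elastic fleet pays for 280. The sketch ignores any premium a rental agency might charge, just as on-demand capacity is not always priced the same as committed capacity.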
Scaling out vs scaling up
The ultimate question you need to ask when considering whether to scale horizontally (out) or vertically (up) is this:
Is now the right time to re-architect?
The flexibility and reliability offered by expanding your operation with more machines mean that horizontal scaling is an incredibly powerful tool in your arsenal. It lets you increase your system’s capacity and capabilities without forcing downtime for upgrades. The new machines can act as natural fault guards to shoulder the weight of any machines that fail or have to go offline for maintenance.
Plus, when implemented with auto-scaling, horizontal scaling is often the most cost-effective choice, because you pay for far less capacity and processing power that you do not need.
However, as stated above, this comes at a very literal price.
Migrating from a single-server architecture to one with horizontal scaling can be prohibitively expensive, especially when you factor in the opportunity cost of using your same resources to build customer-facing features and functionality.
Realistically, it’s never a good time to take your eye off your customers, but the decision to scale horizontally will be forced on you once you exceed the capacity of the largest instance sizes available. It’s important to avoid backing yourself into a corner and having to scale horizontally at the last minute. Figure out the right time to update your architecture to support horizontal scaling, and plan ahead.
How to horizontally scale your AWS-powered application
Unlike much of AWS, it’s rather simple to set up horizontal scaling in your AWS infrastructure because there are a variety of built-in services that support auto-scaling across the AWS product suite. It all started with auto-scaling for EC2 instances, and auto-scaling is now a native feature of a wide variety of services, from Aurora to Elastic Beanstalk.
AWS Auto Scaling allows you to set parameters for the capacity and performance of your setup, which it will then adjust automatically based on your cost preferences. You can also set separate parameters for different resources, such as EC2 instances or DynamoDB tables and indexes.
While you still have to build out the logic and infrastructure to move from a single instance to a horizontal pool, these native AWS services help keep you from paying for more capacity than you need. You can simply tell AWS Auto Scaling what you wish to prioritize (cost or performance) and let it scale accordingly.
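For a sense of what this looks like in practice, here is a hedged sketch using boto3, the AWS SDK for Python, to attach a target tracking policy to an EC2 Auto Scaling group. Note that this is a policy on a single group rather than the cost-or-performance strategy described above. The group name and policy name are hypothetical, the group is assumed to exist already, and the 50% CPU target is only an example value.

```python
def build_target_tracking_policy(group_name: str, target_cpu_percent: float) -> dict:
    """Build the request for a policy that keeps average CPU near a target."""
    return {
        "AutoScalingGroupName": group_name,        # hypothetical, must already exist
        "PolicyName": f"{group_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_percent,
        },
    }


def apply_policy(request: dict) -> dict:
    """Send the request to AWS. Needs boto3 installed and valid credentials."""
    import boto3  # imported here so the builder above works without it

    client = boto3.client("autoscaling")
    return client.put_scaling_policy(**request)


request = build_target_tracking_policy("web-app-pool", target_cpu_percent=50.0)
# apply_policy(request)  # uncomment to run against a real AWS account
```

Keeping the request builder separate from the AWS call makes it easy to review or test the policy before anything changes in your account.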
However, if you really want to make the most of your horizontal scaling efforts then you need to come to us here at Aimably.
Our AWS Cost Reduction Assessment will help you to know exactly how much you can plan to save with an auto-scaling approach instead of current vertically scaled configurations, ensuring you’re embarking on a horizontal scale journey at the right time. It’s a great way to be confident you’re investing the right effort towards the greatest impact.
If you’re really struggling to know whether (or how) to plan your expansion, particularly in regard to your spending, you can also talk to us through our AWS Financial Operations Services. We’re here to help you make sense of the incomprehensible behemoth that AWS can sometimes turn into.
The key to all of this is to know when you need to scale, when you’ll benefit most from doing so, and which scaling method would best serve your needs. Thankfully, that’s exactly what you’ve learned today!