
Optimize Spending and Usage

With AWS Resource Utilization

In 2018 HubSpot published their study of over 1,000 marketing agency owners, finding that only 58% of teams were tracking their staff resource utilization rates. In other words, almost half of the surveyed teams had no idea whether their clients were profitable or not.

While resource utilization traditionally refers to tracking how employees' time is used, leaders over at Amazon Web Services saw the potential in applying the same idea to the cloud resources they sell. Since we, their customers, were buying resources from them, AWS needed to give us tools to see if the purchases were justified.

Nowadays, AWS resource utilization is an invaluable tool in your arsenal for cutting through the clutter and confusion. With a utilization-based mindset you can simplify the jungle that is AWS, optimize your operations, and use your findings to justify your spending to your CEO and finance teams.

In today’s post we’ll cover:

  • What is resource utilization?
  • How does resource utilization apply to AWS?
  • When is resource utilization important?
  • Key tools for calculating resource utilization
  • Tips for optimizing spending after resource utilization analysis

Let’s begin.

What is resource utilization?

A NASA image of the moon over a sunrise
Source: NASA's Earth Observatory, image used under license CC BY 2.0

Put simply, resource utilization is a measure of how much of your available resources are being used.

This comes in many different forms. For NASA this manifests as “in situ resource utilization” (ISRU) - a practice of using the available local resources to reduce the mass and cost of space missions. This is achieved by harvesting resources such as oxygen (for propellant and breathing), iron and aluminum (for propellant) and hydrogen (for propellant or combining with oxygen to create water), thus eliminating the need to take those resources with the mission through blastoff.

Back on Earth, resource utilization is mostly used to refer to making the most of your employees through tracking their availability and productivity. For example, measuring sales team members’ billable hours and how many of them are spent on calls.

This can be incredibly useful for measuring your resource spend versus value generated and maximizing your ROI. It’s one of the easiest ways to do this without having to go too deeply into department-specific details - if you can effectively make sure that your resources are being utilized, you can focus your effort on optimizing or expanding your operations.

If you’re tracking your utilization you’ll be able to spot when resources are being paid for but not used and when you’re at maximum capacity and need to expand your resources in order to grow.

So, if you see that your sales team’s billable hours aren’t getting filled with calls (they aren’t getting fully utilized), you know that you need to increase marketing effort or somehow otherwise widen your top of funnel. If the team is fully booked and even working overtime, you know that it’s time to hire more sales reps to prevent team burnout and consistently handle demand.
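To make the sales team example concrete, here is a minimal sketch of the calculation in Python. The function names and the 60% and 90% thresholds are illustrative assumptions rather than any kind of standard, so pick limits that suit your own team.

```python
def utilization_rate(used: float, available: float) -> float:
    """Return the share of available capacity that was actually used."""
    if available <= 0:
        raise ValueError("available capacity must be positive")
    return used / available


def classify(rate: float, low: float = 0.60, high: float = 0.90) -> str:
    """Label a utilization rate. The low/high thresholds are assumed values."""
    if rate < low:
        return "under-utilized"  # paying for capacity that sits idle
    if rate > high:
        return "at capacity"     # time to add resources
    return "healthy"


# Hypothetical week: 40 billable hours available, 18 of them spent on calls.
rate = utilization_rate(used=18, available=40)
print(f"{rate:.0%} -> {classify(rate)}")
```

The same two functions work for any resource, as long as you can express "used" and "available" in the same unit.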

The main issue with resource utilization is knowing what to measure and how to measure it.

You could just measure your sales team’s billable hours versus total call time but what happens when there are other duties that are vital but take time away from calls? How do you differentiate between “resources wasted” and “resources spent doing something else that’s equally valuable”?

Not to mention that you can measure the utilization of any of your resources. While employees are the go-to use case, there’s nothing stopping you from applying this practice to measure how effectively you’re using physical resources such as server space and even office supplies (if you want to really micromanage).

The trick is to identify the limits of what you’re tracking in a particular resource utilization measurement and how you can turn that into a concrete figure to track. Once you know your utilization rate it’s then a case of optimizing your expenses versus that utilization.

Speaking of server resource utilization, let’s jump into how to apply this practice to AWS.

How does resource utilization apply to AWS?

AWS resource management and tracking can be a nightmare for your engineering team, let alone upper management. How can you prove that what you’re spending is worth the investment to people who may have no idea how AWS works?

Well, much like other elements of AWS, the key to justifying your spending is to simplify your input and output to make it more consumable. That’s where resource utilization comes in.

Taking the resource utilization approach to your AWS account means that you can condense everything into a bitesize report that anyone will be able to understand. Whether you’re trying to get a sense of your costs versus usage for your own analysis or you’re explaining to your finance department or CEO why you need the budget you do, this is all made possible by making your AWS easy enough to tackle.

For example, Amazon CloudWatch is an extremely useful native AWS resource utilization tool (among others). This lets you monitor various container services, Amazon EC2 and RDS, and serverless services such as AWS Lambda and Amazon DynamoDB for things such as saturation, traffic spikes, errors, and latency.

In other words, CloudWatch lets you see whether your resources (be it network throughput, CPU utilization, read capacity or otherwise) are being utilized to their full potential, thus letting you plan for expansions or contractions in how much you’re paying for.

You can even create custom dashboards to track specific resources and set alarms for when your utilization goes beyond a certain level. You’ll never have to fly blind without knowing whether you’re using everything you’re paying for!
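As a rough sketch of what such an alarm could look like, the Python below builds the parameters for CloudWatch's PutMetricAlarm call to flag an EC2 instance whose average CPU stays low for a full day. The helper name, the 10% threshold, the 24-hour window, and the instance ID are all assumptions made for illustration. Swap the comparison to "GreaterThanThreshold" if you care about saturation rather than idleness.

```python
def low_cpu_alarm_params(instance_id: str, threshold_percent: float = 10.0) -> dict:
    """Build keyword arguments for a CloudWatch alarm on low average CPU."""
    return {
        "AlarmName": f"low-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,           # one datapoint per hour
        "EvaluationPeriods": 24,  # must stay low for 24 consecutive hours
        "Threshold": threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
    }


# Placeholder instance ID, not a real resource.
params = low_cpu_alarm_params("i-0123456789abcdef0")
print(params["AlarmName"])

# With boto3 installed and AWS credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameters in a plain dictionary means you can review or version them before anything touches your account.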

When is resource utilization important?

Resource utilization can play a massive part in improving your operations, but when is it most important to pay attention to it over, say, your company’s growth rate or the speed of feature development?

Here’s when it’s most important to track:

  • When cash is no longer “unlimited”
  • When a recession is looming
  • When you want to reduce waste
  • When you want to maximize ROI

When cash is no longer “unlimited”

Photo of person holding empty pocket pulled inside out from a pair of jeans.
Source, image used under Pixabay license

The first instance when AWS resource utilization is important is the same as when resource utilization in general is useful - when your cash is no longer “unlimited”. That is, when your company is no longer on a keep-spending-to-keep-growing trajectory.

This is when all teams will need to start cracking down on their spending, at least in terms of knowing that what they’re spending is worth it.

By the time this happens you should have the AWS infrastructure in place to deal with your workloads and handle the demands on your product. However, in trying to achieve that fulfillment you may have overcommitted to resources you don’t need.

That’s where resource utilization comes in handy - it shows you which resources you’re paying for but not actually using.

When a recession is looming

Sometimes the need to reduce spending is entirely due to outside circumstances. Economic downturns (a great big recession) aren’t pretty, and ignite the need to cut costs wherever possible in order to maintain a healthy business.

So, when things are looking economically rough, you’ll need to reduce costs. Resource utilization helps you here too by letting you cut the fat with confidence for the health of the company as a whole.

Alternatively it can protect your AWS resources from needless or harmful cost cuts by providing you with the proof you need to defend your spending. If all of your reserved resources are being used there’s no way to further crack down without severely compromising in some areas, and the rest of the company needs to realize these consequences.

When you want to reduce waste

A word cloud with the word Kaizen in the center and words related to the concept floating around it.
Source by, image used under license CC BY 2.0

You don’t have to wait for a full recession to reduce waste and improve performance. When it comes to AWS in particular (as we’ve already stated) trying to figure out whether you need everything you’re paying for can be a headache, so having an easy-to-see report or dashboard to judge your resource utilization from can be a godsend.

Reducing waste is an incredibly powerful practice - just look at the success of kaizen to see how many companies have supercharged their operations by focusing on their waste.

When you want to maximize ROI

Much like waste reduction, resource utilization can also play a huge role in maximizing the ROI of your resources.

By tracking your AWS costs and usage you generate the data needed to calculate whether you’re actually getting a good return on your investment, be that an investment of time or money.

Just remember that in order to do this you need to know what exactly you’re tracking and how you’re going to measure it.

Key tools for calculating resource utilization

Illustration of a computer screen showing a gear symbol. Floating above the screen is a person's hand reaching down with a wrench as if to adjust the gear on the screen.
Source, image used under Pixabay license

While they can be a little confusing at first, Amazon has some native tools to help you monitor your resources and judge whether or not everything is being utilized effectively, and third-party tools can fill in the gaps. The ones we’ll look at are:

  • Amazon CloudWatch
  • AWS CloudTrail
  • Datadog

Amazon CloudWatch should be your first stop for resource usage-related data. In Amazon’s own words, it provides “automatic dashboards, data with one-second granularity, and up to 15 months of metrics storage and retention”, which is a great starting point for getting a sense of the state of your resources.

AWS CloudTrail can be used to track events across your entire account, making it more suited to giving context to your resource usage. Because it records data such as audit logs and user identities, you’ll be able to see who is using your system and when, thus tightening security and showing whether the people using your resources are the ones who need them.

As for third-party tools (other than Aimably), Datadog is a great monitoring and analytics tool for keeping an eye on your cloud services and tracking key performance metrics. It also works with Azure and Google Cloud Platform if those are more your speed.

Ultimately, however, it doesn’t so much matter where you’re getting the data from, so long as it’s accurate. It’s also difficult to recommend a tool for every situation and resource utilization metric, as these entirely depend on how you define the resources you want to be tracking.

The main thing is to make sure that your data is always correct and to carefully analyze any results before making assumptions as to their meaning.

Once you’ve figured out how your resources are being utilized the next step is to work out how you can optimize your spending to match that utilization.

Tips for optimizing spending after resource utilization analysis

A photograph of two pairs of women working together in an office. On the desk are a variety of computer equipment, papers, pens and beverages.
Source, image in the public domain

Before we round out with our recommendations for optimizing your spending, it’s worth noting that the tweaks you make will (or at least should) depend on whether your utilization is fixed or varying. That is, whether the rate of utilization for a given resource is relatively stable or widely fluctuating.

If your utilization is fixed you’ll have an easier time optimizing, since you’re not having to account for the ongoing changes. As such, you should start by working out what the best server size and type for the load is and adjusting your configuration, if necessary. Then if you don’t need to use that particular resource all the time (e.g. it doesn’t serve production customers) you can simply schedule it to come online and go offline when you need it.

Aimably Reduce is a fantastic way to do this, as it not only helps you to identify areas where you can save money, but also shows when your servers should come online to match demand. You can even set up precise work schedules from within Reduce to save you the trouble of setting them up manually across your various servers.
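To show the general idea behind scheduling (this is a sketch of the concept, not a description of how Aimably Reduce works internally), the Python below decides whether a non-production server should be running at a given moment and estimates how many hours a week the schedule saves. The weekday 08:00 to 18:00 window and the function names are assumptions, and the timestamps are treated as being in your team's local time zone.

```python
from datetime import datetime, time

HOURS_PER_WEEK = 7 * 24


def should_be_running(now: datetime,
                      start: time = time(8, 0),
                      stop: time = time(18, 0)) -> bool:
    """True on weekdays between start (inclusive) and stop (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start <= now.time() < stop


def scheduled_hours_per_week(start_hour: int = 8,
                             stop_hour: int = 18,
                             days: int = 5) -> int:
    """Hours per week the server is online under the schedule."""
    return (stop_hour - start_hour) * days


on_hours = scheduled_hours_per_week()
saved_fraction = 1 - on_hours / HOURS_PER_WEEK
print(f"Online {on_hours}h of {HOURS_PER_WEEK}h, about {saved_fraction:.0%} fewer instance-hours")
```

Even a simple weekday schedule like this takes a server offline for well over half of the week, which is why scheduling is usually the first optimization to try for anything that doesn't serve production customers.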

If your utilization rates for a resource vary then you’d think that the only way to optimize would be to make sure you have enough resources to cover your highest load peaks and no more, but that results in a lot of wasted spending. What about all of the time that your system isn’t running with high demand?

No, instead it’s better to build Auto Scaling groups so that you can use multiple smaller resources that scale out to meet your highest peaks and back in when demand drops. That way you have the flexibility to accommodate high demand but you’re not constantly paying for a system running at full capacity when in reality you need a lot less to function.
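The difference is easy to see with some back-of-the-envelope arithmetic. The sketch below compares the instance-hours you'd pay for when sizing for the peak against resizing the fleet every hour. The demand figures and the 100 requests per second each instance can handle are invented for illustration, and the model ignores scaling lag and safety headroom.

```python
import math


def instance_hours_fixed(demand: list[float], capacity_per_instance: float) -> int:
    """Instance-hours when the fleet is sized once for the highest peak."""
    peak_instances = math.ceil(max(demand) / capacity_per_instance)
    return peak_instances * len(demand)


def instance_hours_scaled(demand: list[float], capacity_per_instance: float,
                          min_instances: int = 1) -> int:
    """Instance-hours when the fleet is resized each hour to fit demand."""
    return sum(
        max(min_instances, math.ceil(d / capacity_per_instance))
        for d in demand
    )


# Hypothetical day: requests per second for each of 24 hours.
demand = [50] * 8 + [300] * 4 + [800] * 2 + [300] * 6 + [50] * 4

fixed = instance_hours_fixed(demand, capacity_per_instance=100)
scaled = instance_hours_scaled(demand, capacity_per_instance=100)
print(f"Sized for peak: {fixed} instance-hours, scaled hourly: {scaled} instance-hours")
```

In this made-up day the peak only lasts two hours, so a fleet sized for it spends most of its time idle, while the scaled fleet pays for the peak only when it happens.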

Another pitfall to avoid is not bothering with resource utilization because you’re already getting your Cost and Usage Reports. “They’ll show us everything we need to know, right?”, I hear you cry.

Yes… and no.

A screenshot of an AWS cost and usage report loaded into MS Excel

While CURs do give you cost and usage data that would otherwise be impossible to get, the raw reports are close to unintelligible if you’re not already experienced with them. This makes analysis and insight very difficult, and there are few other native ways to get cost and usage data at this level of detail.

Also, CURs show resource cost and usage but not resource utilization. They might sound the same, but there’s a core difference in the context of what’s being focused on and what it’s being measured against.
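One way to picture the difference: billing data tells you that a server ran and what it cost, while monitoring data tells you how hard it was working. The short sketch below combines the two into a rough "idle spend" figure. The numbers and the function name are hypothetical, and splitting cost in proportion to average CPU is a deliberate simplification, since CPU is only one of several resources you might measure.

```python
def idle_spend(cost: float, avg_utilization: float) -> float:
    """Portion of a bill that paid for capacity nobody used."""
    if not 0.0 <= avg_utilization <= 1.0:
        raise ValueError("avg_utilization must be between 0 and 1")
    return cost * (1.0 - avg_utilization)


# Hypothetical month: billing data reports $140.00 for one instance,
# monitoring data reports 12% average CPU over the same period.
wasted = idle_spend(cost=140.00, avg_utilization=0.12)
print(f"${wasted:.2f} of $140.00 paid for idle capacity")
```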

Resource utilization isn’t an exact science and much of the practice comes down to correctly identifying what resources you need to track, figuring out how to measure their utilization, and then optimizing and iterating based on your findings. This means that there’s only one way to know what the best options are for your own setup.

Get out there and get testing!
