Scalability in Cloud Computing & Why We Love AWS

In this post we’ll take a look at one of the key benefits of cloud computing: scalability. We’ll explore the different scaling options available for your cloud-based workloads, and then we’ll take a look at specific services in AWS that can help you to achieve scalability for your AWS-hosted applications.  

Whether you’re starting out with a single instance and hoping to grow, or you have a globally distributed application, you always need to be factoring scalability into your application architecture.

Most businesses are looking to grow, so you need to make sure that your application infrastructure can grow in line with customer demand. Growth is only half of it, though: you may also want to shrink resources again during quieter periods to keep infrastructure costs in line with your usage patterns.

Cloud Computing Scalability

Scalability is one of the key benefits of cloud computing. Having access to seemingly limitless resources does to some extent take away the headache of how to scale your application infrastructure in line with demand.

However, you need to ensure that your application is designed to leverage the cloud infrastructure in the most efficient way possible in order to allow your infrastructure to grow and shrink along with your business requirements.

Let’s first take a look at the different ways in which a cloud-hosted application can scale: vertically, horizontally or diagonally.

Vertical Scaling

Vertical scaling, also known as ‘scaling up’, is simply adding resources to your server to cope with increased demand. This could be CPU cores, additional RAM, extending disk volumes, etc.

No changes are made to the application code and no additional servers are added; you are simply making the server you have more powerful or, when scaling back down again, less powerful.

This is a very commonly used scaling method, but there is a hard limit to how far you can scale: the largest cloud instance available to you.

Nowadays, you can get some pretty huge instances with lots of cores and terabytes of RAM, but you are of course using a single instance, which is a single point of failure for your application. You will also require downtime (usually just a reboot) to scale up or scale down.
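
To make that ceiling concrete, here is a minimal Python sketch of vertical scaling as a walk up a ladder of server sizes. The size names and specs are hypothetical, made up for illustration, and are not real instance types.

```python
from typing import NamedTuple, Optional


class Size(NamedTuple):
    name: str
    vcpus: int
    ram_gib: int


# Hypothetical size ladder, smallest to largest (not real instance types).
LADDER = [
    Size("small", 2, 4),
    Size("medium", 4, 16),
    Size("large", 16, 64),
    Size("xlarge", 96, 384),
]


def pick_size(vcpus_needed: int, ram_gib_needed: int) -> Optional[Size]:
    """Return the smallest size that fits, or None once the ladder runs out."""
    for size in LADDER:
        if size.vcpus >= vcpus_needed and size.ram_gib >= ram_gib_needed:
            return size
    return None  # Vertical scaling has hit its ceiling.
```

A workload needing 3 vCPUs and 8 GiB of RAM lands on "medium", while one needing 128 vCPUs gets `None`: no single server on this ladder is big enough, and that is the point at which you have to look at other scaling options.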

Horizontal Scaling


Horizontal scaling, also known as ‘scaling out’, is adding servers to the application rather than making the existing ones bigger. It requires your application to be broken into ‘tiers’ or ‘microservices’, so it is more complex and costly than vertical scaling, but it brings the benefit of almost limitless scaling.

Consider a simple three-tier web application, with web, application logic and database tiers. As the load on the site increases, the first part of the application to take the load will be the web tier.

This can therefore be scaled independently of the app logic and database tiers, by simply adding additional web servers and then load balancing the traffic across them.
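
As a rough illustration of scaling out the web tier, the sketch below implements a simple round-robin balancer in Python. The class and the server names are hypothetical, and real load balancers also handle health checks, connection draining and much more.

```python
from itertools import cycle
from typing import Iterator, List


class RoundRobinBalancer:
    """Hands out servers in turn; a toy stand-in for a real load balancer."""

    def __init__(self, servers: List[str]) -> None:
        self._servers = list(servers)
        self._rotation: Iterator[str] = cycle(tuple(self._servers))

    def add_server(self, server: str) -> None:
        """Scale out: add a server and restart the rotation to include it."""
        self._servers.append(server)
        self._rotation = cycle(tuple(self._servers))

    def route(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._rotation)
```

Starting with `RoundRobinBalancer(["web-1", "web-2"])`, requests alternate between the two servers. After `add_server("web-3")`, every third request goes to the new server, without any change to the app logic or database tiers.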

Diagonal Scaling

Diagonal scaling is a combination of using both horizontal and vertical scaling in the same application. Take again the example of our three-tier web application.

While it may be simple to add web servers to cope with additional web traffic, it may not be possible to split the application logic over multiple servers. In that scenario, you could apply vertical scaling to the application logic tier, while still utilizing horizontal scaling for the web tier.
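
The decision described above can be sketched as a per-tier choice. This is a minimal Python illustration; the `Tier` class and its `can_split` flag are hypothetical names standing in for whatever tells you a tier can run on several servers at once.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tier:
    name: str
    can_split: bool  # Assumption: True if the tier can run across multiple servers.


def plan_scaling(tiers: List[Tier]) -> Dict[str, str]:
    """Choose horizontal scaling where a tier can be split, vertical otherwise."""
    return {
        tier.name: "horizontal" if tier.can_split else "vertical"
        for tier in tiers
    }
```

For `[Tier("web", True), Tier("app", False)]` the plan is horizontal for the web tier and vertical for the application logic tier, which is diagonal scaling for the application as a whole.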

AWS Scalability

So, now that we understand a little more about the different types of cloud computing scalability, let’s look at what services AWS has available to facilitate scaling in the AWS cloud.

Vertical Scaling in AWS

Let’s take a look at some of the limits of vertical scaling in some of the more commonly used AWS services. 

EC2 Instances

EC2 instances are virtual servers in the AWS cloud. AWS has a huge range of EC2 instances for different workload types. At the time of writing, the largest Compute Optimized EC2 instance (c5d.metal) has 96 vCPUs, and the largest Memory Optimized instance (u-24tb1.metal) has 24TB (yes, terabytes) of RAM! That’s some pretty serious compute.

EBS Volumes

EBS volumes are the block storage volumes, the cloud equivalent of disk drives, which can be attached to EC2 instances. A single General Purpose SSD (gp2) EBS volume can scale to 16TB and 16,000 IOPS. A Provisioned IOPS EBS volume can scale to 16TB and 64,000 IOPS. Crazy performance for a single volume.

EFS Volumes

EFS (Elastic File System) is shared file storage that can be mounted via NFS on a Linux operating system, enabling multiple instances to see the same volume. With EFS storage you only pay for what you consume, and an EFS volume is virtually infinitely scalable.

In fact, in our monitoring system our customers’ EFS volumes show as having 8 exabytes of storage available! I have not checked what it would actually cost to store 8 exabytes in EFS, although I think you would probably want to put some of that data elsewhere, such as S3!

S3 Object Storage

Amazon S3 (Simple Storage Service) is the AWS Object Storage Solution—think documents, videos, audio files, etc.  Again, S3 is almost infinitely scalable and you only pay for what you store in it. A single object can be up to 5TB in size, and there is NO LIMIT to the number of objects that you can store in a bucket.

That’s a very bold claim, and I guess one that nobody really wants to test the limit of, as that would be a very expensive experiment! 

Horizontal Scaling in AWS

AWS has some crazy vertical scaling limits, but in order for your application to have truly limitless scaling, it needs to be able to scale horizontally. Here are some of the AWS services that facilitate horizontal scaling. 

Application Load Balancer

Application load balancers operate at layer 7 of the OSI model—the application layer. They are application aware and can load balance HTTP and HTTPS traffic. You can create advanced request routing to distribute load to specific EC2 instances.
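
To give a feel for what application-aware routing means, here is a minimal Python sketch of path-based routing, where the URL path decides which group of servers receives the request. The rule list and group names are hypothetical, and this only imitates the idea rather than the ALB rule engine itself.

```python
from typing import List, Tuple

# Hypothetical rules: (path prefix, target group name), checked in order.
RULES: List[Tuple[str, str]] = [
    ("/api/", "api-servers"),
    ("/images/", "image-servers"),
]
DEFAULT_TARGET_GROUP = "web-servers"


def choose_target_group(path: str) -> str:
    """Return the first group whose prefix matches, else the default group."""
    for prefix, target_group in RULES:
        if path.startswith(prefix):
            return target_group
    return DEFAULT_TARGET_GROUP
```

With these rules, a request for `/api/orders` goes to the API servers and `/index.html` falls through to the default web servers, so each group can be scaled independently according to its own traffic.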

Application load balancers do have a few default limits set, some of which can be raised on request. The default limits include 1,000 targets per ALB, 50 listeners per ALB, 50 ALBs per region and 3,000 target groups per region. So, although there are limits, they are pretty generous.

Network Load Balancer

Network load balancers operate at layer 4 of the OSI model—the transport layer. They can load balance tens of millions of requests per second at very low latency. Again, you can have 50 NLBs per region and 3,000 target groups per region by default.
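
Working at the transport layer means routing decisions are made per connection rather than per HTTP request. One common way to do this is to hash the connection details so that every packet in a flow reaches the same target. The Python sketch below illustrates that idea only; the function name and the use of SHA-256 are assumptions for illustration, not the hashing AWS actually uses.

```python
import hashlib
from typing import List


def pick_target(protocol: str, src_ip: str, src_port: int,
                dst_ip: str, dst_port: int, targets: List[str]) -> str:
    """Map a connection's details to one target, the same one every time."""
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as a number, then wrap to the list size.
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]
```

Because the choice depends only on the connection details, no per-request inspection is needed, which is a large part of why layer 4 balancing can be so fast.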

Regions and Availability Zones

Regions and availability zones enable you to horizontally scale your application across datacenters and geographies to ensure resilience and proximity to users. An availability zone consists of one or more datacenters within a geographic region, and availability zones are physically isolated from one another in terms of power, network and security.

When scaling horizontally, it is best practice to spread workloads across multiple availability zones to mitigate the risk of hardware or facility failure. A region is a geographic area containing two or more availability zones. At the time of writing, AWS had 76 availability zones across 24 geographic regions.

Scaling an application across multiple regions can help ensure the best low latency experience for users of the application. 
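
The multi-AZ best practice above comes down to spreading instances as evenly as possible. Here is a minimal Python sketch; the function name is hypothetical and the zone names in the usage note are examples only.

```python
from typing import Dict, List


def spread_across_zones(instance_count: int, zones: List[str]) -> Dict[str, int]:
    """Split instances evenly across zones; earlier zones take any remainder."""
    base, extra = divmod(instance_count, len(zones))
    return {
        zone: base + (1 if position < extra else 0)
        for position, zone in enumerate(zones)
    }
```

Seven instances over `["eu-west-1a", "eu-west-1b", "eu-west-1c"]` are placed 3, 2 and 2, so losing any single zone still leaves at least four instances running.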

Autoscaling Groups

Autoscaling groups enable fleets of EC2 instances to grow and shrink in line with application traffic or demand. An autoscaling group is built from a launch configuration, which describes the instances to launch, and is usually attached to a load balancer that spreads traffic across them.

The group itself defines the minimum and maximum number of EC2 instances, while scaling policies define the metrics which trigger the launch or termination of instances. The triggers can be based on CPU load, network traffic in or out, or the number of load balancer requests per target. Instances that fail their health checks are replaced automatically.
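
To illustrate how a metric and the minimum and maximum sizes interact, here is a simplified Python sketch of a proportional scaling rule: resize the group so the average metric moves back towards a target value. The function and parameter names are hypothetical, and this is a rough model rather than the exact algorithm AWS runs.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     min_size: int, max_size: int) -> int:
    """Propose a new instance count, then clamp it to the group's limits."""
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")
    # If CPU is 1.5x the target, we need roughly 1.5x as many instances.
    proposed = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, proposed))
```

With 4 instances averaging 90% CPU against a 60% target, the rule asks for 6 instances. At 15% CPU it would ask for 1, but a minimum size of 2 keeps two running; at 120% CPU with 8 instances it would ask for 16, but a maximum size of 10 caps the fleet.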

By default, AWS allows you to create 200 launch configurations and 200 scaling groups per region. Again, increases to these limits can be requested. Don’t forget, the default quota for EC2 instances is only 20 per region, so it’s likely you’ll hit this scaling limit first. 

Elastic Beanstalk

Elastic Beanstalk enables you to create simple web applications that scale automatically without you having to think about any of the underlying infrastructure, such as load balancers, EC2 instances and databases.

Elastic Beanstalk supports web apps written in languages including Java, .NET, Node.js, PHP, Python, Ruby and Go, running on familiar servers such as Tomcat and IIS. Simply upload your application code and Elastic Beanstalk takes care of the rest!

Elastic Container Service (ECS)

Amazon ECS is a fully-managed container orchestration service. Containerized applications lend themselves very well to horizontal scaling. By default, ECS can manage 10,000 clusters per region, with 1,000 services per cluster.


Related reading: Optimizing and Horizontally Scaling Laravel on AWS

AWS Scalability vs. Elasticity: What’s the Difference?

In the context of Amazon Web Services (AWS), scalability and elasticity are both important features that enable customers to adjust their computing resources to meet changing demand.

But, while they are similar in concept, there is a key difference between the two. Elasticity is a more automated and dynamic process, where resources are adjusted in real-time based on demand. With scalability, adding or removing resources in response to changing demand may require more human intervention.

Final Thoughts

So, there you have it, just some of the reasons why we love AWS scalability. We’ve not touched on diagonal scaling in AWS. Suffice it to say, you can combine the horizontal and vertical scaling options outlined above.

And, of course, this is not an exhaustive list—we have not touched on the scalability of AWS databases such as Amazon Aurora or DynamoDB, nor have we considered the scalability of serverless services such as AWS Lambda. Perhaps more on that in a later post!

For now, get in touch if you’d like to learn more about Logicata’s AWS Cloud Managed Services and how we can help you facilitate a seamless migration.
