In this post we’ll take a look at one of the key benefits of Cloud Computing – Scalability. We’ll explore the different scaling options available for your cloud based workloads, and then we’ll take a look at specific services in AWS which can help you to achieve scalability for your AWS hosted applications.
Whether you’re starting out with a single instance and hoping to grow, or you have a globally distributed application, you always need to be factoring scalability into your application architecture. Most businesses are looking to grow, so you need to make sure that your application infrastructure can grow in line with customer demand. It’s not only about growth, though: you may also want to shrink resources again during quieter periods to keep infrastructure costs in line with your usage patterns.
Cloud Computing Scalability
Scalability is one of the key benefits of Cloud Computing. Having access to seemingly limitless resources takes away much of the headache of scaling your application infrastructure in line with demand. You still need to design your application to use the cloud infrastructure efficiently, so that it can grow and shrink with your business requirements.
Let’s first take a look at the different ways in which a cloud hosted application can scale: Vertically, Horizontally or Diagonally.
Vertical scaling, also known as ‘Scaling Up’, is simply adding resources to your server to cope with increased demand. This could be CPU cores, additional RAM, extended disk volumes etc. No changes are made to the application code and no additional servers are added; you are just making the server you have more powerful, or indeed less powerful if you want to scale back down again. This is a very commonly used scaling method, but it has a hard limit on how far you can scale: the largest cloud instance you can use. Nowadays you can get some pretty huge instances with lots of cores and terabytes of RAM, but you are still relying on a single instance, which is a single point of failure for your application. You will also require downtime (usually just a reboot) to scale up or scale down.
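To make the downtime point concrete, here is a minimal sketch of what scaling an EC2 instance up or down could look like in Python. It assumes `ec2` is a boto3 EC2 client and that the instance is EBS-backed; the function name `resize_instance` and the example instance ID and types are our own illustrations, not anything prescribed by AWS.

```python
def resize_instance(ec2, instance_id, new_type):
    """Scale an instance up or down by changing its instance type.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The server is offline between the stop and the start.
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Swap the instance type, for example from "m5.large" to "m5.2xlarge".
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the resized server back online and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Example call (placeholder instance ID, requires boto3 and AWS credentials):
# import boto3
# resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "m5.2xlarge")
```

The stop and start steps are where the downtime comes from, which is why a single vertically scaled server is usually resized during a quiet period.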
Horizontal scaling, also known as ‘Scaling Out’, is adding infrastructure to the application. Horizontal scaling requires your application to be broken into ‘tiers’ or ‘microservices’ and is therefore more complex and costly than vertical scaling, but with the benefit of almost limitless scaling. Consider a simple 3 tier web application, with web, application logic and database tiers. As the load on the site increases, the first part of the application to take the load will be the web tier. This can therefore be scaled independently of the app logic and database tiers, by simply adding additional web servers and load balancing the traffic across them.
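The idea of adding web servers and load balancing across them can be sketched in a few lines of Python. This is a toy round-robin balancer written purely for illustration; the class name and the server names are hypothetical, and a real load balancer does far more (health checks, connection draining and so on).

```python
class RoundRobinBalancer:
    """Hands each incoming request to the next server in the pool, in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0

    def add_server(self, server):
        # Scaling out: the new server simply joins the rotation.
        self.servers.append(server)

    def route(self, request):
        # Pick the next server in the rotation for this request.
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server


web_tier = RoundRobinBalancer(["web-1", "web-2"])
first_four = [web_tier.route(f"GET /page/{i}") for i in range(4)]
# first_four == ["web-1", "web-2", "web-1", "web-2"]

# Traffic is growing, so we add a third web server to the pool.
web_tier.add_server("web-3")
```

Nothing in the application logic or database tiers had to change to add `web-3`, which is the main attraction of scaling out.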
Diagonal scaling is a combination of using both Vertical and Horizontal scaling in the same application. Take again the example of our 3 tier web application. Whilst it may be simple to add web servers to cope with additional web traffic, it may not be possible to split the application logic over multiple servers. In that scenario, you could apply vertical scaling to the application logic tier, whilst still utilizing horizontal scaling for the web tier.
So now we understand a little more about the different types of cloud computing scalability, let’s look at what services AWS has available to facilitate scaling in the AWS cloud.
Vertical Scaling in AWS
Let’s take a look at some of the limits of vertical scaling in some of the more commonly used AWS services.
EC2 Instances
EC2 instances are virtual servers in the AWS cloud. AWS has a huge range of EC2 instances for different workload types. The largest Compute Optimized EC2 instance (c5d.metal) has 96 vCPUs. The largest Memory Optimized instance (u-24tb1.metal) has 24TB (yes, Terabytes) of RAM! That’s some pretty serious compute.
EBS Volumes
EBS Volumes are the block storage volumes, in effect virtual hard disks, which can be attached to EC2 instances. A single General Purpose (GP2) EBS volume can scale to 16TB and 16,000 IOPS. A Provisioned IOPS EBS volume can scale to 16TB and 64,000 IOPS. Crazy performance for a single volume.
Elastic File System (EFS)
EFS or Elastic File System is a shared storage volume that can be mounted via NFS to a Linux operating system, enabling multiple instances to see the same disk volume. With EFS Storage you only pay for what you consume, but an EFS volume is virtually infinitely scalable. In fact in our monitoring system our customer EFS volumes show as having 8 Exabytes of storage available! I have not checked to see what it would actually cost to store 8 exabytes in EFS – although I think you would probably want to put some of that data elsewhere, such as S3!
S3 Object Storage
S3 or Simple Storage Service is the AWS Object Storage Solution – think documents, videos, audio files etc. Again S3 is almost infinitely scalable and you only pay for what you store in it. A single object can be up to 5TB in size, and there is NO LIMIT to the number of objects that you can store in a bucket. That’s a very bold claim, and I guess one that nobody really wants to test the limit of as that would be a very expensive experiment!
Horizontal Scaling in AWS
AWS have some crazy Vertical Scaling limits, but in order for your application to have truly limitless scaling, it needs to be able to scale horizontally. Here are some of the AWS services which facilitate horizontal scaling.
Application Load Balancer
Application Load Balancers operate at layer 7 of the OSI model, the application layer. They are application aware and can load balance HTTP and HTTPS traffic. You can create advanced request routing to distribute load to specific EC2 instances. Application Load Balancers do have some default limits set, some of which can be raised on request. The default limits include 1,000 targets per ALB, 50 listeners per ALB, 50 ALBs per region and 3,000 target groups per region. So although they are limits, they are pretty generous.
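Because an Application Load Balancer can see the HTTP request, it can route on things like the URL path. The sketch below shows the general idea of rule-based routing in plain Python: rules are checked in order, the first match wins, and anything unmatched goes to a default. The `Rule` class, the prefix matching and the target group names are all hypothetical simplifications, not the ALB rule syntax itself.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    path_prefix: str   # e.g. "/api/"
    target_group: str  # hypothetical name for a group of servers


def pick_target_group(path, rules, default):
    """Return the target group of the first rule whose prefix matches the path."""
    for rule in rules:
        if path.startswith(rule.path_prefix):
            return rule.target_group
    # No rule matched, so fall back to the default group.
    return default


RULES = [
    Rule("/api/", "api-servers"),
    Rule("/images/", "static-servers"),
]

api_group = pick_target_group("/api/orders", RULES, default="web-servers")
home_group = pick_target_group("/index.html", RULES, default="web-servers")
# api_group == "api-servers", home_group == "web-servers"
```

Routing like this lets each group of servers be scaled on its own, so a busy API does not force you to add more static content servers.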
Network Load Balancer
Network Load Balancers operate at layer 4 of the OSI model, the transport layer. They can load balance tens of millions of requests per second at very low latency. Again you can have 50 NLBs per region and 3,000 target groups per region by default.
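A layer 4 balancer never looks inside the request; it only sees addresses, ports and the protocol. One common way to choose a target from that information is to hash the connection details, so that every packet belonging to the same connection lands on the same server. The sketch below illustrates that general idea only. It is not AWS's actual algorithm, and the function name and choice of hash are our own assumptions.

```python
import hashlib


def pick_target(src_ip, src_port, dst_ip, dst_port, protocol, targets):
    """Map a connection's details to one target, consistently for that flow."""
    flow = f"{protocol}:{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(flow).digest()
    # Turn the first 8 bytes of the hash into an index into the target list.
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


TARGETS = ["app-1", "app-2", "app-3"]  # hypothetical server names
chosen = pick_target("203.0.113.7", 51514, "10.0.0.10", 443, "tcp", TARGETS)
again = pick_target("203.0.113.7", 51514, "10.0.0.10", 443, "tcp", TARGETS)
# The same connection details always map to the same target.
```

Because there is no request parsing involved, this kind of decision is very cheap to make, which helps explain the low latency.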
Regions & Availability Zones
Regions and Availability Zones enable you to scale your application horizontally across datacentres and geographies to ensure resilience and proximity to users. An Availability Zone consists of one or more datacentres within a geographic region, and each Availability Zone is physically isolated from the others in terms of power, network and security. When scaling horizontally it is best practice to spread workloads across multiple Availability Zones to mitigate the risk of hardware or facility failure. A region is a geographic area containing two or more Availability Zones. At the time of writing, AWS had 76 Availability Zones across 24 geographic regions. Scaling an application across multiple regions can help ensure the best low latency experience for users of the application.
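Spreading a workload across Availability Zones is mostly a matter of simple arithmetic. Here is a small sketch of dividing a fleet as evenly as possible between zones; the function name is our own, and the zone names are just examples.

```python
def spread_across_zones(instance_count, zones):
    """Distribute instances as evenly as possible across Availability Zones."""
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each take one more instance than the rest.
    return {
        zone: base + (1 if i < extra else 0)
        for i, zone in enumerate(zones)
    }


layout = spread_across_zones(7, ["eu-west-1a", "eu-west-1b", "eu-west-1c"])
# layout == {"eu-west-1a": 3, "eu-west-1b": 2, "eu-west-1c": 2}
```

With this layout, losing any one zone still leaves at least four of the seven instances running.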
Autoscaling Groups
Autoscaling groups enable fleets of EC2 instances to grow and shrink in line with application traffic or demand. An Autoscaling group is built from a launch configuration, and is usually placed behind a load balancer. The launch configuration defines what each new instance looks like, such as its machine image and instance type. The group itself defines the minimum and maximum number of EC2 instances, and its scaling policies define the metrics which trigger the launch of new instances. The triggers can be based on CPU load, network traffic in or out, or the number of load balancer requests per target, and instances which fail their health checks are replaced automatically. By default AWS allows you to create 200 launch configurations and 200 scaling groups per region. Again, increases to these limits can be requested. Don’t forget the default quota for EC2 instances is only 20 per region, so it’s likely you’ll hit this scaling limit first.
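The grow and shrink behaviour can be illustrated with a simplified scaling calculation. The sketch below resizes the fleet in proportion to how far a metric, such as average CPU load, is from its target, and then clamps the result to the minimum and maximum group size. It is a simplified model for illustration and not the exact algorithm AWS uses; the function and parameter names are our own.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size, max_size):
    """Size the fleet so the average metric moves back towards its target."""
    # If 4 instances average 90% CPU against a 60% target, we want
    # 4 * 90 / 60 = 6 instances to share the same load.
    wanted = math.ceil(current * metric_value / target_value)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, wanted))


scale_out = desired_capacity(4, metric_value=90, target_value=60, min_size=2, max_size=20)
scale_in = desired_capacity(6, metric_value=20, target_value=60, min_size=2, max_size=20)
capped = desired_capacity(10, metric_value=180, target_value=60, min_size=2, max_size=20)
# scale_out == 6, scale_in == 2, capped == 20
```

The last example shows why limits matter: the load calls for 30 instances, but the group stops at its maximum of 20.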
Elastic Beanstalk
Elastic Beanstalk enables you to create simple web applications which scale automatically, without you having to think about any of the underlying infrastructure such as load balancers, EC2 instances and databases. Elastic Beanstalk supports web apps written in languages such as Java, .NET, PHP, Node.js, Python, Ruby and Go, running on familiar servers such as Apache, Nginx, Tomcat and IIS. Simply upload your application code, and Elastic Beanstalk takes care of the rest!
Elastic Container Service (ECS)
Amazon ECS is a fully managed container orchestration service. Containerized applications lend themselves very well to horizontal scaling. By default ECS can manage 10,000 clusters per region, with 1,000 services per cluster.
So there you have it, just some of the reasons why we love AWS scalability. We’ve not touched on diagonal scaling in AWS – suffice to say you can leverage a combination of both the horizontal and vertical scaling options as outlined above. And of course this is not an exhaustive list – we have not touched on the scalability of AWS databases such as Amazon Aurora or DynamoDB, nor have we considered the scalability of serverless services such as AWS Lambda. Perhaps more on that in a later post!