I’ve been working with AWS for a while now, and almost everywhere I’ve worked, I’ve had to produce an AWS services glossary. It’s an explainer document so that senior management can understand what I’m going on about when I spout AWS terminology.
So, to save you all that pain, here it is. My full, unabridged, and hopefully somewhat useful, AWS terminology cheat sheet.
Well, not full. AWS has over 200 services and releases new services and service updates more often than most people change their underwear. So here’s the “core set” of AWS terms you’ll likely have to explain.
Yep, let’s start at the beginning. AWS is an acronym (and there are a few of these coming up) for Amazon Web Services. I’m guessing you already knew that one though.
Interesting (?) fact: AWS made all of Amazon’s operating profit in 2021 (source). So you can feel better about all the money you’ve spent on Amazon. At least that’s what I’m clinging to. Read on for more AWS terms and definitions.
Compute and Storage Services
These start with things that most sysadmin types would recognize (servers, storage, etc.) and move into the more modern areas (containers, Lambda, that sort of thing). We’re going to be quite heavy on the AWS acronyms here, so I apologize in advance.
In this sense, “Elastic” is not that far from an elastic band. The capacity of your resources can stretch and shrink to meet demand (within limits). Compute is running apps, although in this case, it refers to virtual servers or virtual machines (VMs).
This is the sort of thing that your friendly neighbourhood sysadmin would be familiar with. Until you get into the realms of auto-scaling groups, which use magic (metric tracking, really) to spin up new VMs in response to increased demand.
Elastic Block Store
This is a virtual disk, but one suited to reading and writing in “blocks”. Databases tend to use this kind of storage, as it offers much faster read and write speeds.
Basically a network drive. Cool pricing model—you just use it and pay for what you use. Unlike disk-based storage pricing, such as EBS, where you have to provision and pay for a whole disk. One less headache.
It’s got 3 S’s, so S3. This is an object store, rather than a file store. It’s interchangeable with file storage to an extent. But instead of using native OS commands, you interact with it using the AWS CLI, SDKs, or console.
S3 can also host static websites (so no server-side content, sorry PHP devs), and in combination with CloudFront (I’ll talk about that later) can publish your content globally, with extremely low latency.
ECS and ECR
Elastic Container Service and Elastic Container Registry
Ooh, look, containers! Told you we’d get modern eventually.
ECS is Amazon’s service for orchestrating Docker containers (sort of Amazon’s take on Docker Swarm I guess?). ECR is their version of Docker Hub so that you can store all your Docker images inside AWS. Great if your InfoSec people don’t like the idea of data leaving controlled environments.
More containers, yay! AWS managed Kubernetes, so you don’t have to worry about anything on the control plane, for the low price of $0.10/hour per cluster.
EKS needs compute nodes for the control plane to allocate pods to. These come in two forms: EC2 (you provision the servers and tell them to register with the cluster) or Fargate (AWS does all of that for you, but for more money).
Fully into the realms of “that can’t be done” from 10 years ago, AWS Lambda is a way to “run code without thinking about servers” (straight from the AWS horse’s mouth).
It’s part of a new generation of compute services called FaaS (Function as a Service), which supports all major programming languages. Additionally, Lambda can run Docker images for anything it doesn’t natively support.
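To give a flavour of what “no servers” looks like in practice, here’s a minimal Python handler of the shape Lambda expects. The event contents here are made up for illustration; in a real deployment, AWS supplies the `event` and `context` arguments when the function is triggered.

```python
# A minimal, hypothetical Lambda handler. AWS invokes this function with the
# triggering event (a dict) and a context object; whatever it returns goes
# back to the caller (e.g. API Gateway).
def lambda_handler(event, context):
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}

# Invoked locally just to show the shape:
print(lambda_handler({"name": "AWS"}, None))
```

That really is the whole programming model: no web server, no process management, just a function that takes an event and returns a response.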
Is this compute? I’m calling it compute-adjacent and leaving it here.
AWS has two offerings for caching services (both under the banner of Elasticache): Redis and Memcached. There are reasons why you’d use one over the other, but I’m not going to go into that here (use Redis if you value your sanity).
Yet another fairly transparent name. Cache because it’s a cache, elastic because it implements elasticity in the same way the EC2 service does.
Again, compute adjacent.
This used to be AWS Elasticsearch Service until Elastic somewhat blew up its public image (personal opinion, and does not necessarily reflect the view of Logicata) by changing the license it used, thereby preventing AWS from reselling it.
AWS responded by forking the last open-source version of Elasticsearch and running with it from there.
More compute-adjacent services, but everything in compute sends either logs or metrics here, so here it goes.
You can “watch” your “cloud” resources. CloudWatch. This covers metrics and logs, but there are different charges depending on what you’re looking at.
You can do some cool stuff with logs, like exporting them to other tools for analytics and graphing. CloudWatch is also starting to venture into Application Performance Management (APM) through things like heartbeat monitoring, link checking, and basically anything that you’d historically go elsewhere for.
Right, time for a new category. These services allow communication between services within your account, without a “hard” coupling between them. This design pattern is called (inventively) “loose coupling” and is the darling of the microservices world. And so we continue with AWS terms and definitions…
Simple Queue Service
It’s a queuing service. With a ridiculously high free tier (1 million requests per month!). It was also AWS’s first available service, way back in 2004, predating AWS itself by two years, which is pretty cool.
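The point of a queue is that the producer and the consumer never talk to each other directly, which is the “loose coupling” mentioned above. You can mimic the shape of it locally with a plain in-memory queue (this illustrates the pattern, not the actual SQS API):

```python
from queue import Queue

# A stand-in for an SQS queue: the producer and consumer share only the queue,
# never a direct connection to each other.
orders = Queue()

# Producer side: fire and forget.
orders.put({"order_id": 1, "item": "coffee"})
orders.put({"order_id": 2, "item": "cake"})

# Consumer side: pull messages whenever it's ready, at its own pace.
while not orders.empty():
    message = orders.get()
    print(f"processing order {message['order_id']}")
```

With SQS, the queue lives outside both processes entirely, so the producer and consumer can be different services, scale independently, and even be down at different times.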
SNS and SES
Simple Notification Service and Simple Email Service
It sends notifications (think text messages), and emails. This blog is sort of writing itself at this point.
SNS will send emails, but SES gives you more control over the email content. It is, however, more awkward to configure and requires some DNS shenanigans.
Managed Apache ActiveMQ/RabbitMQ. I can’t really talk about this one at length because I don’t use either, but again, messages → queue → thing to receive messages.
I guess you’d use this if you already used ActiveMQ/RabbitMQ and didn’t want to (or couldn’t) migrate to SQS. It would probably be worth the pain to migrate though, because this runs on EC2 instances, so you pay quite a bit more than for SQS.
Kinesis and Firehose
These are two different tools, but I’ve lumped them together because they’re used together quite a lot.
Kinesis is (loosely) Greek for movement, and that’s what it does. Moves data.
It comes in two forms: Streams and Firehose.
Kinesis Streams takes streaming data and lets you run transformations on it before outputting it. You could use this to dynamically change the content of a webpage as a user is interacting with it. Pretty cool, huh?
Kinesis Firehose allows you to continuously stream data from disparate inputs (like IoT devices) into either analytics tools (e.g. kinesis streams or custom lambdas) or S3.
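Streams-style processing is essentially “record in, transformed record out” applied continuously. Sketched locally with a generator (this is the shape of the idea, not the Kinesis API itself):

```python
def transform(records):
    # A stand-in for a stream transformation step: enrich each record as it
    # flows through, without ever holding the whole stream in memory.
    for record in records:
        yield {**record, "shouted": record["payload"].upper()}

# In Kinesis the records would arrive continuously; here we fake a small batch.
incoming = [{"payload": "hello"}, {"payload": "world"}]
for out in transform(incoming):
    print(out["shouted"])
```

The generator never sees the stream as a whole, only one record at a time, which is exactly why this style scales to unbounded inputs like IoT feeds.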
This is where the sysadmin types might shout at me because “database” here is used a little loosely.
AWS has a range of database services available, some more “traditional” than others.
Amazon will set up and manage a “Highly Available” (HA) cluster of a database engine of your choice. Not all DBMSs are available (sorry Sybase users), but the common ones are there.
You still get CLI access to the database engine (though not SSH to the underlying host, as AWS manages the OS for you), which is nice if you need/want/like to fine-tune anything.
You can pretty much use this as a drop-in replacement for an on-premises DB cluster, but you can’t quite do without a DBA. You will also need some EBS (see above).
Aurora is part of the RDS family but is fully managed, so you don’t get access to the underlying servers. It’s both MySQL and PostgreSQL compliant. Either/or, but not both at the same time.
DynamoDB is the next extension of AWS’s DB offering. Dynamo is a NoSQL (Not Only SQL) database. It’s largely cheaper to run than RDS/Aurora, and is fully serverless, but doesn’t enforce referential integrity (see here for an explanation). So if you can work with that (and being honest, you probably can) go for DynamoDB.
DynamoDB pricing can be a little confusing, as you provision the number of reads and writes you need upfront, based on the size of the operations (reads can be bigger than writes for the same number of “capacity units”) and whether you need the latest data or not.
The short version though is just use Pay-As-You-Go pricing until you understand your usage patterns. Or just stay in PAYG forever, as it’s not that much more expensive, and has some free elements.
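To make the capacity-unit arithmetic concrete, here’s a back-of-envelope sketch of provisioned-capacity sizing. The helper names are mine; the 4 KB read and 1 KB write thresholds are AWS’s published unit sizes, and eventually consistent reads cost half as much as strongly consistent ones:

```python
import math

def read_capacity_units(item_size_kb: float, reads_per_sec: int,
                        eventually_consistent: bool = False) -> int:
    # One RCU covers one strongly consistent read/sec of an item up to 4 KB;
    # bigger items consume multiple units. Eventually consistent reads halve the bill.
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_sec
    return math.ceil(total / 2) if eventually_consistent else total

def write_capacity_units(item_size_kb: float, writes_per_sec: int) -> int:
    # One WCU covers one write/sec of an item up to 1 KB.
    return math.ceil(item_size_kb) * writes_per_sec

# 3 KB items, 100 strongly consistent reads/sec and 10 writes/sec:
print(read_capacity_units(3, 100))        # 100 RCUs
print(read_capacity_units(3, 100, True))  # 50 RCUs (eventually consistent)
print(write_capacity_units(3, 10))        # 30 WCUs
```

Which is exactly why “do you need the latest data or not” shows up on the bill: relaxing consistency literally halves the read provisioning.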
Redshift is AWS’s data warehousing solution, using columnar storage (most DBs are row-based, with the notable exception of Sybase IQ. Take 10 imaginary points if you’ve heard of that before).
This is one of those tools with a story behind the name; it’s one of two things, and I can’t find anything definitive to confirm either.
Option 1: Redshift is a physical phenomenon, part of the Doppler effect, where light from objects moving away from you shifts towards red. It’s usually associated with the expansion of the universe, so it could be a nod to how easily you can expand the size of your Redshift clusters.
Option 2: It’s a swipe at Oracle, which has a red logo. The idea being that teams would shift away from Oracle.
Take your pick which you believe. I think the second option is more likely, but because I’m a nerd I like the first option more.
This is a fully-managed graph database that supports Gremlin, SPARQL, and openCypher for queries. This is a serious competitor to Neo4j in this space, especially if you don’t want to run Neo4j (trust me, I’ve done it, you don’t), or don’t want to use Neo4j’s managed service “Aura”.
Aura honestly isn’t bad, but the scaling of Neptune is much more in line with the rest of AWS (it basically works as RDS does) and means that any traffic to your database doesn’t have to leave AWS, and then come back again.
DocumentDB (with MongoDB compatibility)
Yes, that is actually the service name.
DocumentDB is a way of storing JSON documents (again, an obvious name when you think about it). This is useful because it saves a lot of conversion to/from JSON and whatever format the database is storing data in.
Personally, I’d just use DynamoDB, as it works similarly and is much cheaper. But, if you’re already running a MongoDB instance, it’s a good option.
Couldn’t come up with a good name for this arbitrary collection, so that’ll have to do.
CloudFront is AWS’s take on a Content Delivery Network (CDN).
CloudFront is pretty cool because it uses AWS’s existing (and massive) network of servers. It talks natively to other AWS services so that you can include the setup/teardown of your CDN with your web-app deployment. This can take a little while though, so if you’re deploying CloudFront, make a coffee.
An API is an Application Programming Interface (but you knew that already, right?). API Gateway is an easy way to create and publish your APIs so that you can use them with other services (both AWS and not).
The upshot of using API Gateway, as opposed to self-hosting your API, is that it talks natively to other AWS services, including CloudWatch, meaning that you can monitor your API service just like any other AWS-hosted service.
And it’s PaaS. And it’s cheap: 1 million requests per month for free, and roughly $1/million after that, with no “standing cost” for having it available 24/7.
AWS Certificate Manager
It manages SSL/TLS certificates. Really nice tool if you’ve ever had the displeasure of provisioning TLS certs manually *shudder*.
It talks to pretty much all AWS services that could use a certificate (load balancers, CloudFront, API Gateway, you get the idea), and will automatically rotate any certificates that it issued, so you don’t have to fix the SSL error at 4 pm on Christmas Day.
DDoS protection. Sort of what it says on the tin. Works on OSI layer 3 or 4 (OSI model), 24/7 coverage, with a human on the other end for when the heuristics fall over. Nice price protection too — stops you from running up a massive scaling bill because you’ve been DDoS’d.
Web Application Firewall
Firewall as a service at a basic level. At the advanced pricing level for Shield, this comes free, which is nice.
Sorry, I know this one’s dull, but you need it to answer the question “Can other AWS customers see our data?” (Hint: No, they can’t.)
Virtual Private Cloud
To understand this, you need to understand the difference between public and private cloud. The short version is “with a private cloud you own it all and are the only person (or company) on the hardware. With public cloud, none of that is true (in most cases, but I’m not going to go into that here)”.
A VPC allows you to treat AWS as if it’s all yours. You’re not going to see anyone else’s resources when you log in, and they won’t ever see any of yours either.
For the most part in AWS, you have no idea that anyone else is using the service, except for a few unique naming rules.
You do have to be a touch careful with VPC though, because of the isolation that they provide. It’s quite easy to run up a large bill for running traffic between a server in a VPC and S3 (which doesn’t sit in a VPC) unless you provision an endpoint in your VPC for S3. Why this isn’t done by default, I have no idea.
DNS, Amazon style.
Domain Name System is a translator between human-readable web addresses and IP addresses. When you type google.co.uk, your PC is actually looking up the IP address behind that name. Route53 is Amazon’s implementation.
This is a piece of kit that you put in your existing datacentre. It gives you access to the storage services (EBS, EFS, S3, etc.) using standard network file transfer protocols (SMB, NFS, iSCSI). It could be viewed as a stepping stone to getting into the cloud, but I think it’s more aimed at being an easier backup solution.
This links your datacentre and the AWS backbone directly, so you’re not talking over a VPN or public internet. This is significantly faster than a VPN and more consistent—no more spikes at peak times.
This is somewhere between compute and networking really (and in truth they live in the EC2 section of the AWS console), but I stretched the limits of what I could call a “compute service” already, so they’re going here instead.
There are four types of load balancers, and which one you’d pick depends on what you need them to do. This section really deserves its own blog post, so maybe I’ll do that eventually, but for now, the four types are: Application Load Balancers (HTTP/HTTPS, layer 7), Network Load Balancers (TCP/UDP, layer 4), Gateway Load Balancers (for routing traffic through third-party network appliances), and Classic Load Balancers (the legacy option).
These didn’t really fit into my arbitrary AWS terminology categories, so you get them as one homogenous lump instead.
Auditing. Well, an audit “trail” on your “cloud”. You absolutely want to turn this on and have it logged to an account that isn’t the one being recorded.
If you’ve ever had the displeasure of trying to work out where half of your servers went at 3 am without audit logs, you’ll understand what I mean.
Identity and Access Management. What?
This is AWS’s “permissions” setup. It’s a way to control who gets access to what, broken down into users, groups, roles, and policies. Users go into groups; policies (the documents that actually grant permissions) attach to users, groups, or roles; and roles can be assumed by users or services that need temporary access.
The upshot of this is servers/other resources can hold an “IAM Role”. This allows them access to do/see/get/change something from another service, without having to create service accounts. If you’ve ever used them in the past, you’ll understand why this is “A Good Thing™”.
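Policies themselves are just JSON documents. A minimal, hypothetical read-only S3 policy looks something like this (the bucket name and Sid are invented for illustration):

```python
import json

# A hypothetical IAM policy granting read-only access to one S3 bucket.
# "Version" is the policy language version, not a date you choose.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowReadExampleBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",    # the bucket itself (for ListBucket)
                "arn:aws:s3:::example-bucket/*",  # the objects in it (for GetObject)
            ],
        }
    ],
}

print(json.dumps(policy, indent=2))
```

Attach that to a role, put the role on an EC2 instance, and the instance can read the bucket without any service account or stored credentials in sight.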
It manages secrets (shocking, I know) and allows you to refer to them via their ARN (Amazon Resource Name). This gives you powerful options inside your resource stacks. Like not putting access keys or database passwords in source control, but referring to their ARN.
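ARNs follow a predictable colon-separated format (`arn:partition:service:region:account-id:resource…`), so they’re easy to pick apart. The region, account ID, and secret name below are all made up for illustration:

```python
# Hypothetical ARN for a Secrets Manager secret; every value after "aws" is invented.
arn = "arn:aws:secretsmanager:eu-west-2:123456789012:secret:prod/db-password-AbCdEf"

# Split on the colon separators; the resource part keeps whatever colons remain.
_, partition, service, region, account_id, *resource = arn.split(":")

print(service)     # secretsmanager
print(region)      # eu-west-2
print(account_id)  # 123456789012
```

Because the format is uniform across services, the same one-liner works for ARNs from S3, Lambda, IAM, or anything else, which is part of what makes referring to resources by ARN so convenient.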
Athena was (is?) the Greek goddess of wisdom, and the tool allows you to query files directly in S3, using SQL. Thus gaining wisdom?
Quicksight is the graphing tool you can use on top of AWS Athena, to give quick (in)sight into your data.
This is an ETL (Extract Transform Load) tool. ETL is used a lot when creating derived data sets (e.g. data aggregations). Essentially it’s “glueing” data back together.
OpsWorks allows you to run your existing Chef (and Puppet) code in your AWS account. I think it’s called OpsWorks because in a traditional setup that’s the work of an Ops team.
This monitors your AWS estate and gives you some control over change management and compliance monitoring. Handy when you have a regulator to worry about.
Like EC2, this is a collection of services that do a range of things, the most useful of which (in my opinion) is Parameter Store.
SSM Parameter Store stores parameters (you’d think it would, wouldn’t you?). What makes this service interesting though, is the pricing and throughput.
You can store standard parameters of up to 4 KB each, for free. Yup, free. Above that size (as an “advanced” parameter), it’s $0.05 per parameter per month.
Encrypted parameters are included in the free offering, so unless you really value automatic rotation (and if you don’t mind writing a lambda to do this instead, it’s a non-issue), it’s a very competitive offering compared to Secrets Manager ($0.40 per month per secret).
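The cost comparison is easy to sketch using the prices quoted above (the helper names are mine; Secrets Manager API call charges are ignored here):

```python
def secrets_manager_monthly_cost(n_secrets: int) -> float:
    # Secrets Manager: $0.40 per secret per month (API calls billed separately).
    return round(n_secrets * 0.40, 2)

def parameter_store_monthly_cost(n_params: int, advanced: bool = False) -> float:
    # Standard parameters (up to 4 KB) are free; advanced ones are $0.05/month each.
    return round(n_params * 0.05, 2) if advanced else 0.0

# 50 secrets vs 50 standard parameters per month:
print(secrets_manager_monthly_cost(50))   # 20.0
print(parameter_store_monthly_cost(50))   # 0.0
```

At 50 secrets, that’s $20/month versus nothing, which is why Parameter Store plus a small rotation Lambda is such a popular substitute.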
The throughput is so high (once you’ve turned on higher throughput) that if you hit the ceiling, I will personally buy you a drink of your choice (up to the value of £4.99, not exchangeable for cash, you must collect in person and supply appropriate evidence).
Okay, I’m done now. Hopefully, this long (but not exhaustive) AWS glossary saves you from having to write your own and saves me from having to do this again.
Or, you could completely skip the headache of explaining all of these AWS abbreviations to your leadership team, and let me do it instead (honestly, I don’t mind), by getting in touch with me via Logicata!
You’re also more than welcome to reach out if you’d like to know more about our AWS cloud managed services, or anything else AWS-related. Thanks for reading!