Why Should You Care About Amazon Keyspaces (Managed Apache Cassandra Service)?

**Updated 24th April 2020 when Amazon Managed Apache Cassandra Service went GA as Amazon Keyspaces**

Last week at re:Invent 2019, AWS CEO Andy Jassy announced the launch of the preview for the new Amazon Managed Apache Cassandra service (launched in GA as Amazon Keyspaces).  But what exactly is it, and why should you care?

Managed Cassandra

Apache Cassandra is a free, datacenter-scale NoSQL database, originally designed by Facebook engineers to power the Facebook Inbox Search feature and later open sourced by Facebook.  Apache Cassandra is designed to manage huge datasets across clusters of commodity servers, providing high availability with no single point of failure.  Asynchronous replication between datacenters provides low latency performance and the ability to lose an entire datacenter without loss of data.

According to DB-Engines ranking, Cassandra is the 10th most popular database, and the most popular wide column format database.

5 Key Challenges running Cassandra…

Despite the popularity of Cassandra, it is notoriously difficult to set up and manage.  Some common challenges that Cassandra developers and administrators encounter include:

Read Time Degradation – when data is deleted from Cassandra, it is not immediately removed from disk – instead it is replaced with a ‘Tombstone’ – a marker to show that the data has been deleted.  By default a tombstone is kept for 10 days before it becomes eligible for removal, but in certain circumstances tombstones may not be removed at all.  Because reads have to skip over tombstones, a table with a lot of delete activity can see its read performance degrade.

Slow nodes can bring down the cluster – slow nodes can be caused by slow network connectivity, saturated hardware or a mismatch between the data stream and the schema.

Failed operations – chunks of data can get ‘stuck’ due to latency issues, which can be caused by hardware being unable to handle the volume of triggers – either because the hardware is under-specified or because the software is relaying incorrect triggers.

High Frequency of read round trips – due to the design of Cassandra it is common for transactions to occur which make too many requests per end user.  These transactions attempt to read too much data from the database which can slow the transaction down.

Planning for peak capacity – if your database workloads experience busy peaks, then the cluster capacity needs to be scaled to handle those peaks – this can be a tricky capacity planning exercise, particularly with a rapidly growing database.
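
To make the first challenge more concrete, here is a toy Python model of how tombstones behave.  It is only an illustrative sketch, not how Cassandra stores data internally: the `ToyTable` class and its method names are made up for this post, and the grace period simply reuses the 10 day default mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # 10 days, the default quoted in the text

TOMBSTONE = object()  # marker standing in for "this key was deleted"


@dataclass
class ToyTable:
    """Toy model only: maps key -> (value or TOMBSTONE, write time in seconds)."""
    cells: Dict[str, Tuple[object, float]] = field(default_factory=dict)

    def write(self, key, value, now):
        self.cells[key] = (value, now)

    def delete(self, key, now):
        # A delete does not remove the cell; it overwrites it with a marker.
        self.cells[key] = (TOMBSTONE, now)

    def scan(self):
        """Return the live rows plus the number of tombstones the read skipped."""
        live, skipped = {}, 0
        for key, (value, _) in self.cells.items():
            if value is TOMBSTONE:
                skipped += 1
            else:
                live[key] = value
        return live, skipped

    def compact(self, now):
        """Drop tombstones older than the grace period; return how many went."""
        expired = [k for k, (v, t) in self.cells.items()
                   if v is TOMBSTONE and now - t >= GC_GRACE_SECONDS]
        for k in expired:
            del self.cells[k]
        return len(expired)


table = ToyTable()
for i in range(5):
    table.write(f"row{i}", i, now=0)
for i in range(3):
    table.delete(f"row{i}", now=100)

live, skipped = table.scan()
print(len(live), "live rows,", skipped, "tombstones skipped")

# One second before the grace period ends nothing is purged; at the boundary it is.
purged_early = table.compact(now=100 + GC_GRACE_SECONDS - 1)
purged_late = table.compact(now=100 + GC_GRACE_SECONDS)
print(purged_early, "purged early,", purged_late, "purged after the grace period")
```

In this toy model the three deleted rows still have to be skipped on every read until a compaction runs after the grace period has passed, which is the effect behind the read time degradation described above.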

Enter Amazon Keyspaces…

If you are an Apache Cassandra user and the above issues sound familiar, then you should take a serious look at the new Amazon Keyspaces – Managed Cassandra as a Service.  Keyspaces  enables AWS customers to operate Cassandra at scale.

Amazon Keyspaces is now Generally Available in the following regions:

  • US East (N. Virginia)
  • US East (Ohio)
  • US West (Oregon)
  • US West (N. California)
  • Europe (Frankfurt)
  • Europe (Ireland)
  • Europe (London)
  • Europe (Paris)
  • Europe (Stockholm)
  • Asia Pacific (Hong Kong)
  • Asia Pacific (Mumbai)
  • Asia Pacific (Seoul)
  • Asia Pacific (Singapore)
  • Asia Pacific (Sydney)
  • Asia Pacific (Tokyo)
  • Canada (Central)
  • Middle East (Bahrain)
  • South America (Sao Paulo)

The benefits of Amazon Keyspaces include:

  • Apache Cassandra-compatible (release 3.11) – The Amazon Keyspaces service implements the Cassandra Query Language (CQL) API, so existing Cassandra users simply need to update their application endpoint to start using the Keyspaces service.  Customers can use the same Cassandra drivers and tools for easy migration.
  • Serverless – no clusters to manage – you will no longer need to provision, patch or manage servers as your database grows – you can focus on building your application.  And as with most AWS services, you pay only for the resources you use, so you don’t need to worry about capacity planning or peaks and troughs in workloads.  You can simply log on to the service and create a keyspace and tables without worrying about infrastructure.
  • Single digit millisecond performance at any scale – Keyspaces customers can build applications that have access to virtually limitless throughput and storage that can handle thousands of requests per second.
  • Integrated with AWS IAM, Amazon CloudWatch, Amazon VPC and AWS Key Management Service (KMS) – making Amazon Keyspaces secure, yet easy to integrate and manage.
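
To illustrate the "update your application endpoint" point in the first bullet, here is a minimal sketch using the open source DataStax Python driver (`cassandra-driver`).  It follows the connection pattern in the AWS documentation as we understand it, so treat the details as assumptions to verify for your own account: the endpoint format `cassandra.<region>.amazonaws.com`, port 9142, TLS with a downloaded root certificate, and service-specific credentials generated in IAM.  The helper names `keyspaces_endpoint` and `connect`, the certificate path and the credentials are placeholders of our own.

```python
import ssl

KEYSPACES_PORT = 9142  # assumption: the TLS port Amazon Keyspaces listens on


def keyspaces_endpoint(region: str) -> str:
    """Build the assumed regional service endpoint, e.g. for 'us-east-1'."""
    return f"cassandra.{region}.amazonaws.com"


def connect(region: str, username: str, password: str, ca_cert_path: str):
    """Open a session against Amazon Keyspaces (hypothetical helper).

    Needs `pip install cassandra-driver`. The driver is imported lazily so
    that the endpoint helper above works without it installed.
    """
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    # Keyspaces only accepts TLS connections, so verify the server certificate.
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
    ssl_context.load_verify_locations(ca_cert_path)
    ssl_context.verify_mode = ssl.CERT_REQUIRED

    cluster = Cluster(
        [keyspaces_endpoint(region)],
        port=KEYSPACES_PORT,
        ssl_context=ssl_context,
        auth_provider=PlainTextAuthProvider(username=username, password=password),
    )
    return cluster.connect()


print(keyspaces_endpoint("us-east-1"), KEYSPACES_PORT)

# Example usage (placeholder values; needs real credentials and network access):
# session = connect("us-east-1", "<service-user-name>", "<service-password>",
#                   "path/to/root-certificate.crt")
# for row in session.execute("SELECT keyspace_name FROM system_schema.keyspaces"):
#     print(row.keyspace_name)
```

Compared with connecting to a self-managed cluster, the main changes in this sketch are the contact point, the port and the TLS and authentication settings; the CQL statements your application sends can stay the same, subject to the functional differences listed in the next section.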

But Beware the Key Functional Differences…

Despite the Keyspaces preview looking impressive, AWS themselves highlight some key functional differences between Keyspaces and Apache Cassandra, and if you are a current open source user it is important to bear these in mind:

  • No multi-region support
  • No user-defined types (UDTs)
  • No ALTER TABLE
  • No counters
  • No materialized views
  • No ability to load SSTables directly

Logicata’s View

If you are managing Cassandra at scale and are familiar with the challenges highlighted in this post, then Keyspaces is worth a serious look, as long as you can live with the key functional differences.  While Keyspaces is pay-as-you-go, it doesn’t come cheap: at $1.45 per million writes, costs can soon add up if you are operating at scale.  However, as with all AWS managed services, you will need to weigh the total cost of ownership of a self-managed cluster against Amazon Keyspaces, taking into account peaks and troughs in database workload, facilities and hardware management, and engineering resource.
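
As a rough illustration of how write costs add up, the sketch below multiplies a sustained write rate by the $1.45 per million writes quoted above.  It is a back-of-the-envelope estimate only: it assumes a steady rate over a 30 day month and that every write is billed as a single unit, and it ignores reads, storage and data transfer.  The function name is our own, and you should check current AWS pricing for your region before relying on the numbers.

```python
PRICE_PER_MILLION_WRITES_USD = 1.45  # figure quoted in this post; verify current pricing


def monthly_write_cost(writes_per_second: float, days: int = 30) -> float:
    """Estimate the monthly cost in USD of a steady write rate (writes only)."""
    writes = writes_per_second * 60 * 60 * 24 * days
    return writes / 1_000_000 * PRICE_PER_MILLION_WRITES_USD


for rate in (100, 1_000, 10_000):
    print(f"{rate:>6} writes/s -> ${monthly_write_cost(rate):,.2f} per month")
```

On those assumptions, a steady 1,000 writes per second comes to roughly $3,758 per month for writes alone, which is the kind of figure to set against the hardware, facilities and engineering costs of a self-managed cluster.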
