Why migrate Apache Cassandra to Amazon DynamoDB?
Are you looking for greater flexibility, speed, and scale for your Apache Cassandra solutions? You can consider Amazon’s DynamoDB as an alternative for your existing workloads.
The DB Best migration team has already used the new data extraction agents to migrate an Apache Cassandra database to Amazon DynamoDB. Here we share that experience and describe the unusual migration approach that the AWS Schema Conversion Tool (AWS SCT) implements for this pair of databases.
Background on Apache Cassandra
Apache Cassandra is a free and open-source distributed NoSQL database management system. Cassandra is a wide column store database. Rows are organized into tables and the first component of a table’s primary key is the partition key.
Cassandra was originally developed to handle large amounts of data across many commodity servers. Since Facebook released it as open source in July 2008, developers have produced many versions of the database. The code behind these versions differs considerably, which causes incompatibility issues when working with Cassandra workloads. So, check the version of your Cassandra database and consider its limitations before you start a migration project.
Background on Amazon DynamoDB
Amazon DynamoDB is a fully managed proprietary NoSQL database service. DynamoDB allows you to create database tables that can store and retrieve any amount of data.
DynamoDB can serve almost any level of request traffic, and you can scale a table’s throughput capacity up or down without downtime.
Why choose Amazon DynamoDB instead of Apache Cassandra?
Depending on your use cases, Amazon DynamoDB can provide greater scale and performance than your existing Apache or DataStax Cassandra workloads. For example, DynamoDB works well for real-time bidding platforms, gaming applications, and recommendation engines. Cassandra has its own advantages too, but here we focus on the benefits of Amazon’s solution.
- Performance. In our experience, DynamoDB scans data faster, especially when the query does not filter on the primary key.
- Consistency. DynamoDB offers strongly consistent reads as an option (reads are eventually consistent by default), while Cassandra can have issues with frequently updated data because of latency between distributed nodes. For example, DataStax support described this problem in a post titled “Dude! Where’s my data?”
- Global reach. DynamoDB provides Global Tables for deploying multi-region, multi-master databases without having to maintain your own replication solution.
- Security. Protecting data at rest may be a major concern with Cassandra, while DynamoDB encrypts data at rest and in transit.
If you’re encountering similar issues, consider moving your workloads to Amazon DynamoDB! However, this may be a challenging task. Let’s discuss some of the key limitations we encountered while migrating Apache Cassandra to Amazon DynamoDB for our customer.
Workload challenges we see migrating Cassandra to Amazon DynamoDB
Despite the similarities in data models, Apache Cassandra and Amazon DynamoDB have some critical architectural differences. Generally speaking, Apache Cassandra is a column-oriented data store, while Amazon DynamoDB is a key-value and document-oriented store.
A Cassandra table consists of rows, and these rows may contain different numbers of columns. DynamoDB, by contrast, treats rows as items and cells as attributes. Apart from the primary key, attributes are defined per item rather than for the whole table.
Both Cassandra and DynamoDB databases require primary keys for your tables, and they both use partition keys to distribute your data. However, the meaning of partition is different in Cassandra and DynamoDB. In Cassandra, a partition is a set of rows with the same partition key. Therefore, Cassandra stores these rows on one node. In DynamoDB, a partition is a physical part of storage allocated for a particular chunk of a table.
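The Cassandra side of this distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table, the column names, and the three-node cluster are hypothetical, and the MD5-modulo placement is a deterministic stand-in for Cassandra's actual token ring, not how Cassandra assigns partitions.

```python
import hashlib
from collections import defaultdict


def node_for_partition_key(partition_key: str, node_count: int) -> int:
    """Pick a node index from a hash of the partition key.

    Assumption: MD5 modulo node_count is a simplified stand-in for
    Cassandra's partitioner and token ring.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def group_rows_by_partition(rows, partition_column):
    """Group rows so that all rows sharing a partition key end up together."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_column]].append(row)
    return dict(partitions)


# Hypothetical rows keyed by user_id (the partition key in this sketch).
rows = [
    {"user_id": "alice", "event_time": 1, "action": "login"},
    {"user_id": "alice", "event_time": 2, "action": "purchase"},
    {"user_id": "bob", "event_time": 1, "action": "login"},
]

partitions = group_rows_by_partition(rows, "user_id")
placement = {key: node_for_partition_key(key, node_count=3) for key in partitions}
```

Every row for "alice" belongs to one logical partition and therefore to one node. In DynamoDB, by contrast, a single physical partition can hold items with many different partition key values.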
Finally, we should note that Cassandra supports more data types than DynamoDB. So, when migrating Cassandra to Amazon DynamoDB, you need to choose the right type mapping. Below, you can find additional challenges that emerge when migrating Cassandra to Amazon DynamoDB.
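To make the type mapping point concrete, here is a minimal Python sketch of a mapping plan. The table `CQL_TO_DYNAMODB` and the helper `plan_type_mapping` are hypothetical and are not the mapping that AWS SCT applies. The sketch pairs a few common CQL scalar types with DynamoDB's type descriptors (S, N, B, BOOL) and flags everything else, including collection types, for manual review.

```python
# Hypothetical mapping from CQL scalar types to DynamoDB type descriptors.
# Storing timestamps as strings (for example ISO 8601) is an assumption.
CQL_TO_DYNAMODB = {
    "ascii": "S", "text": "S", "varchar": "S",
    "uuid": "S", "timeuuid": "S", "inet": "S", "timestamp": "S",
    "tinyint": "N", "smallint": "N", "int": "N", "bigint": "N",
    "varint": "N", "float": "N", "double": "N", "decimal": "N",
    "boolean": "BOOL",
    "blob": "B",
}

# CQL types that need a design decision rather than a direct lookup.
COMPLEX_PREFIXES = ("list<", "set<", "map<", "tuple<", "frozen<")


def plan_type_mapping(columns):
    """Split columns into directly mapped ones and ones needing review.

    columns: dict of column name -> CQL type name.
    Returns (mapped, needs_review), where mapped is a dict of
    column name -> DynamoDB descriptor and needs_review is a list of names.
    """
    mapped, needs_review = {}, []
    for name, cql_type in columns.items():
        normalized = cql_type.strip().lower()
        if normalized.startswith(COMPLEX_PREFIXES) or normalized not in CQL_TO_DYNAMODB:
            needs_review.append(name)
        else:
            mapped[name] = CQL_TO_DYNAMODB[normalized]
    return mapped, needs_review


# Hypothetical table definition used only for illustration.
mapped, needs_review = plan_type_mapping(
    {"id": "uuid", "score": "double", "tags": "set<text>", "payload": "blob"}
)
```

Here `mapped` holds the columns with a direct counterpart and `needs_review` lists `tags`, which has to be handled case by case.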
Application conversion
According to DB Best's 12-step Migration Methodology, up to 25% of a migration effort involves updating application code. So, what do I need to effectively update my application code to work with the new database?
Database migration projects include not only schema conversion and data migration. To work properly with your data in the new environment, you need to convert the application code to support the new database platform. In most cases, you can use AWS Schema Conversion Tool to make the application code compatible with the new target database. However, AWS SCT does not support application conversion for Cassandra to DynamoDB migrations. So, you will need to convert your application manually.
Validating migration
After migrating data from one database to another, you want to make sure that all the data migrated successfully. In other words, you need to prove that your migration worked. How can I do that?
Validation is one of the most important challenges in any database migration project. You want to be sure that the data in the source and target databases are identical, yet comparing data in two NoSQL databases is a very hard task. We recommend using DB Best’s Database Compare Suite to verify your migration. The latest version of our in-house database management utility can compare data between a NoSQL Cassandra database and its SQL-based counterparts.
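As a generic illustration of what such a comparison involves, the Python sketch below fingerprints each row and diffs the two sides by primary key. It does not reflect how Database Compare Suite works internally. The function names are hypothetical, and the sketch assumes that rows from both databases have already been read into plain dictionaries with the same value types.

```python
import hashlib
import json


def row_fingerprint(row):
    """Hash a row in a canonical form so that column order does not matter."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_datasets(source_rows, target_rows, key_columns):
    """Report keys that are missing, unexpected, or different in the target."""

    def index(rows):
        return {
            tuple(row[column] for column in key_columns): row_fingerprint(row)
            for row in rows
        }

    source, target = index(source_rows), index(target_rows)
    shared = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(key for key in shared if source[key] != target[key]),
    }


# Hypothetical sample data: row 2 differs, row 3 is missing, row 4 is extra.
source_rows = [{"id": "1", "v": "a"}, {"id": "2", "v": "b"}, {"id": "3", "v": "c"}]
target_rows = [{"id": "1", "v": "a"}, {"id": "2", "v": "X"}, {"id": "4", "v": "d"}]
report = compare_datasets(source_rows, target_rows, key_columns=["id"])
```

On real workloads the hard parts are reading both sides efficiently and normalizing types before hashing, which is where a dedicated tool helps.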
Amazon DynamoDB limitations
Amazon DynamoDB has some restrictions and limitations at its core. So, what do I need to consider to successfully migrate Cassandra workloads to Amazon’s cloud platform?
You need to examine DynamoDB’s limitations before the start of migration. For example, you need to design your database items (rows) carefully because their size is limited to 400 KB. This limit covers both the binary length of attribute names and the length of attribute values, so long attribute names count toward the limit as well.
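A rough pre-migration check for this limit might look like the Python sketch below. The helper names are hypothetical, and the per-type byte counts are approximations based on our reading of DynamoDB's sizing rules (numbers in particular are counted at a conservative upper bound), so treat the result as an estimate rather than the exact size DynamoDB would compute.

```python
MAX_ITEM_BYTES = 400 * 1024  # the 400 KB item size limit


def estimate_value_size(value):
    """Approximate the stored size of one attribute value in bytes."""
    if value is None or isinstance(value, bool):
        return 1
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, (bytes, bytearray)):
        return len(value)
    if isinstance(value, (int, float)):
        return 21  # assumption: conservative upper bound for a number
    if isinstance(value, dict):
        # Assumed overhead: 3 bytes per map plus 1 byte per entry.
        return 3 + sum(
            1 + len(key.encode("utf-8")) + estimate_value_size(inner)
            for key, inner in value.items()
        )
    if isinstance(value, (list, tuple)):
        # Assumed overhead: 3 bytes per list plus 1 byte per element.
        return 3 + sum(1 + estimate_value_size(inner) for inner in value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def estimate_item_size(item):
    """Sum attribute name lengths (UTF-8) and estimated value sizes."""
    return sum(
        len(name.encode("utf-8")) + estimate_value_size(value)
        for name, value in item.items()
    )


def fits_in_item_limit(item):
    """Return True if the estimated size is within the 400 KB limit."""
    return estimate_item_size(item) <= MAX_ITEM_BYTES
```

Running a check like this over a sample of Cassandra rows before migration shows which rows need to be split or have large values moved elsewhere.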
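Another limitation is related to Cassandra’s collection types (set, list, and map). DynamoDB has its own set, list, and map types, but they do not correspond one-to-one to Cassandra’s: for example, a DynamoDB set can hold only strings, numbers, or binary values. Moreover, you can’t use AWS Database Migration Service (DMS) to upload data of these types to the Amazon cloud.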
In addition, each AWS account has a quota on the number of DynamoDB tables per region (256 by default at the time of writing). If you reach this limit, you can either restructure your database design or ask Amazon support for a service limit increase.
Getting started
DB Best supports the AWS Database Migration Service (DMS) and we were an early adopter of the AWS Schema Conversion Tool (SCT). We have helped hundreds of customers successfully migrate to the Amazon cloud. We use our 12-step migration methodology to help your organization streamline its database migration projects. Below you can find the basic DB Best offerings that can help you cost-effectively accelerate cloud adoption.
Experts in AWS migrations
DB Best has broad experience with moving customers’ workloads to the Amazon cloud. We have taken part in hundreds of successful customer migrations from on-premises to AWS. Moreover, DB Best was the first AWS partner to use data extraction agents in a project that migrated Apache Cassandra workloads to Amazon DynamoDB. Find some more details below.
Simplifying the process for migrating Cassandra to Amazon DynamoDB
As part of our work with AWS Database Migration Service and our partnership with AWS, we helped define the process of using AWS SCT data extraction agents to simplify the migration.
AWS SCT supports Apache Cassandra versions 2.0 and 3.0 as a source, with Amazon DynamoDB as a target.
What makes this process interesting is the clone datacenter task. To avoid interfering with production applications that use your Cassandra cluster, AWS SCT will create a clone datacenter and copy your production data into it. The clone datacenter acts as a staging area, so that AWS SCT can perform further migration activities using the clone rather than your production datacenter.
To better understand the overall migration process, check out our blog post on migrating Cassandra to Amazon DynamoDB.