Cassandra Database Cheatsheet

Cassandra is a highly scalable, distributed NoSQL database designed to handle large amounts of data across many commodity servers while providing high availability and fault tolerance. Whether you are a beginner or an experienced developer, a cheatsheet is useful for quick reference. This post provides a Cassandra cheatsheet organized under headings, with a code snippet for each operation.

Getting Started

1. Installation

# Download and install Cassandra (macOS, using Homebrew)
brew install cassandra

# Start Cassandra service
brew services start cassandra

2. Accessing the Cassandra Shell

# Connect to the Cassandra shell
cqlsh

Basic Operations

3. Keyspace Operations

Creating a Keyspace

CREATE KEYSPACE my_keyspace
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

Switching to a Keyspace

USE my_keyspace;

Listing Keyspaces

DESCRIBE keyspaces;

4. Table Operations

Creating a Table

CREATE TABLE users (
   user_id UUID PRIMARY KEY,
   username TEXT,
   email TEXT
);

Inserting Data

INSERT INTO users (user_id, username, email)
VALUES (uuid(), 'john_doe', 'john_doe@example.com');

Querying Data

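-- uuid() generates a new random UUID on each call, so filter on the row's actual key (placeholder below)
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;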

5. Updating and Deleting Data

Updating Data

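-- Use the key of the row to update (placeholder below); uuid() would not target an existing row
UPDATE users SET email = 'new_email@example.com' WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;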

Deleting Data

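-- Use the key of the row to delete (placeholder below); uuid() would not match an existing row
DELETE FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;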

Advanced Operations

6. Secondary Index

Creating a Secondary Index

CREATE INDEX idx_username ON users (username);
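
With the index in place, a query can filter on username even though it is not part of the primary key. A minimal sketch, assuming the users table and the 'john_doe' row from the earlier sections:

```cql
-- Served by idx_username; without the index, Cassandra rejects this filter
-- unless ALLOW FILTERING is added.
SELECT user_id, username, email FROM users WHERE username = 'john_doe';
```

Secondary indexes tend to suit columns that are queried occasionally; for high-volume lookups, a dedicated query table is often the safer choice.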

7. Batch Statements

Using Batch Statements

BEGIN BATCH
  INSERT INTO users (user_id, username, email) VALUES (uuid(), 'user1', 'user1@example.com');
  UPDATE users SET email = 'updated_email@example.com' WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
APPLY BATCH;

8. Data Consistency

Setting Consistency Level
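
In cqlsh, the consistency level for the current session is set with the CONSISTENCY command. A minimal sketch, assuming QUORUM as an illustrative level rather than a recommendation:

```cql
-- Show the consistency level currently used by this cqlsh session
CONSISTENCY;

-- Require a majority of replicas to acknowledge each subsequent request.
-- QUORUM is an illustrative choice; ONE, LOCAL_QUORUM, ALL and others are also valid.
CONSISTENCY QUORUM;
```

CONSISTENCY is a cqlsh shell command, not a CQL statement. Application drivers usually set the level on the session or on individual statements instead.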


9. Working with Collections

Using Lists

CREATE TABLE shopping_cart (
   user_id UUID PRIMARY KEY,
   items LIST<TEXT>
);
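
Once the table exists, list elements can be written and modified in place. A short sketch, assuming the shopping_cart table above; the UUID and item names are placeholders:

```cql
-- Create a cart with an initial list of items
INSERT INTO shopping_cart (user_id, items)
VALUES (123e4567-e89b-12d3-a456-426614174000, ['pen', 'notebook']);

-- Append an item to the end of the list
UPDATE shopping_cart SET items = items + ['stapler']
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Remove every occurrence of an item from the list
UPDATE shopping_cart SET items = items - ['pen']
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
```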

Using Sets

CREATE TABLE articles (  -- "articles" is a placeholder table name
   article_id UUID PRIMARY KEY,
   tag_set SET<TEXT>
);
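
Sets store unique values, so adding a tag that is already present has no effect. A short sketch, assuming a table with the article_id and tag_set columns shown above; the table name articles, the UUID, and the tag values are all hypothetical:

```cql
-- Add tags to the set; duplicates are ignored
UPDATE articles SET tag_set = tag_set + {'cassandra', 'nosql'}
WHERE article_id = 123e4567-e89b-12d3-a456-426614174000;

-- Remove a tag from the set
UPDATE articles SET tag_set = tag_set - {'nosql'}
WHERE article_id = 123e4567-e89b-12d3-a456-426614174000;
```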

10. CQL Tracing

Enabling Tracing for a Query
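
In cqlsh, tracing is toggled with the TRACING command. A minimal sketch, assuming the users table from earlier and a placeholder UUID:

```cql
-- Turn tracing on for the statements that follow in this cqlsh session
TRACING ON;

-- cqlsh prints the trace for this query after its result rows
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Turn tracing off again to avoid the extra overhead
TRACING OFF;
```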


Viewing Tracing Results
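
With tracing enabled, cqlsh prints the trace directly after each query. Recorded traces are also kept in the system_traces keyspace, so they can be queried afterwards. A sketch of that approach; the session_id below is a placeholder for a value taken from the first query:

```cql
-- List recently traced requests
SELECT session_id, started_at, duration, request
FROM system_traces.sessions LIMIT 10;

-- Show the individual steps recorded for one traced request
SELECT activity, source, source_elapsed
FROM system_traces.events
WHERE session_id = 123e4567-e89b-12d3-a456-426614174000;
```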


This Cassandra Database cheatsheet serves as a quick reference guide for common operations when working with Cassandra. Whether you’re setting up keyspaces, creating tables, or performing more advanced tasks, having these snippets at your fingertips can save you time and streamline your development process. Keep this cheatsheet handy, and feel free to customize it based on your specific use cases and requirements.

Frequently Asked Questions


1. What is Cassandra’s primary use case?

Cassandra is designed for handling large volumes of data distributed across multiple nodes. Its primary use case is in scenarios where scalability, fault tolerance, and high availability are crucial, making it well suited to applications with very high read and write throughput.

2. How does Cassandra achieve fault tolerance?

Cassandra achieves fault tolerance through its distributed architecture. Each row is replicated to multiple nodes, so if one node fails the remaining replicas can keep serving requests, provided the replication factor and consistency level allow it. The replication strategy, such as SimpleStrategy or NetworkTopologyStrategy, determines which nodes (and, for NetworkTopologyStrategy, which datacenters) hold the replicas.
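
For comparison with the SimpleStrategy keyspace shown earlier, here is a sketch of a keyspace that uses NetworkTopologyStrategy. The keyspace name and the datacenter names dc1 and dc2 are assumptions; the datacenter names must match the ones your cluster actually reports.

```cql
-- Keep 3 replicas of each row in dc1 and 2 replicas in dc2
-- (keyspace and datacenter names are hypothetical)
CREATE KEYSPACE IF NOT EXISTS my_multi_dc_keyspace
WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2};
```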

3. Can I perform JOIN operations in Cassandra?

No, Cassandra does not support traditional JOIN operations. It follows a denormalized data model, and data modeling is based on query patterns. Instead of joins, data is modeled to suit the specific queries, often resulting in duplicate data across tables.
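
As an illustration of query-driven modeling, the sketch below adds a second table so that users can be looked up by email without a join. The table name users_by_email and all values are hypothetical; it assumes a users table with user_id, username, and email columns.

```cql
-- Hypothetical query table: the same user data, keyed by email
CREATE TABLE users_by_email (
   email TEXT PRIMARY KEY,
   user_id UUID,
   username TEXT
);

-- Write to both tables so each query pattern has its own table
BEGIN BATCH
  INSERT INTO users (user_id, username, email)
  VALUES (123e4567-e89b-12d3-a456-426614174000, 'john_doe', 'john_doe@example.com');
  INSERT INTO users_by_email (email, user_id, username)
  VALUES ('john_doe@example.com', 123e4567-e89b-12d3-a456-426614174000, 'john_doe');
APPLY BATCH;

-- Look up a user by email with a single-partition read
SELECT user_id, username FROM users_by_email WHERE email = 'john_doe@example.com';
```

The cost of this design is duplicated data that the application must keep in sync on every write.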

4. What is the purpose of consistency levels in Cassandra?

Consistency levels in Cassandra specify how many replicas must acknowledge a read or write before the operation is considered successful. Higher consistency levels give stronger consistency guarantees but increase latency and reduce availability when replicas are down. Developers choose consistency levels, per session or per query, based on their application’s requirements.
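
A commonly used rule of thumb makes the trade-off concrete. Writing RF for the replication factor, W for the number of replicas that must acknowledge a write, and R for the number that must answer a read (symbols introduced here for illustration), a read is guaranteed to overlap the latest acknowledged write when the two replica sets must intersect:

```latex
\[
  \text{quorum} = \left\lfloor \frac{RF}{2} \right\rfloor + 1,
  \qquad
  R + W > RF
\]
% Worked example with RF = 3 (the replication factor used in the keyspace above):
%   quorum = floor(3/2) + 1 = 2
%   QUORUM reads and QUORUM writes give R + W = 2 + 2 = 4 > 3, so they overlap.
%   ONE reads and ONE writes give R + W = 1 + 1 = 2, which is not greater than 3,
%   so a read may miss the latest write.
```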

5. How does Cassandra handle schema changes?

Cassandra supports online schema changes. You can alter keyspaces and tables on the fly, for example by adding or dropping columns. Changes propagate to every node in the cluster, and adding a column does not rewrite existing rows: the new column simply reads as null until a value is written to it. In production, apply schema changes carefully, ideally one at a time, and confirm that all nodes agree on the schema before continuing.
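
A short sketch of typical schema changes, assuming the users table from the cheatsheet; the column name last_login is hypothetical:

```cql
-- Add a column; existing rows return null for it until a value is written
ALTER TABLE users ADD last_login TIMESTAMP;

-- Drop the column again once it is no longer needed
ALTER TABLE users DROP last_login;

-- Inspect the resulting table definition (cqlsh command)
DESCRIBE TABLE users;
```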