Apache Cassandra
1. Introduction to NoSQL and Distributed Databases
2. Fundamentals of Apache Cassandra
3. Cassandra Architecture
4. The Cassandra Storage Engine
5. Data Modeling in Cassandra
6. Cassandra Query Language
7. Setting Up and Configuring a Cassandra Cluster
8. Cluster Management and Operations
9. Performance Tuning and Monitoring
10. Cassandra Security
11. Integrating Cassandra with Other Technologies
Data Modeling in Cassandra
From Relational to Cassandra Thinking
Denormalization Principles
Data Duplication
Trade-offs
Storage Implications
Query-First Design Approach
Designing for Access Patterns
Avoiding Joins
Query-Driven Schema
Conceptual Differences
No Foreign Keys
No Referential Integrity
Limited ACID Transactions
Core Data Modeling Concepts
Keyspace
Definition and Purpose
Replication Settings
Keyspace Properties
Table
Structure and Schema
Column Families (Legacy Term for Tables)
Table Properties
Row
Row Key
Row Storage
Row-Level Operations
Column
Column Name
Column Value
Column Types
Cell
Cell Structure
Timestamps
TTL Values
The Primary Key
Partition Key
Single-Column Partition Key
Composite Partition Key
Multiple Columns
Partitioning Impact
Hash Distribution
Clustering Columns
Purpose and Function
Clustering Order
Ascending Order
Descending Order
Mixed Ordering
Sorting within Partitions
Primary Key Design Patterns
Simple Primary Key
Compound Primary Key
Composite Primary Key
Data Types
Primitive Types
Text
Varchar
ASCII
Int
Bigint
Smallint
Tinyint
Varint
UUID
Timeuuid
Boolean
Float
Double
Decimal
Timestamp
Date
Time
Inet
Blob
Collection Types
List
Ordered Collections
Use Cases
Limitations
Set
Unordered Unique Collections
Use Cases
Limitations
Map
Key-Value Collections
Use Cases
Limitations
Frozen Collections
Immutable Collections
Performance Benefits
User-Defined Types
Creating UDTs
Using UDTs in Tables
Nested UDTs
UDT Evolution
Static Columns
Purpose and Use Cases
Partition-Level Data
Restrictions and Limitations
Counter Columns
Counter Data Type
Counter Operations
Counter Limitations
Modeling Techniques and Best Practices
One-to-One Relationships
Modeling Approach
Embedded Data
One-to-Many Relationships
Partitioning Strategies
Clustering Approaches
Many-to-Many Relationships
Denormalization Techniques
Multiple Table Approach
Modeling for Time-Series Data
Bucketing Strategies
Time-Based Bucketing
Size-Based Bucketing
Partition Sizing
TTL Usage
Hierarchical Data
Tree Structures
Path-Based Modeling
Materialized Views
Use Cases
Automatic Maintenance
Limitations and Caveats
Performance Considerations
Secondary Indexes
Purpose and Use Cases
Index Types
Simple Secondary Indexes
Composite Secondary Indexes
Performance Implications
Query Performance
Write Performance
When to Avoid Secondary Indexes
High Cardinality Columns
Frequently Updated Columns
SASI Indexes
String Analysis
Numeric Analysis
Data Modeling Anti-Patterns
Unbounded Row Growth
Causes and Consequences
Prevention Strategies
Large Partitions
Performance Impact
Size Limitations
Hot Partitions
Load Distribution Issues
Mitigation Strategies
Overuse of Secondary Indexes
Scalability Issues
Alternative Approaches
Using Cassandra as a Queue
Problems with Queue Patterns
Alternatives and Recommendations
Excessive Denormalization
Maintenance Overhead
Consistency Challenges