Apache Cassandra

  1. Data Modeling in Cassandra
    1. From Relational to Cassandra Thinking
      1. Denormalization Principles
        1. Data Duplication
        2. Trade-offs
        3. Storage Implications
      2. Query-First Design Approach
        1. Designing for Access Patterns
        2. Avoiding Joins
        3. Query-Driven Schema
      3. Conceptual Differences
        1. No Foreign Keys
        2. No Referential Integrity
        3. No ACID Transactions
    2. Core Data Modeling Concepts
      1. Keyspace
        1. Definition and Purpose
        2. Replication Settings
        3. Keyspace Properties
      2. Table
        1. Structure and Schema
        2. Column Families
        3. Table Properties
      3. Row
        1. Row Key
        2. Row Storage
        3. Row-Level Operations
      4. Column
        1. Column Name
        2. Column Value
        3. Column Types
      5. Cell
        1. Cell Structure
        2. Timestamps
        3. TTL Values
    3. The Primary Key
      1. Partition Key
        1. Single-Column Partition Key
        2. Composite Partition Key
          1. Multiple Columns
        3. Partitioning Impact
          1. Hash Distribution
      2. Clustering Columns
        1. Purpose and Function
        2. Clustering Order
          1. Ascending Order
          2. Descending Order
          3. Mixed Ordering
        3. Sorting within Partitions
      3. Primary Key Design Patterns
        1. Simple Primary Key
        2. Compound Primary Key
        3. Composite Primary Key
    4. Data Types
      1. Primitive Types
        1. Text
        2. Varchar
        3. ASCII
        4. Int
        5. Bigint
        6. Smallint
        7. Tinyint
        8. Varint
        9. UUID
        10. Timeuuid
        11. Boolean
        12. Float
        13. Double
        14. Decimal
        15. Timestamp
        16. Date
        17. Time
        18. Inet
        19. Blob
      2. Collection Types
        1. List
          1. Ordered Collections
          2. Use Cases
          3. Limitations
        2. Set
          1. Unordered Unique Collections
          2. Use Cases
          3. Limitations
        3. Map
          1. Key-Value Collections
          2. Use Cases
          3. Limitations
        4. Frozen Collections
          1. Immutable Collections
          2. Performance Benefits
      3. User-Defined Types
        1. Creating UDTs
        2. Using UDTs in Tables
        3. Nested UDTs
        4. UDT Evolution
    5. Static Columns
      1. Purpose and Use Cases
      2. Partition-Level Data
      3. Restrictions and Limitations
    6. Counter Columns
      1. Counter Data Type
      2. Counter Operations
      3. Counter Limitations
    7. Modeling Techniques and Best Practices
      1. One-to-One Relationships
        1. Modeling Approach
        2. Embedded Data
      2. One-to-Many Relationships
        1. Partitioning Strategies
        2. Clustering Approaches
      3. Many-to-Many Relationships
        1. Denormalization Techniques
        2. Multiple Table Approach
      4. Modeling for Time-Series Data
        1. Bucketing Strategies
          1. Time-Based Bucketing
          2. Size-Based Bucketing
        2. Partition Sizing
        3. TTL Usage
      5. Hierarchical Data
        1. Tree Structures
        2. Path-Based Modeling
    8. Materialized Views
      1. Use Cases
      2. Automatic Maintenance
      3. Limitations and Caveats
      4. Performance Considerations
    9. Secondary Indexes
      1. Purpose and Use Cases
      2. Index Types
        1. Simple Secondary Indexes
        2. Composite Secondary Indexes
      3. Performance Implications
        1. Query Performance
        2. Write Performance
      4. When to Avoid Secondary Indexes
        1. High Cardinality Columns
        2. Frequently Updated Columns
      5. SASI Indexes
        1. String Analysis
        2. Numeric Analysis
    10. Data Modeling Anti-Patterns
      1. Unbounded Row Growth
        1. Causes and Consequences
        2. Prevention Strategies
      2. Large Partitions
        1. Performance Impact
        2. Size Limitations
      3. Hot Partitions
        1. Load Distribution Issues
        2. Mitigation Strategies
      4. Overuse of Secondary Indexes
        1. Scalability Issues
        2. Alternative Approaches
      5. Using Cassandra as a Queue
        1. Problems with Queue Patterns
        2. Alternatives and Recommendations
      6. Excessive Denormalization
        1. Maintenance Overhead
        2. Consistency Challenges