Big Data Technologies

  1. Stream Processing Technologies
    1. Fundamentals of Data Streaming
      1. Bounded vs. Unbounded Data
        1. Batch vs. Streaming
        2. Data Characteristics
      2. Event Time vs. Processing Time
        1. Timestamp Semantics
        2. Watermarks
        3. Late Data Handling
        4. Out-of-Order Events
      3. Windowing Operations
        1. Tumbling Windows
        2. Sliding Windows
        3. Session Windows
        4. Custom Windows
      4. Stream Processing Patterns
        1. Filtering
        2. Aggregation
        3. Joining Streams
        4. Pattern Detection
    2. Apache Kafka
      1. Core Concepts
        1. Topics and Partitions
          1. Topic Creation
          2. Partitioning Strategies
          3. Partition Keys
        2. Producers and Consumers
          1. Producer API
          2. Consumer Groups
          3. Consumer Offsets
        3. Brokers and Clusters
          1. Broker Roles
          2. Cluster Coordination
          3. Leader Election
        4. Zookeeper's Role
          1. Metadata Management
          2. Configuration Management
          3. Coordination Services
      2. Kafka Architecture
        1. Log-Structured Storage
          1. Message Retention
          2. Offset Management
          3. Compaction
        2. Consumer Groups
          1. Load Balancing
          2. Partition Assignment
          3. Rebalancing
        3. Delivery Semantics
          1. At Most Once
          2. At Least Once
          3. Exactly Once
          4. Idempotent Producers
      3. Kafka Configuration and Tuning
        1. Broker Configuration
        2. Producer Configuration
        3. Consumer Configuration
        4. Performance Tuning
      4. Kafka Ecosystem
        1. Kafka Connect
          1. Source Connectors
          2. Sink Connectors
          3. Connector Development
        2. Kafka Streams
          1. Stream Processing API
          2. State Stores
          3. Topology Design
        3. ksqlDB
          1. SQL-Based Stream Processing
          2. Materialized Views
      5. Kafka Operations
        1. Cluster Management
        2. Monitoring and Metrics
        3. Security Configuration
        4. Disaster Recovery
    3. Other Stream Processing Frameworks
      1. Apache Storm
        1. Topology Design
        2. Real-Time Processing
        3. Spouts and Bolts
      2. Amazon Kinesis
        1. Data Streams
        2. Data Firehose
        3. Analytics
      3. Apache Pulsar
        1. Multi-Tenant Architecture
        2. Geo-Replication
        3. Functions