Big Data Technologies
1. Introduction to Big Data
2. Core Principles of Distributed Systems
3. The Hadoop Ecosystem
4. Modern Data Processing with Apache Spark
5. Stream Processing Technologies
6. NoSQL Databases
7. Data Warehousing and Analytics on Big Data
8. Cloud-Based Big Data Platforms
9. Supporting Ecosystem and Tools
10. Big Data Governance and Security
11. Performance Optimization and Best Practices
12. Emerging Trends and Future Directions
Stream Processing Technologies
Fundamentals of Data Streaming
Bounded vs. Unbounded Data
Batch vs. Streaming
Data Characteristics
Event Time vs. Processing Time
Timestamp Semantics
Watermarks
Late Data Handling
Out-of-Order Events
Windowing Operations
Tumbling Windows
Sliding Windows
Session Windows
Custom Windows
Stream Processing Patterns
Filtering
Aggregation
Joining Streams
Pattern Detection
Apache Kafka
Core Concepts
Topics and Partitions
Topic Creation
Partitioning Strategies
Partition Keys
Producers and Consumers
Producer API
Consumer Groups
Consumer Offsets
Brokers and Clusters
Broker Roles
Cluster Coordination
Leader Election
ZooKeeper's Role
Metadata Management
Configuration Management
Coordination Services
Kafka Architecture
Log-Structured Storage
Message Retention
Offset Management
Compaction
Consumer Groups
Load Balancing
Partition Assignment
Rebalancing
Delivery Semantics
At Most Once
At Least Once
Exactly Once
Idempotent Producers
Kafka Configuration and Tuning
Broker Configuration
Producer Configuration
Consumer Configuration
Performance Tuning
Kafka Ecosystem
Kafka Connect
Source Connectors
Sink Connectors
Connector Development
Kafka Streams
Stream Processing API
State Stores
Topology Design
ksqlDB
SQL-Based Stream Processing
Materialized Views
Kafka Operations
Cluster Management
Monitoring and Metrics
Security Configuration
Disaster Recovery
Other Stream Processing Frameworks
Apache Flink
Event-Driven Processing
State Management
Checkpointing
Watermarks and Windows
Apache Storm
Topology Design
Real-Time Processing
Spouts and Bolts
Amazon Kinesis
Data Streams
Data Firehose
Analytics
Apache Pulsar
Multi-Tenant Architecture
Geo-Replication
Functions