Data Science
1. Foundations of Data Science
2. Mathematical and Statistical Foundations
3. Computational Foundations and Tools
4. Data Acquisition and Management
5. Exploratory Data Analysis
6. Feature Engineering and Selection
7. Machine Learning Fundamentals
8. Advanced Machine Learning Topics
9. Big Data and Distributed Computing
10. Data Visualization and Communication
11. Model Deployment and MLOps
12. Ethics and Responsible AI
Big Data and Distributed Computing
Big Data Concepts
The Four Vs of Big Data
Volume
Velocity
Variety
Veracity
Big Data Challenges
Storage
Processing
Analysis
Visualization
Big Data Architecture
Lambda Architecture
Kappa Architecture
Data Lake vs Data Warehouse
Distributed Computing Fundamentals
Parallel vs Distributed Computing
CAP Theorem
Consistency Models
Fault Tolerance
Load Balancing
Hadoop Ecosystem
Hadoop Distributed File System
Architecture
Data Replication
Fault Tolerance
MapReduce
Programming Model
Job Execution
Optimization Techniques
YARN
Resource Management
Application Scheduling
Hadoop Ecosystem Tools
Hive
Pig
HBase
Sqoop
Flume
Apache Spark
Spark Architecture
Driver and Executors
Cluster Managers
Memory Management
Resilient Distributed Datasets
RDD Operations
Transformations vs Actions
Lazy Evaluation
Caching and Persistence
Spark SQL
DataFrames and Datasets
Catalyst Optimizer
SQL Interface
Spark MLlib
Machine Learning Pipelines
Feature Engineering
Model Training and Evaluation
Spark Streaming
DStreams
Structured Streaming
Real-time Processing
NoSQL Databases
Document Stores
MongoDB
CouchDB
Key-Value Stores
Redis
Amazon DynamoDB
Column-Family Stores
Apache Cassandra
HBase
Graph Databases
Neo4j
Amazon Neptune
Stream Processing
Apache Kafka
Topics and Partitions
Producers and Consumers
Stream Processing
Apache Storm
Topology Design
Spouts and Bolts
Apache Flink
DataStream API
Event Time Processing
Windowing