Apache Spark

1. Introduction to Apache Spark
2. Core Spark Concepts
3. Spark Architecture and Execution
4. Spark SQL and Structured APIs
5. Structured Streaming
6. Machine Learning with MLlib
7. Graph Processing with GraphX
8. Performance Tuning and Optimization

3. Spark Architecture and Execution
  1. Job Execution Model
    1. Job Lifecycle
      1. Job Definition
      2. Job Submission
      3. Job Completion
    2. Stage Creation
      1. Stage Boundaries
      2. Shuffle Dependencies
      3. Stage Scheduling
    3. Task Management
      1. Task Creation
      2. Task Assignment
      3. Task Execution
      4. Task Recovery
  2. Directed Acyclic Graph
    1. DAG Construction
      1. Transformation Graph Building
      2. Dependency Analysis
    2. DAG Scheduler
      1. Stage Division Logic
      2. Optimization Strategies
      3. Fault Recovery Planning
    3. Task Scheduling
      1. Task Scheduler Components
      2. Task Assignment Algorithms
      3. Locality Preferences
      4. Resource Allocation
    4. Execution Flow
      1. Task Serialization
      2. Result Collection
      3. Failure Handling
  3. Cluster Deployment Options
    1. Standalone Cluster Mode
      1. Master-Worker Architecture
      2. Resource Management
      3. Configuration Options
    2. YARN Integration
      1. Resource Manager Integration
      2. Container Management
      3. Security Features
    3. Mesos Integration
      1. Framework Registration
      2. Resource Offers
      3. Fine-Grained vs Coarse-Grained
    4. Kubernetes Integration
      1. Pod Management
      2. Dynamic Allocation
      3. Container Orchestration
  4. Deployment Modes
    1. Client Mode
      1. Driver Location
      2. Network Requirements
      3. Use Case Scenarios
    2. Cluster Mode
      1. Driver Deployment
      2. Resource Isolation
      3. Production Considerations

© 2025 Useful Links. All rights reserved.
