Machine Learning with Apache Spark

  1. Building Machine Learning Pipelines
    1. The ML Pipeline Concept
      1. Motivation for Pipelines
      2. Standardization of ML Workflow
      3. Benefits of Pipeline Abstraction
      4. Pipeline vs Traditional Approaches
    2. Key Pipeline Components
      1. Transformer
        1. Definition and Interface
        2. Built-in Transformers
        3. Custom Transformer Creation
      2. Estimator
        1. Definition and Interface
        2. Built-in Estimators
        3. Custom Estimator Creation
      3. Parameter Management
        1. Parameter Definition
        2. Parameter Usage
        3. Parameter Validation
        4. Parameter Maps
          1. Creating Parameter Maps
          2. Overriding Default Parameters
          3. Parameter Grid Construction
    3. Assembling a Pipeline
      1. Defining Pipeline Stages
      2. Sequential Stage Arrangement
      3. Stage Dependencies
      4. Conditional Stages
    4. Pipeline Fitting Process
      1. Training Data Requirements
      2. Model Training Workflow
      3. Pipeline State Management
    5. Making Predictions
      1. Transforming New Data
      2. Batch Prediction
      3. Streaming Prediction
    6. Pipeline Persistence
      1. Saving Pipelines
      2. Loading Pipelines
      3. Version Management
    7. Advanced Pipeline Concepts
      1. Nested Pipelines
      2. Pipeline Branching
      3. Dynamic Pipeline Construction
      4. Pipeline Debugging and Monitoring