Machine Learning with Scikit-Learn

  1. Building Machine Learning Pipelines
    1. Pipeline Concepts
      1. Benefits of Pipelines
      2. Data Leakage Prevention
      3. Code Organization
      4. Reproducibility
    2. The Pipeline Class
      1. Creating Pipelines
      2. Pipeline Steps
      3. Chaining Transformers and Estimators
      4. Named Steps
      5. Pipeline Methods
    3. The ColumnTransformer
      1. Column-specific Transformations
      2. Handling Mixed Data Types
      3. Remainder Parameter
      4. Sparse Matrix Handling
    4. Feature Union
      1. Parallel Feature Processing
      2. Combining Features
    5. Pipeline with Preprocessing
      1. Standard Preprocessing Pipeline
      2. Categorical and Numerical Features
      3. Missing Value Handling
    6. Pipeline with Feature Selection
      1. Integrated Feature Selection
      2. SelectFromModel
      3. RFE in Pipelines
    7. Pipeline Hyperparameter Tuning
      1. Parameter Naming Conventions
      2. Grid Search with Pipelines
      3. Nested Cross-Validation
    8. Custom Transformers
      1. Creating Custom Transformers
      2. BaseEstimator and TransformerMixin
      3. fit and transform Methods
    9. Pipeline Persistence
      1. Saving Complete Pipelines
      2. Loading and Using Saved Pipelines
    10. Pipeline Debugging
      1. Intermediate Results
      2. Step-by-step Execution
      3. Memory Usage