Machine Learning with Apache Spark

  1. Model Evaluation and Hyperparameter Tuning
    1. Evaluation Metrics
      1. Classification Metrics
        1. Accuracy
        2. Precision
        3. Recall
        4. F1-Score
        5. Specificity
        6. Area Under ROC Curve
        7. Area Under Precision-Recall Curve
        8. Confusion Matrix Analysis
        9. Multi-class Metrics
          1. Macro Averaging
          2. Micro Averaging
          3. Weighted Averaging
      2. Regression Metrics
        1. Mean Squared Error
        2. Root Mean Squared Error
        3. Mean Absolute Error
        4. Mean Absolute Percentage Error
        5. R-squared
        6. Adjusted R-squared
        7. Residual Analysis
      3. Clustering Metrics
        1. Silhouette Score
        2. Within-Cluster Sum of Squares
        3. Between-Cluster Sum of Squares
        4. Calinski-Harabasz Index
        5. Davies-Bouldin Index
      4. Ranking Metrics
        1. Precision at K
        2. Recall at K
        3. Mean Average Precision
        4. Normalized Discounted Cumulative Gain
    2. Model Validation Techniques
      1. Train-Test Split
        1. Random Splitting
        2. Stratified Splitting
        3. Time-Based Splitting
      2. Cross-Validation
        1. K-Fold Cross-Validation
        2. Stratified K-Fold
        3. Time Series Cross-Validation
        4. Leave-One-Out Cross-Validation
      3. Train-Validation-Test Split
        1. Three-Way Data Splitting
        2. Validation Set Usage
        3. Final Model Evaluation
    3. Hyperparameter Tuning
      1. Hyperparameter Concepts
        1. Model Parameters vs Hyperparameters
        2. Hyperparameter Space
        3. Tuning Objectives
      2. Search Strategies
        1. Grid Search
          1. ParamGridBuilder Usage
          2. Exhaustive Search
          3. Computational Complexity
        2. Random Search
          1. Random Sampling
          2. Efficiency Benefits
          3. Implementation Approaches
        3. Bayesian Optimization
        4. Evolutionary Algorithms
      3. Cross-Validation Integration
        1. CrossValidator Class
        2. TrainValidationSplit Class
        3. Evaluation Metrics Selection
      4. Advanced Tuning Techniques
        1. Early Stopping
        2. Learning Curves
        3. Validation Curves
        4. Hyperparameter Importance
      5. Overfitting and Underfitting
        1. Bias-Variance Tradeoff
        2. Model Complexity Control
        3. Regularization Techniques
        4. Diagnostic Methods