Feature Engineering for Machine Learning

  1. Data Cleaning and Preparation
    1. Missing Value Handling
      1. Missing Data Assessment
        1. Missingness Patterns
        2. Missing Data Visualization
        3. Missingness Mechanisms
          1. Missing Completely at Random
          2. Missing at Random
          3. Missing Not at Random
      2. Deletion Strategies
        1. Listwise Deletion
        2. Pairwise Deletion
        3. Feature Deletion
      3. Imputation Methods
        1. Simple Imputation
          1. Mean Imputation
          2. Median Imputation
          3. Mode Imputation
          4. Constant Value Imputation
          5. Forward Fill
          6. Backward Fill
        2. Advanced Imputation
          1. K-Nearest Neighbors Imputation
          2. Iterative Imputation
          3. Multiple Imputation
          4. Regression Imputation
          5. Matrix Factorization
      4. Missingness Indicators
        1. Binary Missing Flags
        2. Missing Pattern Features
    2. Outlier Management
      1. Outlier Definition and Causes
        1. Statistical Outliers
        2. Domain-Specific Outliers
        3. Data Entry Errors
        4. Natural Extreme Values
      2. Detection Methods
        1. Statistical Approaches
          1. Z-Score Method
          2. Modified Z-Score
          3. IQR Method
          4. Grubbs Test
        2. Visualization Techniques
          1. Box Plots
          2. Scatter Plots
          3. Residual Plots
        3. Machine Learning Methods
          1. Isolation Forest
          2. Local Outlier Factor
          3. One-Class SVM
          4. DBSCAN Clustering
      3. Treatment Strategies
        1. Removal Techniques
        2. Transformation Methods
        3. Capping and Winsorization
        4. Robust Statistics
        5. Separate Category Creation