Machine Learning with Apache Spark

  1. Unsupervised Learning with Spark ML
    1. Clustering Algorithms
      1. K-Means Clustering
        1. Algorithm Overview
        2. Initialization Methods
          1. Random Initialization
          2. K-Means++
        3. Distance Metrics
        4. Convergence Criteria
        5. Choosing Optimal K
          1. Elbow Method
          2. Silhouette Analysis
        6. Cluster Evaluation
        7. Handling Large Datasets
      2. Latent Dirichlet Allocation
        1. Topic Modeling Fundamentals
        2. Probabilistic Model
        3. Dirichlet Distribution
        4. Gibbs Sampling
        5. Variational Inference
        6. Hyperparameter Tuning
          1. Alpha Parameter
          2. Beta Parameter
          3. Number of Topics
        7. Topic Interpretation
      3. Gaussian Mixture Model
        1. Mixture Model Concept
        2. Soft Clustering Approach
        3. Expectation-Maximization Algorithm
        4. Component Selection
        5. Probability Assignments
        6. Model Selection Criteria
      4. Bisecting K-Means
        1. Hierarchical Clustering Approach
        2. Divisive Clustering
        3. Tree Structure
        4. Stopping Criteria
    2. Dimensionality Reduction
      1. Principal Component Analysis
        1. Covariance Matrix Computation
        2. Eigenvalue Decomposition
        3. Principal Component Selection
        4. Variance Explained
        5. Feature Space Reduction
        6. Interpreting Components
        7. Reconstruction Error
      2. Singular Value Decomposition
        1. Matrix Factorization Approach
        2. Low-Rank Approximation
        3. Computational Efficiency
        4. Applications in ML
    3. Association Rule Mining
      1. FP-Growth Algorithm
        1. Frequent Pattern Mining
        2. FP-Tree Construction
        3. Conditional Pattern Bases
        4. Support Threshold
        5. Confidence Threshold
        6. Lift Calculation
        7. Association Rule Generation
        8. Market Basket Analysis