Statistics for Data Science

  1. Probability Distributions
    1. Random Variables
      1. Definition and Concept
        1. Mapping from Sample Space to Real Numbers
        2. Notation and Conventions
        3. Types of Random Variables
      2. Discrete Random Variables
        1. Definition and Characteristics
          1. Countable Outcomes
          2. Finite or Countably Infinite Range
        2. Probability Mass Function (PMF)
          1. Definition and Properties
          2. PMF Requirements
          3. Graphical Representation
        3. Cumulative Distribution Function (CDF)
          1. Definition for Discrete Variables
          2. Step Function Characteristics
          3. Relationship to PMF
        4. Expected Value (Mean)
          1. Definition and Calculation
          2. Linearity of Expectation
          3. Interpretation
        5. Variance and Standard Deviation
          1. Definition and Calculation
          2. Alternative Formulas
      3. Continuous Random Variables
        1. Definition and Characteristics
          1. Uncountable Outcomes
          2. Continuous Range
        2. Probability Density Function (PDF)
          1. Definition and Properties
          2. Area Under Curve Interpretation
          3. PDF Requirements
        3. Cumulative Distribution Function (CDF)
          1. Definition for Continuous Variables
          2. Relationship to PDF
          3. Properties and Characteristics
        4. Expected Value (Mean)
          1. Integration Formula
          2. Interpretation
        5. Variance and Standard Deviation
          1. Integration Formulas
      4. Functions of Random Variables
        1. Linear Transformations
        2. Expected Value of Functions
        3. Variance of Transformations
    2. Discrete Probability Distributions
      1. Bernoulli Distribution
        1. Definition and Context
          1. Single Trial Success/Failure
          2. Parameter p
        2. Probability Mass Function
        3. Properties
          1. Mean (Expected Value)
          2. Variance
          3. Standard Deviation
        4. Applications
          1. Binary Outcomes
          2. Foundation for Other Distributions
      2. Binomial Distribution
        1. Definition and Context
          1. Fixed Number of Independent Trials
          2. Constant Success Probability
        2. Parameters
          1. Number of Trials (n)
          2. Success Probability (p)
        3. Probability Mass Function
          1. Formula and Calculation
          2. Combinatorial Component
        4. Properties
          1. Mean (np)
          2. Variance (np(1-p))
          3. Standard Deviation
        5. Applications
          1. Quality Control
          2. Survey Sampling
          3. A/B Testing
        6. Normal Approximation
          1. Conditions for Approximation
          2. Continuity Correction
      3. Poisson Distribution
        1. Definition and Context
          1. Rare Events in Fixed Intervals
          2. Rate Parameter λ
        2. Probability Mass Function
          1. Formula and Calculation
          2. e^(-λ) Component
        3. Properties
          1. Mean (λ)
          2. Variance (λ)
          3. Standard Deviation
        4. Poisson Process
          1. Assumptions and Conditions
          2. Time/Space Intervals
        5. Poisson Approximation to Binomial
          1. Conditions (large n, small p)
          2. λ = np Relationship
        6. Applications
          1. Call Center Arrivals
          2. Defect Counting
          3. Website Traffic
      4. Geometric Distribution
        1. Definition and Context
          1. First Success in Sequence
          2. Independent Trials
        2. Probability Mass Function
          1. Formula and Calculation
          2. (1-p)^(k-1) * p Structure
        3. Properties
          1. Mean (1/p)
          2. Variance ((1-p)/p²)
          3. Memoryless Property
        4. Applications
          1. Waiting Time Problems
          2. Reliability Analysis
      5. Negative Binomial Distribution
        1. Definition and Context
        2. Parameters and PMF
        3. Relationship to Geometric Distribution
      6. Hypergeometric Distribution
        1. Definition and Context
        2. Sampling Without Replacement
        3. Parameters and PMF
    3. Continuous Probability Distributions
      1. Uniform Distribution
        1. Definition and Context
          1. Equal Probability Over Interval
          2. Parameters a and b
        2. Probability Density Function
          1. Rectangular Shape
          2. Height = 1/(b-a)
        3. Cumulative Distribution Function
          1. Linear Function
          2. Calculation Methods
        4. Properties
          1. Mean ((a+b)/2)
          2. Variance ((b-a)²/12)
        5. Applications
          1. Random Number Generation
          2. Modeling Uncertainty
      2. Normal (Gaussian) Distribution
        1. Definition and Importance
          1. Bell-Shaped Curve
          2. Central Limit Theorem Connection
        2. Parameters
          1. Mean (μ)
          2. Standard Deviation (σ)
        3. Probability Density Function
          1. Mathematical Formula
          2. Exponential Component
        4. Properties and Characteristics
          1. Symmetry About Mean
          2. Inflection Points
          3. Asymptotic Behavior
        5. The Standard Normal Distribution
          1. Z-Distribution (μ=0, σ=1)
          2. Standardization Process
          3. Z-Score Calculation
          4. Z-Table Usage
            1. Reading Probabilities
            2. Finding Critical Values
            3. Interpolation Methods
        6. The 68-95-99.7 Rule
          1. Empirical Rule Statement
            1. One Standard Deviation (68%)
            2. Two Standard Deviations (95%)
            3. Three Standard Deviations (99.7%)
          2. Applications in Quality Control
          3. Outlier Detection Guidelines
        7. Applications
          1. Natural Phenomena Modeling
          2. Measurement Errors
          3. Statistical Inference Foundation
      3. Exponential Distribution
        1. Definition and Context
          1. Continuous Analog of Geometric
          2. Rate Parameter λ
        2. Probability Density Function
          1. λe^(-λx) Formula
          2. Decreasing Function
        3. Cumulative Distribution Function
          1. 1 - e^(-λx) Formula
        4. Properties
          1. Mean (1/λ)
          2. Variance (1/λ²)
          3. Memoryless Property
            1. Mathematical Expression
            2. Practical Implications
        5. Applications
          1. Reliability Engineering
          2. Queuing Theory
          3. Survival Analysis
      4. t-Distribution (Student's t)
        1. Definition and Context
          1. Small Sample Inference
          2. Unknown Population Variance
        2. Degrees of Freedom Parameter
          1. Relationship to Sample Size
          2. Effect on Distribution Shape
        3. Properties
          1. Symmetry About Zero
          2. Heavier Tails than Normal
          3. Convergence to Normal
        4. Comparison with Normal Distribution
          1. Shape Differences
                                                                                                                                                                                                                                                                                        1. Tail Behavior
                                                                                                                                                                                                                                                                                          1. When to Use Each
                                                                                                                                                                                                                                                                                          2. Applications
                                                                                                                                                                                                                                                                                            1. Confidence Intervals
                                                                                                                                                                                                                                                                                              1. Hypothesis Testing
                                                                                                                                                                                                                                                                                                1. Small Sample Problems
                                                                                                                                                                                                                                                                                              2. Chi-Square Distribution
                                                                                                                                                                                                                                                                                                1. Definition and Context
                                                                                                                                                                                                                                                                                                  1. Degrees of Freedom
                                                                                                                                                                                                                                                                                                    1. Properties and Shape
                                                                                                                                                                                                                                                                                                      1. Applications in Testing
                                                                                                                                                                                                                                                                                                      2. F-Distribution
                                                                                                                                                                                                                                                                                                        1. Definition and Context
                                                                                                                                                                                                                                                                                                          1. Two Degrees of Freedom Parameters
                                                                                                                                                                                                                                                                                                            1. Applications in ANOVA
                                                                                                                                                                                                                                                                                                          2. The Central Limit Theorem (CLT)
                                                                                                                                                                                                                                                                                                            1. Statement and Mathematical Formulation
                                                                                                                                                                                                                                                                                                              1. Sample Mean Distribution
                                                                                                                                                                                                                                                                                                                1. Standardization Formula
                                                                                                                                                                                                                                                                                                                  1. Convergence to Normality
                                                                                                                                                                                                                                                                                                                  2. Conditions and Assumptions
                                                                                                                                                                                                                                                                                                                    1. Independent Observations
                                                                                                                                                                                                                                                                                                                      1. Identically Distributed
                                                                                                                                                                                                                                                                                                                        1. Finite Variance Requirement
                                                                                                                                                                                                                                                                                                                        2. Intuition and Explanation
                                                                                                                                                                                                                                                                                                                          1. Averaging Effect
                                                                                                                                                                                                                                                                                                                            1. Cancellation of Deviations
                                                                                                                                                                                                                                                                                                                              1. Universal Phenomenon
                                                                                                                                                                                                                                                                                                                              2. Sample Size Considerations
                                                                                                                                                                                                                                                                                                                                1. Rule of Thumb (n ≥ 30)
                                                                                                                                                                                                                                                                                                                                  1. Population Distribution Impact
                                                                                                                                                                                                                                                                                                                                    1. Skewness and Required Sample Size
                                                                                                                                                                                                                                                                                                                                    2. Significance and Applications
                                                                                                                                                                                                                                                                                                                                      1. Foundation of Inferential Statistics
                                                                                                                                                                                                                                                                                                                                        1. Confidence Interval Construction
                                                                                                                                                                                                                                                                                                                                          1. Hypothesis Testing Basis
                                                                                                                                                                                                                                                                                                                                            1. Quality Control Applications
                                                                                                                                                                                                                                                                                                                                            2. Sampling Distribution of the Mean
                                                                                                                                                                                                                                                                                                                                              1. Mean of Sample Means
                                                                                                                                                                                                                                                                                                                                                1. Standard Error Formula
                                                                                                                                                                                                                                                                                                                                                  1. Relationship to Population Parameters
                                                                                                                                                                                                                                                                                                                                                  2. Law of Large Numbers
                                                                                                                                                                                                                                                                                                                                                    1. Weak Law of Large Numbers
                                                                                                                                                                                                                                                                                                                                                      1. Strong Law of Large Numbers
                                                                                                                                                                                                                                                                                                                                                        1. Relationship to CLT
                                                                                                                                                                                                                                                                                                                                                          1. Practical Implications
                                                                                                                                                                                                                                                                                                                                                          2. Extensions and Variations
                                                                                                                                                                                                                                                                                                                                                            1. CLT for Proportions
                                                                                                                                                                                                                                                                                                                                                              1. CLT for Other Statistics
                                                                                                                                                                                                                                                                                                                                                                1. Finite Population Correction