Statistics for Data Science
1. Foundations of Data and Statistics
2. Descriptive Statistics: Summarizing Data
3. Fundamentals of Probability
4. Probability Distributions
5. Inferential Statistics: From Samples to Populations
6. Hypothesis Testing
7. Regression Analysis for Prediction
8. Advanced and Modern Statistical Methods
Descriptive Statistics: Summarizing Data
Measures of Central Tendency
Mean
Arithmetic Mean
Calculation for Ungrouped Data
Calculation for Grouped Data
Weighted Mean
Properties of the Mean
Sensitivity to Outliers
Mathematical Properties
When to Use vs. Avoid
Alternative Means
Geometric Mean
Harmonic Mean
Trimmed Mean
Median
Calculation Methods
Calculation for Odd Data Sets
Calculation for Even Data Sets
Interpolation Methods
Properties of the Median
Robustness to Outliers
Positional Nature
When to Prefer Over Mean
Mode
Identification Methods
Unimodal Distributions
Bimodal Distributions
Multimodal Distributions
Applications of Mode
Categorical Data Analysis
Peak Identification
Distribution Shape Assessment
Choosing Appropriate Measures
Data Type Considerations
Distribution Shape Impact
Outlier Presence
Business Context Relevance
Measures of Variability and Dispersion
Range
Calculation and Interpretation
Limitations and Weaknesses
When Range is Useful
Interquartile Range (IQR)
Calculation Steps
First Quartile (Q1)
Third Quartile (Q3)
IQR Computation
Use in Outlier Detection
IQR Rule for Outliers
Box Plot Construction
Robustness Properties
Variance
Population Variance
Formula and Calculation
Degrees of Freedom Concept
Sample Variance
Bessel's Correction
Unbiased Estimation
Units and Interpretation
Squared Units Problem
Relative Magnitude Assessment
Standard Deviation
Relationship to Variance
Square Root Transformation
Unit Restoration
Interpretation in Context
Typical Deviation from Mean
Distribution Spread Assessment
Population vs. Sample Standard Deviation
Coefficient of Variation
Calculation and Formula
Use Cases and Applications
Relative Variability Comparison
Scale-Independent Comparison
Comparing Variability Across Datasets
Different Units Handling
Different Scales Normalization
Mean Absolute Deviation
Calculation and Properties
Comparison with Standard Deviation
Robustness Characteristics
Measures of Position
Percentiles
Definition and Concept
Calculation Methods
Linear Interpolation
Nearest Rank Method
Interpretation and Applications
Performance Benchmarking
Distribution Analysis
Applications in Data Science
Feature Scaling
Outlier Detection Thresholds
Quartiles
First Quartile (Q1)
25th Percentile
Lower Quartile Interpretation
Second Quartile (Q2)
Median Relationship
50th Percentile
Third Quartile (Q3)
75th Percentile
Upper Quartile Interpretation
Five-Number Summary
Minimum Value
Q1, Median, Q3
Maximum Value
Box Plot Foundation
Z-scores (Standard Scores)
Standardization Formula
Calculation Process
Interpretation Guidelines
Distance from Mean
Standard Deviation Units
Applications
Identifying Outliers
Comparing Across Distributions
Data Normalization
Deciles and Other Quantiles
Decile Calculations
Custom Quantile Selection
Business Applications
Understanding Data Shape
Skewness
Definition and Measurement
Positive Skew (Right-Skewed)
Characteristics and Examples
Tail Direction
Mean vs. Median Relationship
Negative Skew (Left-Skewed)
Characteristics and Examples
Tail Direction
Mean vs. Median Relationship
Symmetrical Distributions
Impact on Statistical Measures
Central Tendency Measures
Variability Measures
Inference Implications
Skewness Coefficients
Pearson's Skewness
Sample Skewness Formula
Kurtosis
Definition and Measurement
Types of Kurtosis
Leptokurtic Distributions
Mesokurtic Distributions
Platykurtic Distributions
Excess Kurtosis
Comparison to Normal Distribution
Interpretation Guidelines
Interpretation in Data Analysis
Tail Behavior Assessment
Outlier Propensity
Risk Assessment Applications
Distribution Comparison
Normal Distribution Benchmarking
Empirical vs. Theoretical Distributions
Goodness-of-Fit Assessment
Data Visualization for EDA
Histograms
Construction Principles
Bin Selection Strategies
Frequency vs. Density
Choosing Bin Widths
Sturges' Rule
Scott's Rule
Freedman-Diaconis Rule
Interpretation Guidelines
Shape Assessment
Outlier Identification
Distribution Comparison
Box Plots (Box-and-Whisker Plots)
Components of a Box Plot
Box Construction
Whisker Calculation
Outlier Marking
Variations
Notched Box Plots
Violin Plots
Multiple Box Plots
Identifying Outliers
IQR Method
Visual Identification
Statistical vs. Practical Outliers
Bar Charts
Categorical Data Visualization
Frequency Representation
Proportion Display
Chart Variations
Grouped Bar Charts
Stacked Bar Charts
Horizontal vs. Vertical
Best Practices
Ordering Strategies
Color Usage
Label Clarity
Scatter Plots
Construction Principles
Variable Assignment
Point Representation
Visualizing Relationships
Linear Relationships
Non-Linear Patterns
No Relationship Patterns
Detecting Correlation and Patterns
Positive Correlation
Negative Correlation
Correlation Strength Assessment
Enhancements
Color Coding
Size Mapping
Trend Lines
Density Plots
Kernel Density Estimation
Bandwidth Selection
Kernel Function Types
Smoothing Concepts
Comparison with Histograms
Continuous vs. Discrete Representation
Smoothness vs. Granularity
Interpretation Differences
Multiple Distribution Comparison
Additional Visualization Types
Stem-and-Leaf Plots
Dot Plots
Q-Q Plots
Heat Maps for Correlation