Core Concepts of Data Quality
Dimensions of Data Quality
  Accuracy
    Correctness of Values
    Precision of Measurements
    Factual Accuracy
  Completeness
    Missing Value Assessment
    Coverage Analysis
    Required Field Completeness
  Consistency
    Internal Consistency
    Cross-Field Consistency
    Format Consistency
  Timeliness
    Currency of Data
    Freshness Requirements
    Update Frequency
  Uniqueness
    Duplicate Detection
    Entity Resolution
    Record Deduplication
  Validity
    Format Validation
    Range Validation
    Business Rule Compliance
  Relevance
    Business Context Alignment
    Use Case Appropriateness
    Feature Relevance
  Integrity
    Referential Integrity
    Entity Integrity
    Domain Integrity
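
Several of the dimensions above (completeness, uniqueness, validity) lend themselves to simple automated checks. The following is a minimal sketch in plain Python, not a definitive implementation. The sample records, the field names, the 0 to 120 age range, and the deliberately loose email pattern are all assumptions made for illustration.

```python
import re

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": 2, "email": None, "age": 29},
    {"id": 2, "email": "bo@example.com", "age": 210},
    {"id": 4, "email": "not-an-email", "age": 41},
]


def completeness(rows, field):
    """Completeness: share of rows where the field is present and not None."""
    filled = sum(1 for r in rows if r.get(field) is not None)
    return filled / len(rows)


def duplicate_keys(rows, key):
    """Uniqueness: key values that occur more than once."""
    seen, dupes = set(), set()
    for r in rows:
        value = r[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


# Assumed, intentionally loose pattern: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def invalid_rows(rows):
    """Validity: ids of rows failing the format check (email) or range check (age)."""
    bad = []
    for r in rows:
        # A missing email is a completeness issue, so it is not counted here.
        email_ok = r["email"] is None or bool(EMAIL_RE.match(r["email"]))
        age_ok = 0 <= r["age"] <= 120  # assumed plausible range
        if not (email_ok and age_ok):
            bad.append(r["id"])
    return bad


print(completeness(records, "email"))  # 0.75
print(duplicate_keys(records, "id"))   # {2}
print(invalid_rows(records))           # [2, 4]
```

Each check is kept as its own function so that a missing value, a duplicate key, and an invalid value are reported separately, since they usually call for different fixes.
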
Data Profiling and Assessment
  Purpose of Data Profiling
    Initial Data Exploration
    Data Discovery Process
  Summary Statistics Generation
    Central Tendency Measures
      Mean
      Median
      Mode
    Dispersion Measures
      Standard Deviation
      Variance
      Range
      Interquartile Range
    Distribution Characteristics
      Skewness
      Kurtosis
      Percentiles
    Frequency Analysis
      Value Counts
      Frequency Distributions
      Categorical Frequencies
  Data Visualization for Inspection
    Univariate Visualizations
      Histograms
      Box Plots
      Density Plots
    Bivariate Visualizations
      Scatter Plots
      Correlation Heatmaps
      Cross-Tabulations
    Multivariate Visualizations
      Pair Plots
      Parallel Coordinates
      Dimensionality Reduction Plots
  Data Type and Structure Analysis
    Identifying Data Types
    Schema Validation
    Structural Patterns
    Nested Data Structures
  Anomaly and Pattern Detection
    Statistical Anomalies
    Pattern Recognition
    Trend Analysis
    Seasonal Patterns
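
The summary statistics, frequency analysis, and statistical anomaly topics above can be tied together in one small profiling pass. The sketch below uses only Python's standard `statistics` and `collections` modules. The sample column is hypothetical, and the 1.5 × IQR fence is one common convention for flagging outliers, assumed here rather than prescribed by the outline.

```python
import statistics
from collections import Counter

# Hypothetical numeric column containing one suspicious value (250).
values = [13, 15, 14, 13, 15, 16, 14, 15, 250]


def profile(xs):
    """Central tendency and dispersion measures for one numeric column."""
    # method="inclusive" treats the data as the full population of interest.
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "mode": statistics.mode(xs),
        "stdev": statistics.stdev(xs),        # sample standard deviation
        "variance": statistics.variance(xs),  # sample variance
        "range": max(xs) - min(xs),
        "iqr": q3 - q1,
    }


def iqr_outliers(xs, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if x < low or x > high]


# Frequency analysis: value counts for the same column.
counts = Counter(values)

summary = profile(values)
print(summary["median"], summary["mode"], summary["iqr"])  # 15 15 1.0
print(counts.most_common(1))                               # [(15, 3)]
print(iqr_outliers(values))                                # [250]
```

In this sample the mean is pulled far above the median by the single extreme value, which is why profiling typically reports both and why the outlier check is based on quartiles rather than on the mean.
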