Data Acquisition and Management
Data Sources
  Internal Data Sources
    Transactional Systems
    Customer Relationship Management Systems
    Enterprise Resource Planning Systems
    Log Files
  External Data Sources
    Public Datasets
    Commercial Data Providers
    Government Data
    Social Media Data
    Web Data
  Real-time Data Sources
    Streaming Data
    IoT Sensors
    API Feeds
Data Collection Methods
  Surveys and Questionnaires
    Design Principles
    Sampling Methods
    Response Bias
  Observational Studies
    Structured Observation
    Unstructured Observation
    Ethical Considerations
  Experiments
    Controlled Experiments
    Natural Experiments
    Quasi-experiments
  Web Scraping
    HTML Parsing
    CSS Selectors
    XPath
    JavaScript Rendering
    Rate Limiting
    Legal and Ethical Considerations
    Tools and Libraries
      BeautifulSoup
      Scrapy
      Selenium
  APIs
    REST APIs
    GraphQL APIs
    Authentication Methods
    Rate Limiting
    Error Handling
    API Documentation
File Formats and Data Storage
  Structured Data Formats
    CSV
    TSV
    Excel
    JSON
    XML
    Parquet
    Avro
    ORC
  Unstructured Data Formats
    Text Files
    Images
    Audio
    Video
  Database Systems
    Relational Databases
      MySQL
      PostgreSQL
      SQLite
      Oracle
      SQL Server
    NoSQL Databases
      Document Stores
        MongoDB
        CouchDB
      Key-Value Stores
        Redis
        DynamoDB
      Column-Family
        Cassandra
        HBase
      Graph Databases
        Neo4j
        Amazon Neptune
Data Quality Assessment
  Data Quality Dimensions
    Accuracy
    Completeness
    Consistency
    Timeliness
    Validity
    Uniqueness
  Data Profiling
    Statistical Summaries
    Pattern Analysis
    Relationship Discovery
    Anomaly Detection
  Data Quality Metrics
    Missing Value Rates
    Duplicate Rates
    Outlier Detection
    Format Consistency
Data Cleaning and Preprocessing
  Handling Missing Values
    Missing Data Mechanisms
      Missing Completely at Random
      Missing at Random
      Missing Not at Random
    Imputation Techniques
      Mean Imputation
      Median Imputation
      Mode Imputation
      Forward Fill
      Backward Fill
      Interpolation
      Predictive Imputation
    Deletion Strategies
      Listwise Deletion
      Pairwise Deletion
  Outlier Detection and Treatment
    Statistical Methods
      Z-score Method
      IQR Method
      Modified Z-score
    Visualization Methods
      Box Plots
      Scatter Plots
      Histograms
    Machine Learning Methods
      Isolation Forest
      Local Outlier Factor
      One-Class SVM
    Treatment Strategies
      Removal
      Transformation
      Capping
      Binning
  Data Type Conversion
    Numeric Conversions
    String Conversions
    Date and Time Conversions
    Boolean Conversions
  Text Data Cleaning
    Case Normalization
    Whitespace Removal
    Special Character Handling
    Encoding Issues
  Duplicate Detection and Removal
    Exact Duplicates
    Fuzzy Duplicates
    Record Linkage
Data Integration
  Data Merging Strategies
    Horizontal Merging
    Vertical Merging
    Key-based Merging
  Schema Matching
    Attribute Correspondence
    Data Type Alignment
    Value Standardization
  Entity Resolution
    Record Matching
    Deduplication
    Identity Resolution
  Data Transformation
    Format Standardization
    Unit Conversion
    Coordinate System Transformation
Data Governance and Ethics
  Data Governance Framework
    Data Stewardship
    Data Policies
    Data Standards
    Data Lineage
  Privacy and Security
    Data Anonymization
    Data Masking
    Encryption
    Access Controls
  Regulatory Compliance
    GDPR
    CCPA
    HIPAA
    SOX