Statistical Computing

  1. Data Management and Manipulation
    1. Data Import and Export
      1. File Formats
        1. Flat Files
          1. CSV Files
          2. TSV Files
          3. Fixed-Width Files
          4. Handling Delimiters and Encodings
        2. Binary Formats
          1. RData and RDS Files
          2. Pickle Files
          3. HDF5 Format
        3. Spreadsheet Files
          1. Excel Files
          2. OpenDocument Spreadsheets
      2. Database Connectivity
        1. Relational Databases
          1. SQL Fundamentals
          2. Database Connections
          3. Querying Data
          4. Joins and Relationships
        2. NoSQL Databases
          1. Document Stores
          2. Key-Value Stores
          3. Graph Databases
      3. Web Data Sources
        1. APIs and Web Services
          1. REST APIs
          2. Authentication
          3. Rate Limiting
        2. Web Scraping
          1. HTML Parsing
          2. Handling Dynamic Content
        3. Data Formats
          1. JSON Data
          2. XML Data
          3. Web APIs
    2. Data Quality and Cleaning
      1. Data Quality Assessment
        1. Completeness
        2. Accuracy
        3. Consistency
        4. Validity
      2. Missing Data Handling
        1. Types of Missing Data
          1. Missing Completely at Random (MCAR)
          2. Missing at Random (MAR)
          3. Missing Not at Random (MNAR)
        2. Missing Data Patterns
        3. Imputation Methods
          1. Mean/Median Imputation
          2. Forward/Backward Fill
          3. Multiple Imputation
          4. Model-Based Imputation
        4. Deletion Methods
          1. Listwise Deletion
          2. Pairwise Deletion
      3. Data Validation and Error Detection
        1. Outlier Detection
          1. Statistical Methods
          2. Visualization Methods
          3. Robust Statistics
        2. Data Consistency Checks
          1. Range and Format Validation
          2. Duplicate Detection and Removal
      4. Data Standardization
        1. Data Type Conversion
        2. Text Standardization
          1. Case Normalization
          2. String Cleaning
          3. Regular Expressions
        3. Date and Time Parsing
        4. Encoding Issues
    3. Data Transformation and Manipulation
      1. Data Selection and Filtering
        1. Row Selection
          1. Logical Filtering
          2. Random Sampling
          3. Stratified Sampling
        2. Column Selection
          1. By Name
          2. By Position
          3. By Data Type
      2. Variable Creation and Modification
        1. Calculated Fields
        2. Conditional Variables
        3. Recoding Variables
          1. Numeric Recoding
          2. Categorical Recoding
          3. Binning and Discretization
      3. Data Aggregation
        1. Grouping Operations
        2. Summary Statistics
        3. Custom Aggregation Functions
      4. Function Application
        1. Element-wise Functions
        2. Row-wise Functions
        3. Column-wise Functions
        4. Grouped Functions
    4. Data Reshaping and Restructuring
      1. Data Format Concepts
        1. Wide vs. Long Format
        2. Tidy Data Principles
      2. Reshaping Operations
        1. Pivoting
          1. Pivot Longer
          2. Pivot Wider
        2. Melting and Casting
        3. Transposition
      3. Data Combination
        1. Joining Datasets
          1. Inner Joins
          2. Left Joins
          3. Right Joins
          4. Full Outer Joins
          5. Cross Joins
        2. Concatenation
          1. Row Binding
          2. Column Binding
        3. Set Operations
          1. Union
          2. Intersection
          3. Difference