GPU Programming

  1. Parallel Algorithms and Patterns
    1. Fundamental Parallel Patterns
      1. Map Pattern
        1. Element-wise Operations
        2. Embarrassingly Parallel Problems
        3. Implementation Strategies
      2. Reduce Pattern
        1. Parallel Reduction Algorithms
        2. Tree-based Reduction
        3. Warp-level Primitives
      3. Scan Pattern
        1. Prefix Sum Algorithms
        2. Inclusive vs. Exclusive Scan
        3. Applications and Use Cases
      4. Scatter and Gather Patterns
        1. Irregular Memory Access
        2. Data Reorganization
        3. Performance Considerations
    2. Advanced Algorithmic Patterns
      1. Stencil Computations
        1. Finite Difference Methods
        2. Boundary Conditions
        3. Optimization Techniques
      2. Graph Algorithms
        1. Breadth-First Search
        2. Shortest Path Algorithms
        3. Graph Traversal Patterns
      3. Sorting Algorithms
        1. Parallel Sorting Networks
        2. Radix Sort
        3. Merge Sort
      4. Matrix Operations
        1. Matrix Multiplication
        2. Decomposition Algorithms
        3. Sparse Matrix Operations
    3. Optimization Strategies
      1. Load Balancing
        1. Static vs. Dynamic Balancing
        2. Work Stealing
        3. Irregular Workloads
      2. Communication Minimization
        1. Data Locality
        2. Communication-Avoiding Algorithms
        3. Overlapping Communication and Computation
      3. Memory Access Optimization
        1. Tiling Strategies
        2. Cache Blocking
        3. Memory Coalescing