GPU Programming
1. Introduction to Parallel Computing and GPU Architecture
2. GPU Programming Models and APIs
3. Fundamentals of CUDA Programming
4. Intermediate CUDA Programming
5. Performance Optimization and Profiling
6. Advanced CUDA Programming
7. OpenCL Programming
8. Alternative GPU Programming Frameworks
9. Parallel Algorithms and Patterns
10. Applications and Case Studies
11. Performance Analysis and Optimization
12. Debugging and Testing
9. Parallel Algorithms and Patterns
9.1. Fundamental Parallel Patterns
9.1.1. Map Pattern
9.1.1.1. Element-wise Operations
9.1.1.2. Embarrassingly Parallel Problems
9.1.1.3. Implementation Strategies
9.1.2. Reduce Pattern
9.1.2.1. Parallel Reduction Algorithms
9.1.2.2. Tree-based Reduction
9.1.2.3. Warp-level Primitives
9.1.3. Scan Pattern
9.1.3.1. Prefix Sum Algorithms
9.1.3.2. Inclusive vs. Exclusive Scan
9.1.3.3. Applications and Use Cases
9.1.4. Scatter and Gather Patterns
9.1.4.1. Irregular Memory Access
9.1.4.2. Data Reorganization
9.1.4.3. Performance Considerations
9.2. Advanced Algorithmic Patterns
9.2.1. Stencil Computations
9.2.1.1. Finite Difference Methods
9.2.1.2. Boundary Conditions
9.2.1.3. Optimization Techniques
9.2.2. Graph Algorithms
9.2.2.1. Breadth-First Search
9.2.2.2. Shortest Path Algorithms
9.2.2.3. Graph Traversal Patterns
9.2.3. Sorting Algorithms
9.2.3.1. Parallel Sorting Networks
9.2.3.2. Radix Sort
9.2.3.3. Merge Sort
9.2.4. Matrix Operations
9.2.4.1. Matrix Multiplication
9.2.4.2. Decomposition Algorithms
9.2.4.3. Sparse Matrix Operations
9.3. Optimization Strategies
9.3.1. Load Balancing
9.3.1.1. Static vs. Dynamic Balancing
9.3.1.2. Work Stealing
9.3.1.3. Irregular Workloads
9.3.2. Communication Minimization
9.3.2.1. Data Locality
9.3.2.2. Communication-Avoiding Algorithms
9.3.2.3. Overlapping Communication and Computation
9.3.3. Memory Access Optimization
9.3.3.1. Tiling Strategies
9.3.3.2. Cache Blocking
9.3.3.3. Memory Coalescing