Alternative GPU Programming Frameworks

SYCL and DPC++
  Single-Source Programming Model
    Host and Device Code Integration
    C++ Template Usage
    Modern C++ Features
  Abstraction Layers
    Backend Independence
    CUDA Backend
    OpenCL Backend
    CPU Backend
  Programming Constructs
    Queue and Handler Objects
    Buffer and Accessor Model
    Parallel Algorithms
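
A minimal sketch of the constructs listed above, assuming a SYCL 2020 compiler (DPC++ or another conforming implementation); the vector-add kernel, sizes, and variable names are illustrative only:

```cpp
#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

int main() {
    constexpr size_t n = 1024;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    // Queue targets the device picked by the default selector
    // (GPU, CPU, or another backend, depending on the runtime).
    sycl::queue q;

    {   // Buffers take ownership of the host data for the duration of this scope.
        sycl::buffer<float> buf_a{a.data(), sycl::range<1>{n}};
        sycl::buffer<float> buf_b{b.data(), sycl::range<1>{n}};
        sycl::buffer<float> buf_c{c.data(), sycl::range<1>{n}};

        // The handler records the kernel and its data requirements; accessors
        // declare how each buffer is used, which drives implicit data movement.
        q.submit([&](sycl::handler& h) {
            sycl::accessor acc_a{buf_a, h, sycl::read_only};
            sycl::accessor acc_b{buf_b, h, sycl::read_only};
            sycl::accessor acc_c{buf_c, h, sycl::write_only, sycl::no_init};

            // Single-source: this lambda is compiled for the device but lives
            // in the same C++ translation unit as the host code.
            h.parallel_for(sycl::range<1>{n}, [=](sycl::id<1> i) {
                acc_c[i] = acc_a[i] + acc_b[i];
            });
        });
    }   // Buffer destruction synchronizes and copies results back into c.

    std::cout << "c[0] = " << c[0] << "\n";  // expected 3
    return 0;
}
```

The end of the buffer scope is what forces synchronization and the copy back to host memory; a USM-based variant would instead manage device pointers explicitly.
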
DirectCompute
  DirectX Integration
    Graphics Pipeline Integration
    Resource Sharing
    Compute Shader Model
  Programming Model
    HLSL Compute Shaders
    Thread Group Organization
    Resource Binding
  Performance Considerations
    GPU Scheduling
    Memory Management
    Optimization Techniques
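
A compact host-side sketch of the model outlined above, assuming a Windows/Direct3D 11 environment; the doubling shader, buffer size, and 64-thread group are illustrative, and error handling is omitted for brevity:

```cpp
// Build sketch (MSVC, assumed): cl /EHsc directcompute_sketch.cpp d3d11.lib d3dcompiler.lib
#include <d3d11.h>
#include <d3dcompiler.h>
#include <cstdio>
#include <cstring>

// HLSL compute shader: thread groups of 64 threads double the buffer elements.
static const char* kShader = R"(
RWStructuredBuffer<float> data : register(u0);

[numthreads(64, 1, 1)]
void CSMain(uint3 tid : SV_DispatchThreadID)
{
    data[tid.x] = data[tid.x] * 2.0f;
}
)";

int main() {
    const UINT n = 1024;

    // Create a D3D11 device and immediate context.
    ID3D11Device* dev = nullptr;
    ID3D11DeviceContext* ctx = nullptr;
    D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION, &dev, nullptr, &ctx);

    // Structured buffer bound as an unordered access view (UAV).
    float init[n];
    for (UINT i = 0; i < n; ++i) init[i] = float(i);
    D3D11_BUFFER_DESC bd = {};
    bd.ByteWidth = n * sizeof(float);
    bd.Usage = D3D11_USAGE_DEFAULT;
    bd.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
    bd.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
    bd.StructureByteStride = sizeof(float);
    D3D11_SUBRESOURCE_DATA sd = { init, 0, 0 };
    ID3D11Buffer* buf = nullptr;
    dev->CreateBuffer(&bd, &sd, &buf);

    D3D11_UNORDERED_ACCESS_VIEW_DESC uavd = {};
    uavd.Format = DXGI_FORMAT_UNKNOWN;
    uavd.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
    uavd.Buffer.NumElements = n;
    ID3D11UnorderedAccessView* uav = nullptr;
    dev->CreateUnorderedAccessView(buf, &uavd, &uav);

    // Compile the HLSL source and create the compute shader object.
    ID3DBlob* cso = nullptr;
    ID3DBlob* err = nullptr;
    D3DCompile(kShader, strlen(kShader), nullptr, nullptr, nullptr,
               "CSMain", "cs_5_0", 0, 0, &cso, &err);
    ID3D11ComputeShader* cs = nullptr;
    dev->CreateComputeShader(cso->GetBufferPointer(), cso->GetBufferSize(),
                             nullptr, &cs);

    // Bind resources and dispatch: n elements / 64 threads per group = 16 groups.
    ctx->CSSetShader(cs, nullptr, 0);
    ctx->CSSetUnorderedAccessViews(0, 1, &uav, nullptr);
    ctx->Dispatch(n / 64, 1, 1);
    ctx->Flush();  // submit the recorded work to the GPU

    std::printf("dispatched %u thread groups\n", n / 64);
    return 0;
}
```

Because the same device and buffers can also feed the graphics pipeline, this is the resource-sharing path that makes DirectCompute convenient inside DirectX applications.
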
Vulkan Compute
  Low-Level API Design
    Explicit Control
    Minimal Driver Overhead
    Cross-Platform Support
  Compute Pipeline
    Pipeline Creation
    Descriptor Sets
    Command Buffer Recording
  Advanced Features
    Multi-Queue Execution
    Memory Management
    Synchronization Primitives
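
A complete Vulkan compute program is long, so the sketch below covers only the pipeline-creation and command-buffer-recording steps named above; it assumes the VkDevice, SPIR-V shader module, pipeline layout, descriptor set, and command buffer have already been created elsewhere:

```cpp
#include <vulkan/vulkan.h>

// Build a compute pipeline from an existing shader module and pipeline layout.
VkPipeline create_compute_pipeline(VkDevice device,
                                   VkShaderModule shader,
                                   VkPipelineLayout layout) {
    VkPipelineShaderStageCreateInfo stage{};
    stage.sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    stage.stage  = VK_SHADER_STAGE_COMPUTE_BIT;
    stage.module = shader;
    stage.pName  = "main";               // entry point in the SPIR-V module

    VkComputePipelineCreateInfo info{};
    info.sType  = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO;
    info.stage  = stage;
    info.layout = layout;

    VkPipeline pipeline = VK_NULL_HANDLE;
    vkCreateComputePipelines(device, VK_NULL_HANDLE, 1, &info, nullptr, &pipeline);
    return pipeline;
}

// Record a dispatch into a command buffer: bind the pipeline, bind the
// descriptor set that points at the storage buffers, then launch workgroups.
void record_dispatch(VkCommandBuffer cmd,
                     VkPipeline pipeline,
                     VkPipelineLayout layout,
                     VkDescriptorSet set,
                     uint32_t group_count_x) {
    VkCommandBufferBeginInfo begin{};
    begin.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    begin.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
    vkBeginCommandBuffer(cmd, &begin);

    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
    vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, layout,
                            0, 1, &set, 0, nullptr);
    vkCmdDispatch(cmd, group_count_x, 1, 1);

    vkEndCommandBuffer(cmd);
}
```

Submission to a queue, fences or semaphores for synchronization, and explicit memory allocation are left to the caller, which is exactly the explicit control the API trades for its low driver overhead.
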
High-Level Frameworks
  OpenACC
    Directive-Based Programming
    Compiler Pragmas
    Incremental Parallelization
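
A minimal directive-based sketch: the loop itself stays ordinary C++, and the pragma with its data clauses is the only change needed for offload. It assumes an OpenACC-capable compiler (the nvc++ -acc build line in the comment is an assumption about the NVIDIA HPC SDK toolchain):

```cpp
#include <vector>
#include <cstdio>

// Compile with an OpenACC compiler, e.g. (assumed): nvc++ -acc saxpy_acc.cpp
void saxpy(int n, float a, const float* x, float* y) {
    // copyin/copy clauses describe host<->device data movement; without an
    // OpenACC compiler the pragma is ignored and the loop runs serially.
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

int main() {
    const int n = 1 << 20;
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    saxpy(n, 3.0f, x.data(), y.data());
    std::printf("y[0] = %f\n", y[0]);  // expected 5.0
    return 0;
}
```
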
  OpenMP Target Offloading
    Target Directives
    Data Mapping
    Device Selection
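
A minimal target-offload sketch showing a target directive with map clauses; teams distribute parallel for spreads the loop over the device. The flags that enable offloading differ per compiler, so no build line is shown:

```cpp
#include <vector>
#include <cstdio>
#include <omp.h>

int main() {
    const int n = 1 << 20;
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    float* xp = x.data();
    float* yp = y.data();

    // target: run the region on the default offload device (often a GPU).
    // map clauses describe which arrays move to/from device memory.
    // teams distribute parallel for spreads iterations over teams and threads.
    #pragma omp target teams distribute parallel for \
        map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (int i = 0; i < n; ++i) {
        yp[i] = 3.0f * xp[i] + yp[i];
    }

    std::printf("devices available: %d, y[0] = %f\n",
                omp_get_num_devices(), yp[0]);  // expected y[0] = 5.0
    return 0;
}
```

Device selection can be influenced at runtime with the device() clause on the directive or the OMP_DEFAULT_DEVICE environment variable.
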
  Python GPU Libraries
    PyCUDA
    Numba
    CuPy
    JAX