OpenCL Programming
OpenCL Architecture and Concepts
  Platform Model
    Host and Device Abstraction
    Platform Hierarchy
    Device Types
  Execution Model
    Kernels and Work-Items
    Work-Groups and NDRange
    Execution Ordering
  Memory Model
    Memory Regions
    Memory Objects
    Memory Consistency
  Programming Model
    Host Program Structure
    Kernel Development
    Synchronization Model
OpenCL Programming Workflow
  Platform and Device Discovery
    Platform Enumeration
    Device Selection
    Capability Querying
  Context and Queue Management
    Context Creation
    Command Queue Setup
    Multiple Device Contexts
  Program and Kernel Objects
    Source Code Compilation
    Binary Program Loading
    Kernel Object Creation
  Memory Management
    Buffer Objects
    Image Objects
    Memory Mapping
OpenCL vs. CUDA Comparison
  Language Differences
    Syntax Variations
    Built-in Functions
    Compilation Model
  Performance Characteristics
    Optimization Differences
    Vendor-Specific Features
    Portability Trade-offs
  Ecosystem Comparison
    Tool Support
    Library Availability
    Community Resources