GPU Programming
1. Introduction to Parallel Computing and GPU Architecture
2. GPU Programming Models and APIs
3. Fundamentals of CUDA Programming
4. Intermediate CUDA Programming
5. Performance Optimization and Profiling
6. Advanced CUDA Programming
7. OpenCL Programming
8. Alternative GPU Programming Frameworks
9. Parallel Algorithms and Patterns
10. Applications and Case Studies
11. Performance Analysis and Optimization
12. Debugging and Testing
7. OpenCL Programming
7.1. OpenCL Architecture and Concepts
7.1.1. Platform Model
7.1.1.1. Host and Device Abstraction
7.1.1.2. Platform Hierarchy
7.1.1.3. Device Types
7.1.2. Execution Model
7.1.2.1. Kernels and Work-Items
7.1.2.2. Work-Groups and NDRange
7.1.2.3. Execution Ordering
7.1.3. Memory Model
7.1.3.1. Memory Regions
7.1.3.2. Memory Objects
7.1.3.3. Memory Consistency
7.1.4. Programming Model
7.1.4.1. Host Program Structure
7.1.4.2. Kernel Development
7.1.4.3. Synchronization Model
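
The execution, memory, and synchronization concepts outlined in 7.1.2 through 7.1.4 all surface directly in OpenCL C kernel code. Below is a minimal sketch of a per-work-group reduction kernel; the kernel name partial_sum and its arguments are invented for illustration. It shows work-item indexing, the __global and __local memory regions, and a work-group barrier.

/* partial_sum.cl -- illustrative OpenCL C kernel (hypothetical example).
 * Each work-group reduces one chunk of `in` into a single element of `out`.
 * Assumes the work-group size is a power of two. */
__kernel void partial_sum(__global const float *in,
                          __global float *out,
                          __local float *scratch,   /* local memory, shared within a work-group */
                          const unsigned int n)
{
    size_t gid = get_global_id(0);    /* unique index across the whole NDRange */
    size_t lid = get_local_id(0);     /* index within this work-group */
    size_t wg  = get_group_id(0);     /* which work-group this work-item belongs to */

    /* Load one element per work-item into local memory (guard against overrun). */
    scratch[lid] = (gid < n) ? in[gid] : 0.0f;
    barrier(CLK_LOCAL_MEM_FENCE);     /* synchronize the work-group before reading scratch */

    /* Tree reduction within the work-group. */
    for (size_t stride = get_local_size(0) / 2; stride > 0; stride /= 2) {
        if (lid < stride)
            scratch[lid] += scratch[lid + stride];
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    if (lid == 0)
        out[wg] = scratch[0];         /* one partial sum per work-group, written to global memory */
}
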
7.2. OpenCL Programming Workflow
7.2.1. Platform and Device Discovery
7.2.1.1. Platform Enumeration
7.2.1.2. Device Selection
7.2.1.3. Capability Querying
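
As a concrete illustration of the discovery steps in 7.2.1, the following sketch (assuming the standard OpenCL C API and the Khronos headers) enumerates platforms, selects a GPU device on each, and queries a few capabilities.

/* Illustrative platform/device discovery (7.2.1). */
#include <stdio.h>
#include <CL/cl.h>   /* #include <OpenCL/opencl.h> on macOS */

int main(void)
{
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, NULL, &num_platforms);          /* first call: just count platforms */
    if (num_platforms > 8)
        num_platforms = 8;

    cl_platform_id platforms[8];
    clGetPlatformIDs(num_platforms, platforms, NULL);   /* second call: fetch the IDs */

    for (cl_uint p = 0; p < num_platforms; ++p) {
        char name[256];
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(name), name, NULL);
        printf("Platform %u: %s\n", p, name);

        cl_device_id device;
        cl_uint num_devices = 0;
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 1, &device, &num_devices) != CL_SUCCESS)
            continue;                                   /* no GPU on this platform */

        char dev_name[256];
        cl_uint compute_units;
        size_t max_wg_size;
        clGetDeviceInfo(device, CL_DEVICE_NAME, sizeof(dev_name), dev_name, NULL);
        clGetDeviceInfo(device, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(compute_units), &compute_units, NULL);
        clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(max_wg_size), &max_wg_size, NULL);
        printf("  GPU: %s, %u compute units, max work-group size %zu\n",
               dev_name, compute_units, max_wg_size);
    }
    return 0;
}
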
7.2.2. Context and Queue Management
7.2.2.1. Context Creation
7.2.2.2. Command Queue Setup
7.2.2.3. Multiple Device Contexts
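
A minimal sketch of the context and queue setup in 7.2.2, assuming a platform and device obtained as in the discovery sketch above; the helper name create_context_and_queue is hypothetical.

/* Illustrative context and command-queue creation (7.2.2). */
#include <CL/cl.h>

static cl_context create_context_and_queue(cl_platform_id platform,
                                            cl_device_id device,
                                            cl_command_queue *queue_out)
{
    cl_int err;

    /* A context owns devices, memory objects, programs, and command queues. */
    cl_context_properties props[] = {
        CL_CONTEXT_PLATFORM, (cl_context_properties)platform, 0
    };
    cl_context context = clCreateContext(props, 1, &device, NULL, NULL, &err);

    /* One in-order queue for this device. clCreateCommandQueueWithProperties is
     * OpenCL 2.0+; on OpenCL 1.x use clCreateCommandQueue(context, device, 0, &err). */
    *queue_out = clCreateCommandQueueWithProperties(context, device, NULL, &err);

    return context;
}
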
7.2.3. Program and Kernel Objects
7.2.3.1. Source Code Compilation
7.2.3.2. Binary Program Loading
7.2.3.3. Kernel Object Creation
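
The program and kernel-object steps in 7.2.3 might look like the following sketch, which compiles kernel source at runtime, prints the build log on failure, and creates a kernel object; the helper name build_kernel is hypothetical.

/* Illustrative runtime compilation and kernel creation (7.2.3). */
#include <stdio.h>
#include <CL/cl.h>

static cl_kernel build_kernel(cl_context context, cl_device_id device,
                              const char *source, const char *entry_point)
{
    cl_int err;

    /* Runtime (online) compilation from source; precompiled binaries can instead
     * be loaded with clCreateProgramWithBinary. */
    cl_program program = clCreateProgramWithSource(context, 1, &source, NULL, &err);
    err = clBuildProgram(program, 1, &device, "-cl-std=CL1.2", NULL, NULL);

    if (err != CL_SUCCESS) {
        /* On failure, fetch and print the compiler's build log for this device. */
        char log[4096];
        clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, sizeof(log), log, NULL);
        fprintf(stderr, "build failed:\n%s\n", log);
        return NULL;
    }

    /* One kernel object per __kernel entry point in the program. */
    return clCreateKernel(program, entry_point, &err);
}
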
7.2.4. Memory Management
7.2.4.1. Buffer Objects
7.2.4.2. Image Objects
7.2.4.3. Memory Mapping
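
Tying 7.2.4 to the earlier sketches, this hypothetical helper creates buffer objects, sets kernel arguments (including a __local scratch allocation), enqueues a 1D NDRange, and reads the results back; the sizes and the partial_sum kernel are the illustrative ones from the sketch under 7.1.

/* Illustrative buffer management and kernel dispatch (7.2.4). */
#include <stdlib.h>
#include <CL/cl.h>

static void run_partial_sum(cl_context context, cl_command_queue queue,
                            cl_kernel kernel, const float *host_in, cl_uint n)
{
    cl_int err;
    size_t local_size  = 256;                                   /* work-group size (power of two) */
    size_t num_groups  = (n + local_size - 1) / local_size;
    size_t global_size = num_groups * local_size;               /* NDRange: multiple of local size */

    /* Buffer objects belong to the context; CL_MEM_COPY_HOST_PTR also copies the host data in. */
    cl_mem in_buf  = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                    n * sizeof(float), (void *)host_in, &err);
    cl_mem out_buf = clCreateBuffer(context, CL_MEM_WRITE_ONLY,
                                    num_groups * sizeof(float), NULL, &err);

    /* Kernel arguments are set by index; __local memory is sized here with a NULL pointer. */
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &in_buf);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), &out_buf);
    clSetKernelArg(kernel, 2, local_size * sizeof(float), NULL);
    clSetKernelArg(kernel, 3, sizeof(cl_uint), &n);

    /* Enqueue a 1D NDRange and wait for all queued commands to finish. */
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, &local_size, 0, NULL, NULL);
    clFinish(queue);

    /* Blocking read of the per-work-group partial sums back to the host.
     * clEnqueueMapBuffer is the zero-copy alternative covered under "Memory Mapping". */
    float *partials = malloc(num_groups * sizeof(float));
    clEnqueueReadBuffer(queue, out_buf, CL_TRUE, 0, num_groups * sizeof(float),
                        partials, 0, NULL, NULL);

    free(partials);
    clReleaseMemObject(in_buf);
    clReleaseMemObject(out_buf);
}
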
7.3. OpenCL vs. CUDA Comparison
7.3.1. Language Differences
7.3.1.1. Syntax Variations
7.3.1.2. Built-in Functions
7.3.1.3. Compilation Model
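
To make the language differences in 7.3.1 concrete, here is a vector-add kernel written in OpenCL C, with its CUDA counterpart shown in comments; the example itself is invented for illustration.

/* Illustrative syntax comparison for 7.3.1.
 *
 * CUDA:   __global__ void vec_add(const float *a, const float *b, float *c, int n) {
 *             int i = blockIdx.x * blockDim.x + threadIdx.x;
 *             if (i < n) c[i] = a[i] + b[i];
 *         }
 *         // launched from the host as: vec_add<<<grid, block>>>(a, b, c, n);
 */
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *c,
                      const int n)
{
    /* get_global_id(0) plays the role of blockIdx.x * blockDim.x + threadIdx.x. */
    int i = get_global_id(0);
    if (i < n)
        c[i] = a[i] + b[i];
}

/* On the compilation model: CUDA kernels are typically compiled offline by nvcc and
 * launched with the <<<...>>> syntax; OpenCL kernels are typically compiled at runtime
 * (clBuildProgram) and launched through clEnqueueNDRangeKernel with an explicit
 * global/local NDRange. */
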
7.3.2. Performance Characteristics
7.3.2.1. Optimization Differences
7.3.2.2. Vendor-Specific Features
7.3.2.3. Portability Trade-offs
7.3.3. Ecosystem Comparison
7.3.3.1. Tool Support
7.3.3.2. Library Availability
7.3.3.3. Community Resources