GPU Programming

  1. Fundamentals of CUDA Programming
    1. CUDA Ecosystem and Setup
      1. CUDA Toolkit Components
        1. NVCC Compiler
        2. Runtime Libraries
        3. Development Tools
      2. Development Environment Setup
        1. Driver Installation
        2. Toolkit Installation
        3. IDE Integration
        4. Verification and Testing
    2. CUDA Programming Model
      1. Host and Device Concepts
        1. Host Code (CPU)
        2. Device Code (GPU)
        3. Heterogeneous Computing Model
      2. Kernels and Functions
        1. Kernel Declaration
        2. Device Functions
        3. Host Functions
        4. Function Qualifiers
      3. Thread Hierarchy
        1. Grids
        2. Blocks
        3. Threads
        4. Thread Indexing
      4. Built-in Variables
        1. threadIdx
        2. blockIdx
        3. blockDim
        4. gridDim
        5. warpSize
    3. First CUDA Programs
      1. Hello World on GPU
        1. Kernel Definition
        2. Kernel Launch
        3. Compilation Process
      2. Vector Addition Example
        1. Memory Allocation
        2. Data Transfer
        3. Kernel Implementation
        4. Result Verification
      3. Error Checking Fundamentals
        1. CUDA Error Codes
        2. Error Handling Macros
        3. Debugging Basics
    4. Memory Management
      1. Memory Spaces Overview
        1. Host Memory
        2. Device Memory
        3. Memory Hierarchy
      2. Basic Memory Operations
        1. cudaMalloc
        2. cudaFree
        3. cudaMemcpy
        4. Memory Transfer Directions
      3. Unified Memory
        1. cudaMallocManaged
        2. Automatic Data Migration
        3. Page Faulting Mechanism
        4. Performance Considerations