CUDA Practice Tutorial

This is a hands-on CUDA programming tutorial that teaches the basic concepts and common operations of GPU parallel computing through exercises. It covers fundamental operations such as vector addition, matrix operations, convolution, and parallel reduction.

Environment Setup

First, make sure your computer or server has an NVIDIA GPU, then download and install the CUDA Toolkit and a matching driver from the official NVIDIA website. For installation instructions, refer to the CUDA Quick Start Guide.

Clone this repository:

git clone https://github.com/youxam/cuda-practice-tutorial.git
cd cuda-practice-tutorial

Generate a working directory:

python3 generate.py [path]

For example:

python3 generate.py ~/cuda-practice-projects

The generated directory contains nine exercises. Each has its own README.md with a complete tutorial, a problem description, and explanations. You can also read them online.

  1. Problem 1: Vector Addition
  2. Problem 2: SAXPY
  3. Problem 3: 1D Stencil
  4. Problem 4: Matrix Transposition
  5. Problem 5: Parallel Reduction Sum
  6. Problem 6: 2D Convolution
  7. Problem 7: Tiled Matrix Multiplication
  8. Problem 8: Histogram
  9. Problem 9: K-means Clustering
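
To give a feel for the kind of code these exercises involve, here is a minimal sketch of vector addition, the topic of Problem 1. It is not taken from the repository's student.cu or answer.cu. The names vector_add_kernel, check, and vector_add are hypothetical, and the block size of 256 is an arbitrary but common choice.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// Each thread adds one pair of elements. The bounds check matters because
// the last block may contain threads whose index is past the end of the data.
__global__ void vector_add_kernel(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: turn a CUDA runtime error into a C++ exception.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
}

// Hypothetical host wrapper: copy inputs to the GPU, launch the kernel,
// and copy the result back. For brevity, device memory is not freed if
// an error is thrown partway through.
std::vector<float> vector_add(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("input sizes differ");
    }
    const int n = static_cast<int>(a.size());
    std::vector<float> c(n);
    if (n == 0) {
        return c;
    }

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float* d_a = nullptr;
    float* d_b = nullptr;
    float* d_c = nullptr;
    check(cudaMalloc(&d_a, bytes));
    check(cudaMalloc(&d_b, bytes));
    check(cudaMalloc(&d_c, bytes));
    check(cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice));
    check(cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice));

    // Round up so that every element is covered by some thread.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vector_add_kernel<<<blocks, threads>>>(d_a, d_b, d_c, n);
    check(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost));

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return c;
}
```

Called from a main function and compiled with nvcc, vector_add({1, 2, 3}, {10, 20, 30}) should return {11, 22, 33}. The actual exercises define their own function signatures and test harness, so treat this only as orientation.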

You should complete each problem by following these steps:

  1. Read the problem description and requirements to understand the functionality you need to implement.
  2. Based on the requirements, implement the // TODO sections of the student.cu file.
  3. Use make list to view the list of test cases. Use make run TC=<test_case_prefix> to compile and run a specific test case. Use make test to compile and run all test cases.
  4. If you get stuck, consult the reference implementation in the answer.cu file to see one way to approach the problem.

To configure a language server (LSP) for your editor, refer to the .clangd file in this project.

Learning Resources
