Shreyans117/CUDA_Programming

CUDA_Programming

This repository implements the following:

  1. Matrix Addition (memcpy operations and optimized number of thread blocks)
  2. Array Reduction (coalesced memory accesses)
  3. Matrix Multiplication (shared memory tiling)
  4. Histogram Analysis (atomic operations, shared memory)
  5. Port of an existing C-based machine learning project to CUDA
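
To illustrate the technique named in item 4, here is a minimal sketch of a histogram kernel that combines shared memory with atomic operations. It is not taken from this repository's code: the kernel name `histogramShared`, the helper `countMismatches`, the 256-bin layout (one bin per byte value), and the launch configuration are all assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define NUM_BINS 256  // assumption: one bin per possible byte value

// Hypothetical kernel: each block builds a private histogram in shared
// memory, then merges it into the global one with one atomicAdd per bin.
__global__ void histogramShared(const unsigned char *data, size_t n,
                                unsigned int *bins) {
    __shared__ unsigned int localBins[NUM_BINS];

    // Zero the block-private histogram cooperatively.
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x)
        localBins[i] = 0;
    __syncthreads();

    // Grid-stride loop: atomics hit shared memory, not global memory.
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += stride)
        atomicAdd(&localBins[data[i]], 1u);
    __syncthreads();

    // Merge this block's partial counts into the global histogram.
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x)
        atomicAdd(&bins[i], localBins[i]);
}

// Hypothetical helper: runs the kernel on synthetic data and returns the
// number of bins that differ from a CPU reference (-1 on a CUDA error).
int countMismatches(size_t n) {
    unsigned char *hData = (unsigned char *)malloc(n);
    unsigned int expected[NUM_BINS] = {0};
    unsigned int hBins[NUM_BINS];
    for (size_t i = 0; i < n; ++i) {
        hData[i] = (unsigned char)((i * 31u + 7u) % NUM_BINS);
        ++expected[hData[i]];
    }

    unsigned char *dData = nullptr;
    unsigned int *dBins = nullptr;
    cudaMalloc(&dData, n);
    cudaMalloc(&dBins, NUM_BINS * sizeof(unsigned int));
    cudaMemcpy(dData, hData, n, cudaMemcpyHostToDevice);
    cudaMemset(dBins, 0, NUM_BINS * sizeof(unsigned int));

    histogramShared<<<64, 256>>>(dData, n, dBins);  // assumed launch shape

    // This blocking copy also surfaces any earlier kernel error.
    cudaError_t err = cudaMemcpy(hBins, dBins, sizeof(hBins),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dBins);
    free(hData);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int b = 0; b < NUM_BINS; ++b)
        if (hBins[b] != expected[b]) ++mismatches;
    return mismatches;
}

int main() {
    int mismatches = countMismatches((size_t)1 << 20);
    printf(mismatches == 0 ? "histogram OK\n" : "histogram check failed\n");
    return mismatches == 0 ? 0 : 1;
}
```

Keeping the per-element atomics in shared memory means contention stays within a block, and global memory sees only one atomic update per bin per block. Compile with `nvcc` and run on a CUDA-capable device to check the result against the CPU reference.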
