Here are 131 public repositories matching this topic...
🍎 One kernel a day keeps high latency away. A hands-on CUDA learning path featuring a rich collection of kernels, from the basics to peak performance, seamlessly integrated as PyTorch C++ extensions.
An implementation of parallel exclusive scan in CUDA
Updated Feb 23, 2018
Cuda
CS344 - Introduction To Parallel Programming course (Udacity) proposed solutions
Updated Jul 23, 2017
Cuda
Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"
Updated Aug 12, 2024
Cuda
Parallel Computing starter project to build GPU & CPU kernels in CUDA & C++ and call them from Python without a single line of CMake using PyBind11
Updated Oct 14, 2025
Cuda
CUDA C implementation of Principal Component Analysis (PCA) through Singular Value Decomposition (SVD) using a highly parallelisable version of the Jacobi eigenvalue algorithm.
Updated May 10, 2019
Cuda
CUDA implementation of the parallel Depth First Search (DFS) algorithm and its comparison with a serial C++ DFS implementation.
Updated Jun 28, 2018
Cuda
My GitHub repo for UIUC ECE408 Applied Parallel Programming, mainly focusing on CUDA programming and algorithm implementation.
Updated Jan 16, 2024
Cuda
A collection of awesome algorithms, implemented in CUDA.
Updated Nov 10, 2024
Cuda
ECE408 (Applied Parallel Programming) Fall 2022 MP
Updated Mar 24, 2023
Cuda
GPU Parallel Computing software solution examples with CUDA
A simple library-less CUDA implementation of the OneSweep sorting algorithm.
Updated Feb 26, 2024
Cuda
This is our Final Year Project, titled "Implementation of seam carving for image retargeting using CUDA enabled GPU".
Updated Nov 16, 2024
Cuda
Sample codes for parallel programming using OpenMP on CPU and CUDA on GPU
Updated Jul 21, 2022
Cuda
This repository contains my coursework and projects completed during the GPU Programming Specialization offered by Johns Hopkins University
Updated Jun 13, 2023
Cuda
A CUDA-accelerated convolutional autoencoder and SVM pipeline for CIFAR-10 image classification.
Updated Dec 27, 2025
Cuda
C++ implementation of a neural network using OpenMP and CUDA for parallelization.
Updated Nov 26, 2021
Cuda
K-Means clustering implementation with parallelization using OpenMP and CUDA for efficient computation.