Commit 369cd3b

Update README.md
1 parent 10e313c commit 369cd3b

1 file changed, +1 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ This project is a from-scratch implementation of diffusion model training in C++

 ### **My Motivation:**

-As a Python programmer, I was fascinated by diffusion models but found the math and implementation details challenging. Meanwhile, because of my interest in ML systems and infrastructure, I also wanted to learn CUDA, and understand how to get the most out of GPUs. This project was born out of my desire to learn by doing, and to see if I could achieve performance comparable to, or even exceeding, PyTorch. Python can be slow, especially for computationally intensive tasks like training diffusion models, so the appeal of C++/CUDA's speed was undeniable.
+I was always fascinated by diffusion models but found the math and implementation details challenging. Meanwhile, because of my interest in ML systems and infrastructure, it was also on my bucket list to learn CUDA + GPU Programming, so I started this project to force myself to understand both diffusion models and gpu programming/cuda from the ground up :) Karpathy's llm.c was also a huge inspiration behind this. Both approaches allow for direct programming of the GPU hardware, which can lead to faster and more efficient training processes.

 ### **My Goal: Beating `torch.compile`**
