per1cycle/sgemm-optimize

sgemm optimization learning notes

A collection of SGEMM optimization notes and kernels. The project automatically generates a Python script that plots the results of the different kernels. Enjoy!

matplotlib is the only Python dependency; it is used for plotting.

Requirements

  • CUDA Toolkit
  • CMake

Build instructions

pip install matplotlib
mkdir build
cd build
cmake ..
make -j
./runner | python

# Or just run ./runner on its own.
# Or run ./runner > demo.<your gpu>.py to save the output.
# The output is a generated Python script that draws the plot.

About

a(N x M) @ b(M x K) = c(N x K): CUDA GEMM (matmul) optimization learning code and notes.
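
To make the shape convention above concrete, here is a minimal pure-Python reference sketch. It is not taken from this repository: the function name `matmul_reference` and the nested-list representation are assumptions made for illustration, and the CUDA kernels in the project are organized differently. It only shows which dimension is shared (M) and what shape the result has.

```python
def matmul_reference(a, b):
    """Compute c = a @ b for a (N x M) and b (M x K); returns c (N x K).

    Hypothetical helper for illustration only, not part of the repository.
    Matrices are plain lists of rows.
    """
    n, m = len(a), len(a[0])
    if len(b) != m:
        raise ValueError("inner dimensions must match: a is N x M, b must be M x K")
    k = len(b[0])

    # One output element per (row of a, column of b) pair.
    c = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(k):
            acc = 0.0
            # Dot product over the shared dimension M.
            for p in range(m):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]   # N = 2, M = 3
    b = [[1.0, 0.0],
         [0.0, 1.0],
         [1.0, 1.0]]        # M = 3, K = 2
    print(matmul_reference(a, b))  # [[4.0, 5.0], [10.0, 11.0]]
```

Each of the N x K output elements takes M multiplications and M additions, so the whole product costs 2 * N * M * K floating-point operations. That count is the usual basis for reporting GEMM throughput.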
