
🌤🌤 CONTRIBUTE 🎉🎉 #50

Closed
@DefTruth

Developer Guide

🌤🌤 Goals

Any kernel implementation is welcome. This repository is primarily for learning and practice, so optimal performance is not the end goal; the focus is on "using first, then using well". If you need optimal performance, use official implementations directly, such as cuBLAS, cuDNN, FlashAttention, or TensorRT. If you are interested in implementing a specific kernel in this repository, feel free to comment on this issue (though I may not be able to implement it myself 😅).

👨‍💻👨‍💻 Pre-commit

Before submitting code, configure pre-commit, for example:

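# clone your fork and create a working branch
git clone git@github.com:your-github-page/your-fork-LeetCUDA.git
cd your-fork-LeetCUDA && git checkout -b test
# install pre-commit
pip3 install pre-commit
# register the git hook, then run all checks once
pre-commit install
pre-commit run --all-files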

👨‍💻👨‍💻 Add a new kernel

Please use the ./kernels/elementwise/ directory as a reference.

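# add your new xxx kernel, then commit the changes
git add .
git commit -m "feat: add xxx kernel"
# the first push of a new branch needs an upstream (here the branch is named test)
git push -u origin test
# then open a PR from your personal branch to LeetCUDA:main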
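
For a rough idea of the scale of a contribution, a new elementwise kernel could look something like the sketch below. This is only an illustration and is not taken from ./kernels/elementwise/; the names `elementwise_add_f32` and `elementwise_add_host` are hypothetical, and the repository's real kernels may be organized and bound differently. It assumes both inputs have the same length and omits CUDA error checking for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical example kernel: c[i] = a[i] + b[i] for i in [0, n).
__global__ void elementwise_add_f32(const float* a, const float* b, float* c, int n) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < n) {  // guard the tail when n is not a multiple of the block size
    c[idx] = a[idx] + b[idx];
  }
}

// Hypothetical host wrapper: copies the inputs to the device, launches the
// kernel, and copies the result back. Assumes a.size() == b.size().
std::vector<float> elementwise_add_host(const std::vector<float>& a,
                                        const std::vector<float>& b) {
  const int n = static_cast<int>(a.size());
  std::vector<float> c(n);
  if (n == 0) return c;

  const size_t bytes = static_cast<size_t>(n) * sizeof(float);
  float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
  cudaMalloc(&d_a, bytes);
  cudaMalloc(&d_b, bytes);
  cudaMalloc(&d_c, bytes);

  cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

  const int threads = 256;                         // threads per block
  const int blocks = (n + threads - 1) / threads;  // ceil(n / threads)
  elementwise_add_f32<<<blocks, threads>>>(d_a, d_b, d_c, n);

  // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
  cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost);

  cudaFree(d_a);
  cudaFree(d_b);
  cudaFree(d_c);
  return c;
}
```

Starting from a simple one-thread-per-element version like this fits the "using first, then using well" goal: once it produces correct results, optimizations such as vectorized loads can be added as separate variants and compared against it.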
