Break the benchmarks to have separate executables #349
LGTM, nice one 👍
I am trying to upskill in CMake a little, and saw an opportunity to make some improvements here. Since we are all in forks, the easiest way I could find to communicate my suggestion is to provide this branch.
The idea is basically to automate the dependency handling by defining benchmark 'suites'. Each subdir now defines its own suite (which may be part of a larger suite). I added a mandatory `SUITE` argument to `cutlass_benchmark_add_executable`, so it should be impossible to make a mistake when adding more content in future.
Perhaps it's overkill, so please feel free to ignore! Equally, I can open it as a separate PR later.
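To make the suite idea concrete, here is a minimal, standalone sketch of how such a setup could look. It is not the code from the branch: the helper names `sketch_add_benchmark_suite` and `sketch_benchmark_add_executable`, the `PARENT` and `SOURCES` arguments, and the target names are all hypothetical stand-ins, and the real `cutlass_benchmark_add_executable` may take different arguments. Only the mandatory `SUITE` argument and the nesting of suites are taken from the description above.

```cmake
cmake_minimum_required(VERSION 3.16)
project(benchmark_suites_sketch LANGUAGES CXX)

# Hypothetical helper: declare a suite as a custom target.
# An optional PARENT makes the suite part of a larger suite.
function(sketch_add_benchmark_suite NAME)
  cmake_parse_arguments(ARG "" "PARENT" "" ${ARGN})
  add_custom_target(${NAME})
  if(ARG_PARENT)
    # Building the parent suite now also builds this suite.
    add_dependencies(${ARG_PARENT} ${NAME})
  endif()
endfunction()

# Hypothetical stand-in for cutlass_benchmark_add_executable.
# SUITE is mandatory, so a benchmark cannot be added without
# being attached to a suite.
function(sketch_benchmark_add_executable NAME)
  cmake_parse_arguments(ARG "" "SUITE" "SOURCES" ${ARGN})
  if(NOT ARG_SUITE)
    message(FATAL_ERROR "${NAME}: the SUITE argument is mandatory")
  endif()
  if(NOT TARGET ${ARG_SUITE})
    message(FATAL_ERROR "${NAME}: unknown suite '${ARG_SUITE}'")
  endif()
  add_executable(${NAME} ${ARG_SOURCES})
  # Building the suite now also builds this executable.
  add_dependencies(${ARG_SUITE} ${NAME})
endfunction()

# Placeholder source so the sketch configures on its own.
set(placeholder_src "${CMAKE_CURRENT_BINARY_DIR}/gemm_benchmark.cpp")
file(WRITE "${placeholder_src}" "int main() { return 0; }\n")

# Top-level suite, one nested suite, one benchmark executable.
sketch_add_benchmark_suite(benchmarks)
sketch_add_benchmark_suite(benchmarks_gemm PARENT benchmarks)
sketch_benchmark_add_executable(gemm_benchmark
  SUITE benchmarks_gemm
  SOURCES "${placeholder_src}")
```

With a layout like this, building the `benchmarks` target pulls in every nested suite and its executables, while building `benchmarks_gemm` alone builds only that subset. Forgetting `SUITE` fails at configure time rather than silently leaving a benchmark out of the aggregate target.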
Co-authored-by: Finlay <finlay.marno@codeplay.com>
Thanks!!
Could you add a README file to the benchmark folder explaining how to run the benchmark?
Done in a0b2470
…tlass-fork into break_benchmarks
This PR breaks the benchmarks into multiple micro-benchmarks (one per operator, e.g. GEMM, Flash Attention Prefill, etc.). It also adds the option to build and run all the benchmarks by compiling only one executable file (`ninja benchmarks`), similar to the examples and tests.