Add triton to kernel bench #18

Closed
wants to merge 11 commits into from

PaliC
Collaborator

@PaliC PaliC commented Feb 4, 2025

Adds Triton support to KernelBench (not including CoT or multi-turn).

The simple part is adding a Triton prompt and support for switching the framework from CUDA to Triton.

The riskier part is evaluation. Triton kernels rely on decorators such as @triton.jit, which do not work on code run through exec. So instead of extracting the model from the generated code with exec, we now use a hacky workaround: write the generated code to a temp file and import the model directly from that file. Unfortunately, the temp file has to be deleted manually. As far as I can tell, short of importing from the on-disk source file for the generated code (which we could do), there isn't another clean way to run the decorators without modifying the generated code.
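
As a rough illustration of the temp-file approach described above (not the code in this PR), the sketch below writes a source string to a temporary .py file, imports it with importlib, and deletes the file afterwards. The helper name load_module_from_source, the module name generated_kernel, and the needs_source decorator are all hypothetical. needs_source only stands in for a decorator like @triton.jit, on the assumption that what such decorators need is access to the function's source, so the example runs without Triton or a GPU.

```python
import importlib.util
import os
import tempfile


def load_module_from_source(source: str, module_name: str = "generated_kernel"):
    """Write `source` to a temporary .py file, import it, and return (module, path).

    Hypothetical helper for illustration; the caller is responsible for
    deleting the file at `path` once the module is no longer needed.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    # The file is closed (and flushed) here, so the import below sees all of it.
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # decorators run now, with a real file behind them
    return module, path


# Stand-in for model-generated code. `needs_source` is an assumed placeholder
# for a source-inspecting decorator such as @triton.jit.
GENERATED = '''
import inspect

def needs_source(fn):
    fn.source = inspect.getsource(fn)  # requires the code to live in a real file
    return fn

@needs_source
def add_one(x):
    return x + 1
'''

module, path = load_module_from_source(GENERATED)
try:
    result = module.add_one(41)
    has_source = "def add_one" in module.add_one.source
finally:
    os.remove(path)  # manual cleanup, as noted in the description
```

Wrapping the use of the module in try/finally is one way to make sure the manual deletion still happens when evaluation raises.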

Test Plan:

I ran the following commands and things seemed to work as expected:

python scripts/generate_samples.py run_name="test_hf_level_1" dataset_src="huggingface" level="1" num_workers=50 server_type="deepseek" model_name="deepseek-coder" temperature=0 framework="cuda"

python scripts/eval_from_generations.py level=1 run_name="test_hf_level_1" dataset_src="local" level="1" num_gpu_devices=8 timeout=300

python scripts/generate_samples.py run_name="test_hf_level_1_triton" dataset_src="huggingface" level="1" num_workers=50 server_type="deepseek" model_name="deepseek-coder" temperature=0 framework="triton"

python scripts/eval_from_generations.py level=1 run_name="test_hf_level_1_triton" dataset_src="local" level="1" num_gpu_devices=8 timeout=300

python scripts/generate_and_eval_single_sample.py dataset_src="huggingface" level=2 problem_id=40

python scripts/generate_and_eval_single_sample.py dataset_src="huggingface" level=2 problem_id=40 framework="triton"

@PaliC PaliC marked this pull request as ready for review February 4, 2025 21:33
@Zacharias030
Contributor

Zacharias030 commented Feb 5, 2025

Would it make sense to add to the PR description, for reference, the artefacts (i.e., results) that at least one of the LLMs produces when run via this PR?

For anything that we do with this new "KernelBench-Triton" variant, it might prove helpful to have some expected numbers to compare against in order to check correctness of this and subsequent implementations.

Appending a handful of the generated prompt->response pairs, covering both success and failure cases, may also help us convince ourselves that everything makes sense, as suggested here. For example, if any particular model obtains a 0% score, I think we should quickly rule out that a trivial issue is causing it.

@simonguozirui
Collaborator

@PaliC once you have the format decided, let's try to repro an end-to-end flow, where we take the KernelBench program -> LM -> PyTorch program with jit Triton (in the format you specified). Let's also have a verification script like run_and_check to verify.

Another thing @msaroufim mentioned is that we might need to work out how many KernelBench problems are purely functional, and filter down to those, in order to satisfy the format.

Comment on lines +504 to 505
def prompt_fix_compile(ref_arch_src, custom_kernel, metadata):
    prompt = PROBLEM_STATEMENT
Contributor

@Zacharias030 Zacharias030 Mar 7, 2025

Thanks again for this PR!
Btw, this PROBLEM_STATEMENT and some other things are easily caught by a linter.

@PaliC PaliC closed this Mar 18, 2025
@PaliC
Collaborator Author

PaliC commented Mar 18, 2025

As this PR is very stale and breaks upstream KernelBench quite a bit, please move discussion to #35, which does the same thing on the current iteration of KernelBench.
