Add warmup_runs to TBE benchmarks and run at least 1 warmup iter #2163

Closed
wants to merge 1 commit into from

Conversation

sryap
Contributor

@sryap sryap commented Nov 28, 2023

Summary:
We observe an extremely long `cudaLaunchKernel` time for the first
kernel launch after switching to CUDA 12. With `warmup_runs=0`, that
first launch is included in the timing, so the measured TBE performance
degrades. This diff modifies `benchmark_requests`, which is used
extensively in TBE benchmarks, to run at least one warmup iteration
even when `warmup_runs=0`, so that the first kernel launch is excluded
from the profiling result. Moreover, we add `--warmup-runs` to every
TBE benchmark so that users can increase the number of warmup
iterations.

Differential Revision: D51603915
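
The "at least one warmup iteration" behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual `benchmark_requests` code: the helper name `benchmark_with_warmup`, its signature, and the use of `time.perf_counter` are all assumptions made for the example. A real GPU benchmark would also need to synchronize the device before reading the clock.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def benchmark_with_warmup(
    func: Callable[[T], object],
    requests: Sequence[T],
    warmup_runs: int = 0,
) -> float:
    """Return the average wall-clock time per request, in seconds.

    Hypothetical helper for illustration only. It always runs at least
    one untimed warmup call, even when warmup_runs == 0, so that a slow
    first call is excluded from the measurement.
    """
    if not requests:
        raise ValueError("requests must not be empty")

    # Clamp to a minimum of one warmup iteration.
    effective_warmup = max(warmup_runs, 1)
    for i in range(effective_warmup):
        # Cycle through the requests if more warmups than requests.
        func(requests[i % len(requests)])

    # Timed region: only these calls contribute to the result.
    # On a GPU, a device synchronization would be needed before
    # start and before stop (omitted in this CPU-only sketch).
    start = time.perf_counter()
    for req in requests:
        func(req)
    elapsed = time.perf_counter() - start

    return elapsed / len(requests)
```

With `warmup_runs=0` and three requests, `func` is called four times: once untimed on the first request, then once per request inside the timed region.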


netlify bot commented Nov 28, 2023

Deploy Preview for pytorch-fbgemm-docs canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 263ace3 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/65654e89d2ba850008f1aaf6 |

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D51603915

@facebook-github-bot
Contributor

This pull request has been merged in 91a600a.
