Add benchmarks to CI #479
@moaradwan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
7e395ba to 263af2c
@moaradwan has updated the pull request. You must reimport the pull request before landing.
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This pull request was exported from Phabricator. Differential Revision: D38999201
Summary:
## Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Docs change / refactoring / dependency upgrade

## Motivation and Context / Related issue

## How Has This Been Tested (if it applies)

## Checklist
- [ ] The documentation is up-to-date with the changes I made.
- [X] I have read the **CONTRIBUTING** document and completed the CLA (see **CONTRIBUTING**).
- [ ] All tests passed, and additional code has been covered with new tests.

Pull Request resolved: pytorch#479
Differential Revision: D38999201
Pulled By: moaradwan
fbshipit-source-id: b1999d6f4fca53fa6e3816f062a653565bbb5521
263af2c to 2757765
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary:
## Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Docs change / refactoring / dependency upgrade

## Motivation and Context / Related issue
There's a task pytorch#368 for committing benchmark code. In this change I add these benchmarks into CI integration tests. To choose thresholds I ran the benchmarks locally on all the layers with (batch size: 16, num_runs: 100, num_repeats: 20, forward_only: False), and generated the following report:

| | memory* | memory* | memory* | memory* | memory* | runtime | runtime | runtime | runtime | runtime |
|--------------|---------|--------|------------|--------|-------------|------------------------|----------------------|--------------------|------------------------|--------------------|
| value | control | dp | dp/control | gsm | gsm/control | control | dp | dp/control | gsm | gsm/control |
| base_layer | | | | | | | | | | |
| conv | 0.0 | | | 0.0 | | 2.021756922606001 | | | 3.2889059911645036 | 1.6267563891534373 |
| embedding | 0.0 | | | 0.0 | | 0.002484286398502263 | | | 0.013664713416999803 | 5.5004581698946 |
| groupnorm | 0.0 | | | 0.0 | | 0.0001871487290072764 | | | 0.00043170701800136156 | 2.306759016165034 |
| gru | 0.0 | 0.0 | | 0.0 | | 0.045029744959007065 | 0.057370035271503174 | 1.2740475284443677 | 0.2402042072270033 | 5.334345274344187 |
| instancenorm | 0.0 | | | 0.0 | | 0.004493124293996517 | | | 0.006058429501005777 | 1.3483779002287433 |
| layernorm | 0.0 | | | 0.0 | | 0.00011227587499979562 | | | 0.0002241125804985131 | 1.9960884784814286 |
| linear | 0.0 | | | 0.0 | | 0.001010556231000001 | | | 0.003052972127999998 | 3.021080900148341 |
| lstm | 0.0 | 0.0 | | 0.0 | | 0.052634652085002925 | 0.06508583683050075 | 1.2365586975931682 | 0.2982182763324963 | 5.665816425477371 |
| mha | 0.0 | 0.0 | | 0.0 | | 0.018872260358001765 | 0.01870937360499738 | 0.9913689854890476 | 0.02688384014700477 | 1.424516175435558 |
| rnn | 0.0 | 0.0 | | 0.0 | | 0.01576623683249454 | 0.02184348723049516 | 1.3854597937711604 | 0.10178373254250346 | 6.455803856296582 |

(*) This report wasn't generated on a machine with CUDA so the memory wasn't measured. Will update later when it runs in CI on a GPU machine.

Using the report and section 3 in the [paper](https://arxiv.org/pdf/2109.12298.pdf), I parameterised the runtime and memory thresholds for different layers.

## How Has This Been Tested (if it applies)
I ran the jobs locally and generated reports.

## Checklist
- [X] The documentation is up-to-date with the changes I made.
- [X] I have read the **CONTRIBUTING** document and completed the CLA (see **CONTRIBUTING**).
- [ ] All tests passed, and additional code has been covered with new tests.

Pull Request resolved: pytorch#479
Differential Revision: D38999201
Pulled By: moaradwan
fbshipit-source-id: 3d02931970e39ea331674c9f0676db9e22c5edaa
2757765 to baea28a
This pull request was exported from Phabricator. Differential Revision: D38999201
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Types of changes
Motivation and Context / Related issue
Task #368 tracks committing the benchmark code. This change adds those benchmarks to the CI integration tests. To choose thresholds, I ran the benchmarks locally on all layers (batch size: 16, num_runs: 100, num_repeats: 20, forward_only: False) and generated a report (the table in the commit summary above).
(*) The report was generated on a machine without CUDA, so memory was not measured. I will update it once the benchmarks run in CI on a GPU machine.
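For readers unfamiliar with the run parameters quoted above, here is a minimal sketch of how a runtime figure could be collected. It assumes that `num_runs` counts independent timed runs and `num_repeats` counts calls per run; the `benchmark` helper and that interpretation are illustrative and are not taken from the benchmark code in this PR.

```python
import statistics
import time
from typing import Callable


def benchmark(fn: Callable[[], object], num_runs: int, num_repeats: int) -> float:
    """Return the mean wall-clock seconds per run.

    Hypothetical helper: each of the num_runs runs calls fn num_repeats times.
    """
    run_times = []
    for _ in range(num_runs):
        start = time.perf_counter()
        for _ in range(num_repeats):
            fn()
        run_times.append(time.perf_counter() - start)
    return statistics.mean(run_times)


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would run a layer's forward/backward pass.
    mean_s = benchmark(lambda: sum(range(1000)), num_runs=100, num_repeats=20)
    print(f"mean seconds per run: {mean_s:.6f}")
```

Dividing the figure for a variant (for example `gsm`) by the figure for `control` gives a ratio of the kind shown in the `gsm/control` column.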
Using the report and Section 3 of the [paper](https://arxiv.org/pdf/2109.12298.pdf), I parameterised the runtime and memory thresholds for the different layers.
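A per-layer threshold check of this kind might look roughly like the sketch below. The `Threshold` class, the `check_runtime` function, and the limit values are hypothetical and chosen only for illustration; they are not the thresholds or the test code added by this PR. The sample timings in the usage lines are the `linear` and `gru` runtime figures from the report.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Threshold:
    """Hypothetical per-layer limit on the gsm/control runtime ratio."""

    max_runtime_ratio: float


# Illustrative limits only; not the values chosen in the PR.
THRESHOLDS: Dict[str, Threshold] = {
    "linear": Threshold(max_runtime_ratio=4.0),
    "gru": Threshold(max_runtime_ratio=7.0),
}


def check_runtime(layer: str, control_s: float, gsm_s: float) -> bool:
    """Return True if the gsm runtime stays within the layer's ratio limit."""
    ratio = gsm_s / control_s
    return ratio <= THRESHOLDS[layer].max_runtime_ratio


if __name__ == "__main__":
    # Runtime figures copied from the report (control, gsm).
    print(check_runtime("linear", 0.001010556231000001, 0.003052972127999998))
    print(check_runtime("gru", 0.045029744959007065, 0.2402042072270033))
```

Keeping the limits in a table keyed by layer name lets a CI test iterate over layers and fail only the ones whose ratio regresses past its own limit.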
How Has This Been Tested (if it applies)
I ran the jobs locally and generated reports.
Checklist