Add benchmarks to CI #479

Closed

Commits on Aug 25, 2022

  1. Add benchmarks to CI (pytorch#479)

    Summary:
    ## Types of changes
    
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing functionality to change)
    - [ ] Docs change / refactoring / dependency upgrade
    
    ## Motivation and Context / Related issue
    Task pytorch#368 covers committing the benchmark code. This change adds those benchmarks to the CI integration tests. To choose thresholds, I ran the benchmarks locally on all layers (batch size: 16, num_runs: 100, num_repeats: 20, forward_only: False) and generated the following report:
    
    |              | memory*  | memory* | memory*     | memory* | memory*      | runtime                | runtime              | runtime            | runtime                | runtime            |
    |--------------|---------|--------|------------|--------|-------------|------------------------|----------------------|--------------------|------------------------|--------------------|
    | value        | control | dp     | dp/control | gsm    | gsm/control | control                | dp                   | dp/control         | gsm                    | gsm/control        |
    | base_layer   |         |        |            |        |             |                        |                      |                    |                        |                    |
    | conv         | 0.0     |        |            | 0.0    |             | 2.021756922606001      |                      |                    | 3.2889059911645036     | 1.6267563891534373 |
    | embedding    | 0.0     |        |            | 0.0    |             | 0.002484286398502263   |                      |                    | 0.013664713416999803   | 5.5004581698946    |
    | groupnorm    | 0.0     |        |            | 0.0    |             | 0.0001871487290072764  |                      |                    | 0.00043170701800136156 | 2.306759016165034  |
    | gru          | 0.0     | 0.0    |            | 0.0    |             | 0.045029744959007065   | 0.057370035271503174 | 1.2740475284443677 | 0.2402042072270033     | 5.334345274344187  |
    | instancenorm | 0.0     |        |            | 0.0    |             | 0.004493124293996517   |                      |                    | 0.006058429501005777   | 1.3483779002287433 |
    | layernorm    | 0.0     |        |            | 0.0    |             | 0.00011227587499979562 |                      |                    | 0.0002241125804985131  | 1.9960884784814286 |
    | linear       | 0.0     |        |            | 0.0    |             | 0.001010556231000001   |                      |                    | 0.003052972127999998   | 3.021080900148341  |
    | lstm         | 0.0     | 0.0    |            | 0.0    |             | 0.052634652085002925   | 0.06508583683050075  | 1.2365586975931682 | 0.2982182763324963     | 5.665816425477371  |
    | mha          | 0.0     | 0.0    |            | 0.0    |             | 0.018872260358001765   | 0.01870937360499738  | 0.9913689854890476 | 0.02688384014700477    | 1.424516175435558  |
    | rnn          | 0.0     | 0.0    |            | 0.0    |             | 0.01576623683249454    | 0.02184348723049516  | 1.3854597937711604 | 0.10178373254250346    | 6.455803856296582  |
    
    (*) This report was not generated on a machine with CUDA, so memory was not measured and the memory columns show 0.0. I will update the report once the benchmarks run in CI on a GPU machine.
    
    Using the report and section 3 in the [paper](https://arxiv.org/pdf/2109.12298.pdf), I parameterised the runtime and memory thresholds for different layers.
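The threshold check described above could look roughly like the sketch below. It recomputes the `gsm/control` runtime ratio from a few rows of the report and compares it with a per-layer upper bound. The names `overhead_ratio`, `check_thresholds`, and `HYPOTHETICAL_MAX_RATIO` are illustrative assumptions, and the bounds are placeholder values rather than the thresholds actually parameterised in this PR.

```python
from __future__ import annotations

# Runtime values copied from three rows of the report above.
REPORT_RUNTIME = {
    "conv": {"control": 2.021756922606001, "gsm": 3.2889059911645036},
    "mha": {"control": 0.018872260358001765, "gsm": 0.02688384014700477},
    "rnn": {"control": 0.01576623683249454, "gsm": 0.10178373254250346},
}

# Hypothetical upper bounds on gsm/control; placeholders for illustration,
# not the thresholds chosen in the PR.
HYPOTHETICAL_MAX_RATIO = {"conv": 2.0, "mha": 2.0, "rnn": 8.0}


def overhead_ratio(control: float, variant: float) -> float:
    """Return how many times slower the variant is than the control layer."""
    if control <= 0:
        raise ValueError("control runtime must be positive")
    return variant / control


def check_thresholds(
    report: dict[str, dict[str, float]],
    max_ratio: dict[str, float],
    variant: str = "gsm",
) -> list[str]:
    """Return the layers whose variant/control ratio exceeds its bound."""
    failures = []
    for layer, runtimes in report.items():
        ratio = overhead_ratio(runtimes["control"], runtimes[variant])
        if ratio > max_ratio[layer]:
            failures.append(layer)
    return failures


if __name__ == "__main__":
    failed = check_thresholds(REPORT_RUNTIME, HYPOTHETICAL_MAX_RATIO)
    print("layers over threshold:", failed)
```

In a CI job, a non-empty result from `check_thresholds` would fail the build. Keeping one bound per layer lets recurrent layers, which show the largest overhead in the report, carry a looser limit than layers such as `mha`.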
    
    ## How Has This Been Tested (if it applies)
    I ran the jobs locally and generated reports.
    
    ## Checklist
    
    - [X] The documentation is up-to-date with the changes I made.
    - [X] I have read the **CONTRIBUTING** document and completed the CLA (see **CONTRIBUTING**).
    - [ ] All tests passed, and additional code has been covered with new tests.
    
    Pull Request resolved: pytorch#479
    
    Differential Revision: D38999201
    
    Pulled By: moaradwan
    
    fbshipit-source-id: 3d02931970e39ea331674c9f0676db9e22c5edaa
    Attia Radwan authored and facebook-github-bot committed Aug 25, 2022