Create README for distributed benchmark (pytorch#1183)
Summary:
Add readme so distributed benchmark is easy to run and understand.

Pull Request resolved: pytorch#1183

Reviewed By: xuzhao9

Differential Revision: D39559898

Pulled By: erichan1

fbshipit-source-id: 2a77a72e3f03dd5acb2ad388412a0a1ef65e0a64
erichan1 authored and facebook-github-bot committed Sep 16, 2022
1 parent 5ad5672 commit 67c6d71
Showing 1 changed file with 11 additions and 0 deletions: userbenchmark/distributed/README.md
@@ -0,0 +1,11 @@
This is a benchmark for measuring PyTorch Distributed performance.

An example run command follows. Results are output as a JSON file in the --job_dir folder.
```
python run_benchmark.py distributed --ngpus 8 --nodes 1 --model torchbenchmark.e2e_models.hf_bert.Model --trainer torchbenchmark.util.distributed.trainer.Trainer --distributed ddp --job_dir $PWD/.userbenchmark/distributed/e2e_hf_bert --profiler False
```
Supported options (not-exhaustive):
* --model {torchbenchmark.e2e_models.hf_bert.Model, torchbenchmark.e2e_models.hf_t5.Model}
* --distributed {ddp, fsdp, deepspeed, none}
* --profiler {True, False}
 * If set to True, saves one trace per GPU into the --job_dir folder.
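Since results land as JSON in --job_dir, they can be inspected with standard tooling after a run. A minimal sketch: the helper name and the results schema here are hypothetical (this README does not specify the file names or keys), so the function just returns whatever JSON files it finds.

```python
import json
import pathlib

def load_results(job_dir):
    """Load every JSON results file found in a benchmark's --job_dir.

    The file names and schema are not documented here, so this simply
    maps each file name to its parsed contents for manual inspection.
    """
    out = {}
    for path in sorted(pathlib.Path(job_dir).glob("*.json")):
        out[path.name] = json.loads(path.read_text())
    return out
```

For example, after the run command above, `load_results(".userbenchmark/distributed/e2e_hf_bert")` would return the parsed benchmark output keyed by file name.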

