Closed
Description
We need a baseline for benchmarking distributed training. The baseline is:
- Select a model (e.g. vgg16)
- Data Parallelism, sync SGD
- Same cluster hardware
The variables to compare are:
- Different frameworks (PaddlePaddle vs. others)
- The number of nodes
- Batch size
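As a rough sketch of how the fixed baseline and the three variables could be combined into a run matrix, something like the following might work. The framework labels, node counts, and batch sizes are placeholder values chosen for illustration; none of them come from this issue, and `benchmark_grid` is a hypothetical helper name.

```python
from itertools import product

# Placeholder values for illustration only; the issue does not specify them.
FRAMEWORKS = ["paddlepaddle-v1", "paddlepaddle-fluid", "other-framework"]
NODE_COUNTS = [1, 2, 4, 8]
BATCH_SIZES = [32, 64, 128]

# Fixed parts of the baseline, taken from the description above.
BASELINE = {"model": "vgg16", "parallelism": "data", "optimizer": "sync_sgd"}


def benchmark_grid():
    """Return one config dict per (framework, nodes, batch size) combination."""
    return [
        {**BASELINE, "framework": fw, "nodes": n, "batch_size": bs}
        for fw, n, bs in product(FRAMEWORKS, NODE_COUNTS, BATCH_SIZES)
    ]


if __name__ == "__main__":
    for cfg in benchmark_grid():
        print(cfg)
```

Every run in the grid shares the same model and training mode, so differences in measured throughput can be attributed to the framework, node count, or batch size.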
The performance baseline can be used to judge our framework's distributed training performance against other frameworks. Both v1 and fluid should be tested.
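One possible way to make results comparable across frameworks and node counts is to report speedup and scaling efficiency relative to a single-node run of the same framework. The sketch below assumes throughput is measured in samples per second; the metric choice and the function names are assumptions, not something this issue prescribes.

```python
def speedup(throughput, single_node_throughput):
    """Ratio of multi-node throughput to single-node throughput."""
    if single_node_throughput <= 0:
        raise ValueError("single_node_throughput must be positive")
    return throughput / single_node_throughput


def scaling_efficiency(throughput, single_node_throughput, nodes):
    """Speedup divided by node count; 1.0 means perfectly linear scaling."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(throughput, single_node_throughput) / nodes
```

Efficiency well below 1.0 at higher node counts would suggest that communication or synchronization overhead from sync SGD is limiting scaling.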