Fix broken metrics in a `DistributedDataParallel` example (#9896)
Fixes a few issues:

* Fixes the same three calls to `dist.all_reduce(train_acc, op=dist.ReduceOp.SUM)` (introduced in #8880) that led to the wrong metrics.
* Avoids the `int(cuda_tensor)` call in every eval iteration to get rid of the D2H synchronization.
* Calls `optimizer.step()` at the end of each training step to release GPU memory for gradients before evaluation loops.
* Minor decorative changes:
  * Adds type annotations.
  * Adds a progress bar for the training loop.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
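The first two fixes can be illustrated with a minimal sketch. It is not the code from this commit: the helper names `count_correct` and `reduce_accuracy` and the toy batches are hypothetical. It assumes each evaluation split keeps its own counter tensor on the device, that each counter gets its own `dist.all_reduce` call, and that the conversion to a Python number happens once after the loop instead of once per batch.

```python
import torch
import torch.distributed as dist


def count_correct(batches, device):
    """Accumulate correct/total counts without a per-batch D2H sync.

    `batches` is a hypothetical iterable of (prediction, target) tensor pairs.
    """
    # The running counter lives on the device as a 0-dim tensor.
    correct = torch.zeros((), dtype=torch.long, device=device)
    total = 0
    for pred, target in batches:
        pred, target = pred.to(device), target.to(device)
        # Tensor addition stays on the device; no int(...) call in the loop.
        correct += (pred == target).sum()
        # numel() reads shape metadata only, so it does not synchronize.
        total += target.numel()
    return correct, torch.tensor(total, dtype=torch.long, device=device)


def reduce_accuracy(correct, total):
    """Sum this split's own counters across ranks, then convert once."""
    # In a single-process run no process group exists, so the reduce is skipped.
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(correct, op=dist.ReduceOp.SUM)
        dist.all_reduce(total, op=dist.ReduceOp.SUM)
    # The only device-to-host transfers happen here, after the loop.
    return correct.item() / max(total.item(), 1)


# Toy data: 4 of 5 predictions match their targets.
toy_batches = [
    (torch.tensor([1, 0, 1]), torch.tensor([1, 1, 1])),
    (torch.tensor([0, 0]), torch.tensor([0, 0])),
]
correct, total = count_correct(toy_batches, torch.device("cpu"))
accuracy = reduce_accuracy(correct, total)
```

In a multi-split setup, `reduce_accuracy` would be called once per split (train, validation, test) with that split's own tensors, so no split reports another split's reduced value.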
1 parent 5ea6aec, commit 076db84
Showing 1 changed file with 41 additions and 26 deletions.