Update DDP for `torch.distributed.run` with `gloo` backend #3680
Merged
Commits
35 commits
007902e Update DDP for `torch.distributed.run` (glenn-jocher)
9bcb4ad Add LOCAL_RANK (glenn-jocher)
b32bae0 remove opt.local_rank (glenn-jocher)
b467501 backend="gloo|nccl" (glenn-jocher)
c886538 print (glenn-jocher)
5d847dc print (glenn-jocher)
26d0ecf debug (glenn-jocher)
832ba4c debug (glenn-jocher)
9a1bb01 os.getenv (glenn-jocher)
0e912df gloo (glenn-jocher)
5f5e428 gloo (glenn-jocher)
e8493c6 gloo (glenn-jocher)
fb342fc cleanup (glenn-jocher)
382ce4f fix getenv (glenn-jocher)
b09b415 cleanup (glenn-jocher)
9c4ac05 cleanup destroy (glenn-jocher)
8ae9ea1 try nccl (glenn-jocher)
a18f933 merge master (glenn-jocher)
2435775 return opt (glenn-jocher)
56a4ab4 add --local_rank (glenn-jocher)
c4d839b add timeout (glenn-jocher)
0584e7e add init_method (glenn-jocher)
d917341 gloo (glenn-jocher)
6a1cc64 move destroy (glenn-jocher)
3581c76 move destroy (glenn-jocher)
5f5d122 move print(opt) under if RANK (glenn-jocher)
5451fc2 destroy only RANK 0 (glenn-jocher)
9aa229e move destroy inside train() (glenn-jocher)
94363ce restore destroy outside train() (glenn-jocher)
9647379 update print(opt) (glenn-jocher)
cb8395d merge master (glenn-jocher)
96686fd cleanup (glenn-jocher)
446c610 nccl (glenn-jocher)
49bb0b7 gloo with 60 second timeout (glenn-jocher)
b5decde update namespace printing (glenn-jocher)
Commit e8493c6065c27b6dd36a521aa372a9e9d115fd0b: gloo
`nccl` should be the faster backend for DDP. I recall that Windows only supports `gloo`, however.
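To make the trade-off concrete, here is a minimal sketch of backend selection that prefers `nccl` and falls back to `gloo`. It is not the code merged in this PR. `choose_backend` and `init_ddp` are hypothetical helper names, and the 60-second default only mirrors the "gloo with 60 second timeout" commit message. The sketch assumes the script is launched with `torch.distributed.run`, which exports `LOCAL_RANK`, `RANK` and `WORLD_SIZE` as environment variables.

```python
import os
import platform
from datetime import timedelta

# Ranks exported by torch.distributed.run; -1 means "not launched under DDP".
LOCAL_RANK = int(os.getenv("LOCAL_RANK", -1))
RANK = int(os.getenv("RANK", -1))
WORLD_SIZE = int(os.getenv("WORLD_SIZE", 1))


def choose_backend(nccl_available: bool, system: str = platform.system()) -> str:
    """Hypothetical helper: prefer nccl, fall back to gloo on Windows or without NCCL."""
    if system == "Windows" or not nccl_available:
        return "gloo"
    return "nccl"


def init_ddp(timeout_s: int = 60) -> None:
    """Hypothetical helper: initialise the default process group for a DDP run."""
    if LOCAL_RANK == -1:
        return  # single-process run, nothing to initialise

    # Imported lazily so the selection logic above can be used without torch.
    import torch
    import torch.distributed as dist

    backend = choose_backend(dist.is_nccl_available())
    if backend == "nccl":
        torch.cuda.set_device(LOCAL_RANK)  # nccl expects one GPU per process
    # The default init_method ("env://") reads the rendezvous settings from the environment.
    dist.init_process_group(backend=backend, timeout=timedelta(seconds=timeout_s))
```

Keeping the decision in a pure function makes the fallback easy to check without a GPU or a multi-process launch.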