add reproducible compilation environment #3943
Merged
@microsoft-github-policy-service agree
This was referenced Jul 13, 2023
@fecet, thanks for this amazing PR that also includes documentation. We will review right away, apologies for the delay.
loadams reviewed Jul 24, 2023
@@ -0,0 +1,20 @@
channels:
We may want to find somewhere not at the root of the repo to put this file, @mrwyattii - thoughts?
loadams approved these changes Jul 31, 2023
polisettyvarma pushed a commit to polisettyvarma/DeepSpeed that referenced this pull request Aug 7, 2023
* add reproducible compilation environment
* fix ci
* fix typo for formatting check
* Fix casing for format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
Currently, DeepSpeed only verifies the compilation process on Docker, which may not work, or may require privileges users lack, on many clusters. This makes precompiling DeepSpeed ops very challenging, especially since compilation toolchains can vary significantly between systems. Many issues report an inability to compile ops in the reporter's own environment, e.g. #3890, pytorch/pytorch#100557, #3358, #3067, #3944.
Conda-forge provides a cross-platform compilation toolchain that, if we maintain a robust Conda environment, can make precompiled ops available to everyone and solve the issue of reproducibility.
I verified the environment on Arch Linux (CUDA_PATH should be unset first; it is set by the cuda package, https://archlinux.org/packages/extra/x86_64/cuda/) and Ubuntu 20.04, for both pytorch and pytorch-nightly. For pytorch-nightly, DS_BUILD_AIO should be used, as that op does not seem to support C++17 yet (#3944). The parallel build option should also be disabled, as in #2885. The command is:
Result:
nightly
release
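The exact install command from the description is not preserved above, so the following is only a rough sketch of how the stated precautions could be combined. It assumes that "using DS_BUILD_AIO" means setting DS_BUILD_AIO=0 to skip that op, and that the build is a plain `pip install .` with DS_BUILD_OPS=1 and no parallel build flags. The helper name `deepspeed_build_env` is hypothetical and not part of DeepSpeed.

```python
import os


def deepspeed_build_env(base_env=None, nightly=False):
    """Return a copy of the environment prepared for precompiling DeepSpeed ops.

    Hypothetical helper for illustration; not part of DeepSpeed.
    """
    env = dict(os.environ if base_env is None else base_env)
    # The description says CUDA_PATH should be unset first on Arch Linux,
    # where the system cuda package sets it.
    env.pop("CUDA_PATH", None)
    # DS_BUILD_OPS=1 asks DeepSpeed to precompile its ops at install time.
    env["DS_BUILD_OPS"] = "1"
    if nightly:
        # Assumption: skip the AIO op on pytorch-nightly, since the
        # description reports it does not support C++17 yet.
        env["DS_BUILD_AIO"] = "0"
    return env


# Plain install with no parallel build flags (assumed form of the command).
INSTALL_CMD = ["pip", "install", "."]

if __name__ == "__main__":
    env = deepspeed_build_env(nightly=True)
    print({key: env[key] for key in ("DS_BUILD_OPS", "DS_BUILD_AIO")})
    # To actually build, from a DeepSpeed checkout inside the Conda env:
    # subprocess.run(INSTALL_CMD, env=env, check=True)
```

Building the environment as a dictionary keeps the sketch side-effect free; the actual install only happens if the commented `subprocess.run` call is enabled inside the activated Conda environment.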