Cache PyTorch source builds to reduce CI time #1500

Merged 2 commits on Oct 18, 2022

@ashay (Collaborator) commented Oct 17, 2022

This PR contains two patches that cut the time spent in out-of-tree
builds that reference the PyTorch source code from about 90 minutes
to about 15 minutes.

  • ci: cache PyTorch source builds

This patch reduces the time spent in regular CI builds by caching
PyTorch source builds. Specifically, this patch:

  1. Makes CI look up the cache entry for the PyTorch commit hash in
    pytorch-version.txt
  2. If the lookup succeeds, CI fetches the previously generated WHL
    file into the build_tools/python/wheelhouse directory
  3. CI sets the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable to `true`
  4. The build_libtorch.sh script then uses the downloaded WHL file
    instead of rebuilding PyTorch
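
The four steps above can be sketched roughly as follows. This is a minimal local simulation, not the workflow code from this PR: the real lookup goes through the CI cache service, whereas here a plain directory stands in for it. The names `cache_key`, `restore_wheel`, and `cache_root` are hypothetical, and hashing the contents of pytorch-version.txt to form the key is an assumption made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

ENV_FLAG = "TM_PYTORCH_INSTALL_WITHOUT_REBUILD"  # variable named in the PR


def cache_key(version_file: Path) -> str:
    """Derive a cache key from the pinned-version file (assumed scheme)."""
    digest = hashlib.sha256(version_file.read_bytes()).hexdigest()
    return f"pytorch-{digest[:16]}"


def restore_wheel(cache_root: Path, key: str, wheelhouse: Path, env: dict) -> bool:
    """Copy cached .whl files into the wheelhouse; return True on a cache hit."""
    entry = cache_root / key
    wheels = sorted(entry.glob("*.whl")) if entry.is_dir() else []
    if not wheels:
        return False  # cache miss: the caller falls back to a full source build
    wheelhouse.mkdir(parents=True, exist_ok=True)
    for wheel in wheels:
        shutil.copy2(wheel, wheelhouse / wheel.name)
    # Signal the build script to install the wheel instead of rebuilding.
    env[ENV_FLAG] = "true"
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        version_file = root / "pytorch-version.txt"
        version_file.write_text("placeholder-version\n")  # made-up contents
        env = {}
        hit = restore_wheel(root / "cache", cache_key(version_file),
                            root / "wheelhouse", env)
        print("cache hit" if hit else "cache miss, rebuild PyTorch")
```

Because the key depends only on the pinned version file, any change to the pin produces a new key and therefore a miss, which forces one full rebuild before the cache is useful again.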
  • ci: warm up PyTorch source cache during daily RollPyTorch action

This patch makes the RollPyTorch action write the updated WHL file to
the cache, so that it can be later retrieved by CI that runs for each
PR. We deliberately add the caching step to the end of the action since
the RollPyTorch action never needs to read from the cache, although
executing this step earlier in the process should not cause problems
either.
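
As a rough illustration of the ordering described above, the sketch below runs the build first and writes the cache entry last, with no read from the cache at any point. It is a local stand-in, not the RollPyTorch workflow itself: `save_wheels`, `roll_pytorch`, and the directory-based cache are hypothetical, and the `build` callback is a placeholder for the real source build.

```python
import shutil
import tempfile
from pathlib import Path


def save_wheels(wheelhouse: Path, cache_root: Path, key: str) -> list:
    """Copy every .whl in the wheelhouse into the cache entry named by key."""
    entry = cache_root / key
    entry.mkdir(parents=True, exist_ok=True)
    saved = []
    for wheel in sorted(wheelhouse.glob("*.whl")):
        shutil.copy2(wheel, entry / wheel.name)
        saved.append(wheel.name)
    return saved


def roll_pytorch(build, wheelhouse: Path, cache_root: Path, key: str) -> list:
    """Build unconditionally, then warm the cache as the final step."""
    wheelhouse.mkdir(parents=True, exist_ok=True)
    build(wheelhouse)  # placeholder for the full PyTorch source build
    return save_wheels(wheelhouse, cache_root, key)
```

Writing the entry last means a failed build never leaves a cache entry behind in this sketch.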

ashay added 2 commits October 17, 2022 09:16
@ashay requested a review from powderluv October 17, 2022 16:21

@ashay (Collaborator, Author) commented Oct 17, 2022

Here is the first run, which took 2 hours to finish. The second run finished in 15 minutes.

Caches don't seem to be shared across branches, so this CI run will perform a full build of PyTorch, but subsequent CI runs should be faster.
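
The behavior observed above matches how GitHub Actions scopes caches: a run can restore entries created on its own branch or on the default branch, but not entries created on a sibling branch. The sketch below is a simplified model of that rule, not GitHub's implementation. It ignores pull request base branches and `restore-keys`, and `CacheEntry` and `lookup` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CacheEntry:
    key: str
    branch: str  # branch whose workflow run created the entry


def lookup(entries: Iterable[CacheEntry], key: str, current_branch: str,
           default_branch: str = "main") -> Optional[CacheEntry]:
    """Return the first entry visible to current_branch, or None on a miss."""
    entries = list(entries)
    # The current branch is searched first, then the default branch.
    for scope in (current_branch, default_branch):
        for entry in entries:
            if entry.key == key and entry.branch == scope:
                return entry
    return None
```

Under this model, an entry written by the daily RollPyTorch action on the default branch is visible to every PR branch, which is why warming the cache there helps the per-PR CI runs.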

@powderluv (Collaborator) left a comment

magical. thanks

@powderluv (Collaborator) commented

The only other piece we can do is tar.gz the LLVM build so that OOT CI can just download it and we're done in a minute or two. We can check the submodule SHA for LLVM.
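
A minimal sketch of this idea, assuming the archive is named after the LLVM submodule SHA so that a changed SHA automatically misses. Nothing here comes from the PR: `archive_name`, `pack_build`, and `unpack_if_cached` are hypothetical helpers, the SHA is passed in as a plain string rather than read from git, and a local directory stands in for wherever CI would store the archive.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_name(llvm_sha: str) -> str:
    """Name the archive after the submodule SHA (assumed naming scheme)."""
    return f"llvm-build-{llvm_sha}.tar.gz"


def pack_build(build_dir: Path, store: Path, llvm_sha: str) -> Path:
    """Compress the build directory into store/llvm-build-<sha>.tar.gz."""
    store.mkdir(parents=True, exist_ok=True)
    archive = store / archive_name(llvm_sha)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(build_dir, arcname="build")
    return archive


def unpack_if_cached(store: Path, dest: Path, llvm_sha: str) -> bool:
    """Extract the archive for this SHA into dest; return False on a miss."""
    archive = store / archive_name(llvm_sha)
    if not archive.is_file():
        return False
    dest.mkdir(parents=True, exist_ok=True)
    # The archive is assumed to be trusted, since CI produced it itself.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return True
```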

@ashay (Collaborator, Author) commented Oct 18, 2022

> tar.gz the LLVM build so OOT CI can just download that

That’s a really neat idea! I’ll try it out in a separate PR.

@ashay merged commit a9942f3 into main Oct 18, 2022
@ashay deleted the ashay/pytorch-cache branch October 18, 2022 05:42
dellis23 pushed a commit that referenced this pull request Oct 25, 2022