
[INF] Enable torch compile for inference #5612

Merged
merged 14 commits into from
Jul 9, 2024

Conversation

oelayan7
Contributor

@oelayan7 oelayan7 commented Jun 4, 2024

No description provided.

@delock
Collaborator

delock commented Jun 5, 2024

Hi @oelayan7, do you have sample code using this feature? Or is it turned on by default? I wonder if I can try it with the CPU accelerator.

@oelayan7
Contributor Author

oelayan7 commented Jun 9, 2024

Hi @delock,
No, it isn't enabled by default; the engine's `.compile()` function needs to be called explicitly.
torch.compile support was originally introduced for ZeRO training, which used a `CompileWrapper`; that wrapper was removed in #5581, where a new approach was introduced.
This change just brings the same approach to inference, as was done in the other PRs.
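To illustrate the opt-in behavior described above, here is a minimal pure-Python sketch. The `InferenceEngine` class, its attributes, and the stand-in wrapping step are all hypothetical; they are not DeepSpeed's actual inference API, only the shape of the "nothing happens until you call `.compile()`" pattern:

```python
# Sketch of the opt-in compile pattern: nothing is wrapped unless the caller
# explicitly invokes .compile(). "InferenceEngine" and "is_compiled" are
# illustrative stand-ins, not DeepSpeed's real class or attribute names.

class InferenceEngine:
    def __init__(self, forward_fn):
        self._forward = forward_fn
        self.is_compiled = False

    def compile(self, backend=None):
        """Explicit opt-in: wrap the forward function once; later calls no-op.

        A real engine would call torch.compile(self._forward, backend=...)
        here; this sketch just re-wraps the callable for illustration.
        """
        if self.is_compiled:
            return
        original = self._forward
        self._forward = lambda *args: original(*args)  # stand-in for torch.compile
        self.is_compiled = True

    def __call__(self, *args):
        return self._forward(*args)


engine = InferenceEngine(lambda x: x * 2)
assert not engine.is_compiled  # default: compilation is NOT applied
engine.compile()               # user must opt in, as described in the PR
print(engine(3))               # 6
```

The design point the comment makes is just this: compilation is a user-triggered step on the engine, not something applied implicitly at construction time.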

@tjruwase tjruwase requested review from tohtana and umchand and removed request for arashb, awan-10 and mrwyattii June 9, 2024 22:45
@oelayan7
Contributor Author

@tjruwase the failing tests are due to an HTTP failure with HF.
Can you please retrigger them? (I couldn't find a way to do that myself.)

@oelayan7
Contributor Author

@tjruwase @tohtana @umchand Can anyone trigger the CI please?

Contributor

@tohtana tohtana left a comment


Thank you @oelayan7 @delock !

I just found that the nvtx decorator breaks the graph and has an impact on performance. Can you add `deepspeed.utils.nvtx.enable_nvtx = False` to `compile()` as in #569?

Overall this looks good to me. Let's merge after you add it.
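The request above describes a common pattern: an instrumentation decorator that consults a module-level flag at call time, so a `compile()` entry point can switch it off and keep range markers out of the captured graph. A minimal sketch follows; the names `enable_nvtx` and `nvtx_range` mirror the idea but are not guaranteed to match DeepSpeed's exact implementation:

```python
# Sketch of a globally disable-able instrumentation decorator. When the flag
# is off, the wrapper calls straight through, so a graph-capturing compiler
# sees no profiling side effects. Names here are illustrative stand-ins.

import functools

enable_nvtx = True  # a compile() entry point would set this to False


def nvtx_range(fn):
    """Wrap fn in a (stand-in) profiling range unless instrumentation is off."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if not enable_nvtx:
            # Instrumentation disabled: skip the range markers entirely.
            return fn(*args, **kwargs)
        # A real implementation would push an NVTX range here, call fn,
        # then pop the range; the sketch omits the CUDA-specific calls.
        return fn(*args, **kwargs)
    return wrapper


@nvtx_range
def forward(x):
    return x + 1


print(forward(1))  # 2, with or without instrumentation enabled
```

The behavior of the decorated function is identical either way; only the profiling side effects (which are what break graph capture) are gated by the flag.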

Contributor

@tohtana tohtana left a comment


Thank you @oelayan7 !

@tohtana tohtana enabled auto-merge June 25, 2024 08:37
@loadams loadams disabled auto-merge June 26, 2024 18:04
@oelayan7
Contributor Author

@loadams Thanks!
I fixed the formatting issue; can you please retrigger?

@loadams loadams enabled auto-merge June 28, 2024 22:01
@oelayan7
Contributor Author

@loadams I don't think the failures are related.

@loadams
Contributor

loadams commented Jul 1, 2024

> @loadams I don't think the failures are related.

@oelayan7 Correct, working on getting this merged despite some hiccups in our tests not related to your PR.

@loadams loadams disabled auto-merge July 3, 2024 15:03
@loadams
Contributor

loadams commented Jul 8, 2024

> > @loadams I don't think the failures are related.
>
> @oelayan7 Correct, working on getting this merged despite some hiccups in our tests not related to your PR.

@oelayan7 - we believe we should be merging all the CI fixes today and will get this merged shortly after that.

@loadams loadams enabled auto-merge July 8, 2024 22:53
@loadams loadams added this pull request to the merge queue Jul 9, 2024
Merged via the queue into microsoft:master with commit 7b1ea22 Jul 9, 2024
13 checks passed
@oelayan7 oelayan7 deleted the inf_compile branch July 9, 2024 04:40
mauryaavinash95 pushed a commit to DataStates/DeepSpeed that referenced this pull request Jul 10, 2024
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>