[Misc][Gaudi] Avoid torch.compile and enable lazy collectives by default for HPU lazy backend #10897
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
0c1a4f9 to 9dc0d44
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
9dc0d44 to a2d7e3a
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
d391dee to a6d9c38
LGTM!
This seems reasonable to me. cc @youkaichao to sign off before merge.
LGTM
…roject#10897) Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Similar to #10747, but applied specifically to the PT HPU lazy backend. While PyTorch for Gaudi supports torch.compile, it currently has to be enabled explicitly, and the best performance is achieved with HPUGraphs instead. This patch disables torch.compile for PT lazy mode and HPUGraphs (HPU execution modes for reference: https://docs.vllm.ai/en/latest/getting_started/gaudi-installation.html#execution-modes) and leaves it on for PT eager/torch.compile mode. Additionally, it automatically sets the flag that enables lazy collectives, which are required for multi-HPU inference with HPUGraphs. This fixes the frequently reported error:
RuntimeError: collective nonSFG is not supported during hpu graph capturing
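For readers who want to see the shape of the logic, here is a minimal sketch of the decision described above. It is not the code from this patch. The environment variable names `PT_HPU_LAZY_MODE` and `PT_HPU_ENABLE_LAZY_COLLECTIVES`, the assumption that lazy mode is the default when the variable is unset, and the helper name `configure_hpu_lazy_backend` are all assumptions made for illustration; the PR description itself does not name them. The sketch operates on a plain mapping so it can run without PyTorch or Gaudi hardware.

```python
import os
from typing import MutableMapping


def configure_hpu_lazy_backend(env: MutableMapping[str, str]) -> bool:
    """Hypothetical helper: decide whether torch.compile should be disabled.

    Returns True when the (assumed) lazy backend is selected, in which case
    the caller would turn torch.compile off. As a side effect, enables lazy
    collectives in `env` unless the user already set the flag.
    """
    # Assumption: "1" selects PT lazy mode and is the default when unset;
    # "0" selects PT eager / torch.compile mode.
    is_lazy = env.get("PT_HPU_LAZY_MODE", "1") == "1"
    if is_lazy:
        # Assumption: this flag enables the lazy collectives needed for
        # multi-HPU inference with HPUGraphs. setdefault keeps an explicit
        # user setting instead of overwriting it.
        env.setdefault("PT_HPU_ENABLE_LAZY_COLLECTIVES", "true")
    return is_lazy


if __name__ == "__main__":
    disable_compile = configure_hpu_lazy_backend(os.environ)
    # In a real process, a True result is where torch.compile would be
    # switched off, for example via torch._dynamo.config.disable = True.
    print("disable torch.compile:", disable_compile)
```

Using `setdefault` rather than a plain assignment is a design choice in this sketch: it treats the flag as a default that an explicit user setting can still override.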