move is_checkpointable call reducing torch.compile Graph breaks #5759

Merged


NirSonnenschein
Contributor

We encountered a performance issue when running torch.compile on a model that uses
the pipeline engine (Mixtral).
The cause was found to be the is_checkpointable function, which is called in the engine's forward function.
This function creates a graph break under torch.compile, which degrades performance (particularly since it happens in every forward call). We propose changing how is_checkpointable is checked: precompute and store its value before the forward call, then read the stored values in the forward function.
With this change, the graph break in the forward call is avoided, which should lead to better performance under torch.compile.
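
As a rough illustration of the precompute-and-store idea described above, here is a minimal sketch in plain Python. It is not the code from this PR: the names `Layer`, `TinyPipelineModule`, `_is_checkpointable`, and `_checkpointable_cache` are hypothetical stand-ins, and the predicate (whether any layer in a chunk has parameters) is an assumption made only for the example. The point is that the check runs once per chunk at construction time, so forward only reads stored booleans.

```python
from typing import Callable, List, Sequence


class Layer:
    """Hypothetical stand-in for a layer; `has_params` is an assumed attribute."""

    def __init__(self, fn: Callable[[float], float], has_params: bool) -> None:
        self.fn = fn
        self.has_params = has_params

    def __call__(self, x: float) -> float:
        return self.fn(x)


class TinyPipelineModule:
    """Toy module that runs layers in fixed-size chunks (hypothetical)."""

    def __init__(self, layers: Sequence[Layer], interval: int) -> None:
        self.layers: List[Layer] = list(layers)
        self.interval = interval
        self.check_calls = 0        # how often the predicate was evaluated
        self.checkpointed_runs = 0  # how often forward took the checkpoint path
        # Precompute the predicate once per chunk, before any forward call.
        self._checkpointable_cache: List[bool] = [
            self._is_checkpointable(self.layers[start:start + interval])
            for start in range(0, len(self.layers), interval)
        ]

    def _is_checkpointable(self, funcs: Sequence[Layer]) -> bool:
        # Assumed predicate: a chunk is checkpointable if any layer has parameters.
        self.check_calls += 1
        return any(f.has_params for f in funcs)

    def forward(self, x: float) -> float:
        starts = range(0, len(self.layers), self.interval)
        for idx, start in enumerate(starts):
            chunk = self.layers[start:start + self.interval]
            # Read the stored value instead of calling the predicate here.
            if self._checkpointable_cache[idx]:
                # A real engine would route this chunk through activation
                # checkpointing; the sketch only counts the branch.
                self.checkpointed_runs += 1
            for layer in chunk:
                x = layer(x)
        return x
```

In this sketch the cache is built in the constructor, so it would go stale if the layer list changed afterwards; a real implementation would need to recompute it whenever the layers or the chunk interval change.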

@NirSonnenschein NirSonnenschein requested a review from duli2012 as a code owner July 9, 2024 09:23
@NirSonnenschein
Contributor Author

Hi @duli2012,
when you have a moment, would it be possible to review this change?
Thanks

@tjruwase tjruwase requested review from tohtana and umchand and removed request for duli2012 July 15, 2024 10:18
@tjruwase
Contributor

Hi @duli2012, when you have a moment would it be possible to review this change. thanks

@tohtana, will help review.

@NirSonnenschein
Contributor Author

Hi @tohtana,
when you have a moment, would it be possible to review this change?
Thanks

@tohtana
Contributor

tohtana commented Jul 22, 2024

@NirSonnenschein This is a great improvement. Sorry for the delay. I just approved.

@tohtana tohtana enabled auto-merge July 22, 2024 15:49
@loadams
Contributor

loadams commented Jul 23, 2024

The current test failures will be resolved after #5797 completes.

@tohtana tohtana added this pull request to the merge queue Jul 23, 2024
Merged via the queue into microsoft:master with commit 6d0dbf8 Jul 23, 2024
13 checks passed
@NirSonnenschein
Contributor Author

Thanks @tohtana and @loadams


5 participants