Infer lmi engine #623
Maybe DS or FT have some mechanism on their end to decide how to do model sharding. I would suggest not checking this.
At least for DS, it will throw an exception if this check fails. But the only practical example of this we have seen is gpt2-xl.
In the future, it's possible that DS and FT change that behavior and can actually accommodate such a model. At that point, this method would become incorrect.
I can remove this, since it will be validated by the engine anyway. But the benefit of doing it this way is that we don't recommend, say, gpt2-xl to run with DeepSpeed with TP when we know it won't work.
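For context, the kind of pre-check being debated can be sketched as below. This is an illustrative assumption, not the actual PR code: the function name `is_tp_compatible` and the specific divisibility rule (attention heads must divide evenly by the tensor-parallel degree) are hypothetical, though gpt2-xl's 25 attention heads are the known reason it cannot be evenly sharded.

```python
# Hypothetical sketch of the pre-check discussed above: verify that a model
# can be evenly sharded across tensor-parallel ranks before recommending
# DeepSpeed/FasterTransformer with TP. The name and rule are assumptions.

def is_tp_compatible(num_attention_heads: int, tensor_parallel_degree: int) -> bool:
    """Return True if attention heads split evenly across TP ranks."""
    return num_attention_heads % tensor_parallel_degree == 0


# gpt2-xl has 25 attention heads, which cannot be split across 2 or 4 ranks,
# so a check like this would reject it before the engine ever throws.
print(is_tp_compatible(25, 4))   # gpt2-xl-like case → False
print(is_tp_compatible(32, 4))   # typical compatible case → True
```

Doing the check up front duplicates the engine's own validation, but it lets the recommendation logic skip configurations that are known to fail rather than surfacing an engine exception later.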