[Model] PP support for embedding models and update docs #9090
Merged
Commits (23, all by DarkLight1337):
20b63ee Add PP support for embeddings models
3a51d81 Add PP support for embedding model runner
49c8216 Test PP embedding models
d995174 Reduce the number of tests
b385f8b Fix missing PP handling
c6ea4b1 Fix for real
0e1e0b0 Fix wrong default
d139c54 Fix mistral loading
6222fdb Update imports
d5f2e15 Fix weight loading again
b77ab5c Update test definitions
1b9f8d4 Fix import
0b6b931 Up the TP size
519f695 Add missing models to docs
59bf961 Add note for tests
dfad8fb More explicit NOTE
9adc7c0 Fix wrong model name
21212ee Downgrade to note
79f9c4e Fix missing org
6e49835 Remove an L
e19fda6 Consolidate multimodal models
4e5cab5 Update docs/source/models/supported_models.rst
90ecb43 Update supported_models.rst
Conversations
This hardcoded 4 will only work on 8-GPU machines. It might be a bit confusing, so comments should be added explaining how to adjust it as necessary.
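As a rough illustration of the concern, a minimal sketch (not code from this PR; the variable names are assumptions) of why a tensor-parallel size hardcoded to 4, combined with a pipeline-parallel size of 2, assumes an 8-GPU machine:

```python
# Sketch only: tp_size * pp_size GPUs are needed in total, so 4 * 2 = 8.
import torch

tp_size = 4  # hardcoded tensor-parallel degree under discussion
pp_size = 2  # pipeline-parallel degree

required = tp_size * pp_size
available = torch.cuda.device_count()
if available < required:
    # On smaller machines, reduce tp_size (e.g. tp_size = available // pp_size).
    raise RuntimeError(
        f"This configuration needs {required} GPUs "
        f"(tp_size={tp_size} x pp_size={pp_size}), but only {available} are visible.")
```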
We might not need all of them either; a few as examples with some comments may be sufficient, otherwise the list needs to be kept up to date.
I have added some comments in the code. See if it works for you.
The comments look good to me, but would it be possible to run this test with a tp_base of 8? I.e., does this test automatically work with 2 nodes? This doesn't need to block, it's just something to think about.
That requires 16 GPUs to run. What setup did you use to run those models? I have added a comment noting that tp_base is just an indication of the model size and may have to be adjusted further.
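For reference, the GPU arithmetic behind the numbers in this thread, as a small sketch (the pipeline-parallel size of 2 is an assumption consistent with the 8- and 16-GPU figures above; this is not code from the PR):

```python
# Total GPUs scale with tp_base * pp_size.
pp_size = 2

tp_base = 8               # value proposed in the review comment
print(tp_base * pp_size)  # 16 GPUs, i.e. two 8-GPU nodes

tp_base = 4               # default kept in this PR
print(tp_base * pp_size)  # 8 GPUs, a single 8-GPU machine
```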