[Model] Add PLaMo2 #14323
Merged
43 commits
b7fb14d Add PLaMo2 model at v0.6.3.post1 (Alnusjaponica)
e58e384 Follow-up to the latest based on Jamba implementaion (Alnusjaponica)
b0f101e Modify interfaces (Alnusjaponica)
a783e31 Add workaround to use IsHybrid interface (Alnusjaponica)
5ffec2c Update dependencies for test (Alnusjaponica)
49dd3b0 Add test for plamo2 model (Alnusjaponica)
68d3bed Modify code comment (Alnusjaponica)
bee8035 Resolve mypy error (Alnusjaponica)
1058777 Add plamo to test_registry (Alnusjaponica)
80a9abb Merge branch 'main' into add-plamo2 (Alnusjaponica)
e86e46f pip-compile (Alnusjaponica)
7659755 pip-compile (Alnusjaponica)
e394371 Add workarounds to hundle the difference in config assumptions (Alnusjaponica)
9d7efcc Make workaround simple (Alnusjaponica)
b0b222e Merge branch 'main' into add-plamo2 (Alnusjaponica)
121ab1d Merge branch 'main' into add-plamo2 (Alnusjaponica)
f4a6ac1 yapf (Alnusjaponica)
9e01348 Added PLaMo to docs (Alnusjaponica)
d051b1f Set trust_remote_code=true for PLaMo in the test (Alnusjaponica)
1a7111b Clean-up unused lines (Alnusjaponica)
b318d0f Revert renaming final norm component on loading model (Alnusjaponica)
d8df40d Clean-up PlamoConfig (Alnusjaponica)
a36caaf Revert PlamoDecoder for class structure consistency with transformers (Alnusjaponica)
7cbdc8c Rename PlamoDecoder to Plamo2Decoder (Alnusjaponica)
4368f63 Revert Plamo2DecoderLayer for consistency with transformers (Alnusjaponica)
a451011 Drop Plamo2MoE for consistency with transformers implementaion (Alnusjaponica)
256957f Minimize model's member renaming (Alnusjaponica)
0f9f140 Move causal-conv1d installation to buildkite config (Alnusjaponica)
dc50e0a Simplefy DenseMLP (Alnusjaponica)
c6adb46 Stop specifying use_mamba_kernels=False as a mamba kernel is installe… (Alnusjaponica)
83f6be5 Remove nn.Linear for quantization support (Alnusjaponica)
81a1954 Properly pass prefixes (Alnusjaponica)
0ed0042 Stop using float16 when dtype=auto is specified. (Alnusjaponica)
63283c1 Revert "Stop using float16 when dtype=auto is specified." (Alnusjaponica)
9ac51c5 Merge branch 'main' into add-plamo2 (Alnusjaponica)
19fcd5f Handle dtype for plamo2 in config (Alnusjaponica)
3f44675 Update object names to plamo2-prefixed (Alnusjaponica)
f43d02a Update object names to plamo2-prefixed in the tests (Alnusjaponica)
2f3bed1 Merge branch 'main' into add-plamo2 (Alnusjaponica)
7b41a18 Fix Plamo2ForCausalLM class name (Alnusjaponica)
f5bf80a Merge branch 'main' into add-plamo2 (Alnusjaponica)
313e050 Merge branch 'main' into add-plamo2 (Alnusjaponica)
0c8fb36 Split plamo2 initialization test for debugging purpose (Alnusjaponica)
In https://huggingface.co/pfnet/plamo-2-1b/blob/main/config.json the architecture is `PlamoForCausalLM` instead of `Plamo2ForCausalLM`. Is this a mistake?
Actually, we have another `PlamoForCausalLM` with a different architecture (https://huggingface.co/pfnet/plamo-100b), and I used `Plamo2ForCausalLM` in vLLM to avoid misunderstandings. If it is necessary to use the same class name, I can ask our pre-training team if it's possible to rename the config. I apologize for any confusion.
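To make the naming question concrete, here is a minimal sketch of an architecture-name lookup. It assumes a loader that selects the implementation from the `architectures` field of `config.json`. The `MODEL_REGISTRY` dict, the `resolve_architecture` helper, and both sample configs are hypothetical illustrations, not vLLM's actual registry API or the published config files.

```python
import json

# Hypothetical registry: architecture name -> (module name, class name).
# Only the PLaMo2-specific name is registered, mirroring the choice
# described in the comment above.
MODEL_REGISTRY = {
    "Plamo2ForCausalLM": ("plamo2", "Plamo2ForCausalLM"),
}


def resolve_architecture(config_text: str) -> tuple[str, str]:
    """Return the (module, class) pair for the first registered architecture."""
    config = json.loads(config_text)
    architectures = config.get("architectures", [])
    for name in architectures:
        if name in MODEL_REGISTRY:
            return MODEL_REGISTRY[name]
    raise ValueError(f"No registered implementation for {architectures}")


# Illustrative config that still declares the older class name.
old_config = '{"architectures": ["PlamoForCausalLM"]}'
# Illustrative config that declares the PLaMo2-specific class name.
new_config = '{"architectures": ["Plamo2ForCausalLM"]}'
```

Under these assumptions, `resolve_architecture(new_config)` finds the PLaMo2 entry, while `resolve_architecture(old_config)` raises `ValueError`. That is why either the registered name or the published config would need to change for the two to agree.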
IMO this is a fairly minor thing but users will see a warning message like the following, which would be nice to fix:
You are using a model of type plamo2 to instantiate a model of type plamo. This is not supported for all configurations of models and can yield errors.
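A warning of this shape typically comes from a consistency check between the `model_type` stored in a checkpoint's config and the `model_type` attribute of the config class used to load it. The following is a minimal, self-contained sketch of such a check, assuming that behavior. `PlamoLikeConfig` and `check_model_type` are hypothetical names for illustration, and this is not the library's actual code.

```python
import logging

logger = logging.getLogger("config_check")


class PlamoLikeConfig:
    """Hypothetical stand-in for a config class whose model_type is 'plamo'."""

    model_type = "plamo"


def check_model_type(config_cls: type, config_dict: dict) -> bool:
    """Return True if the model types agree; otherwise log a warning and return False."""
    declared = config_dict.get("model_type")
    expected = getattr(config_cls, "model_type", None)
    if declared is not None and expected is not None and declared != expected:
        logger.warning(
            "You are using a model of type %s to instantiate a model of type %s. "
            "This is not supported for all configurations of models and can yield errors.",
            declared,
            expected,
        )
        return False
    return True
```

With this sketch, `check_model_type(PlamoLikeConfig, {"model_type": "plamo2"})` logs the warning and returns `False`. The warning disappears once the class and the config agree on the model type.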
Thanks for your comment. We internally agreed to change the class name to `Plamo2ForCausalLM`, so I'll be updating it here after our public models are updated.