
Add StableLM #28810

Merged
11 commits merged into huggingface:main on Feb 14, 2024

Conversation

jon-tow (Contributor) commented Feb 1, 2024

What does this PR do?

This PR adds modeling support for StableLM 3B 4E1T (as well as StableLM 2 1.6B) based models.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker

Notes

TODO: The current online implementation uses an early naming scheme for the model_type:

  "model_type": "stablelm_epoch",

I've temporarily created a development model repository, https://huggingface.co/jon-tow/stablelm-3b-4e1t-dev, for unit testing and the config archive mapping; both need to be updated before merging.
Is there a better way to handle this? I've noticed a similar issue in this Phi model PR discussion.
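As a hypothetical sketch of the config migration the TODO describes (field values beyond model_type are illustrative, and "stablelm" is assumed to be the type the integration registers), the hub-side change amounts to renaming the early identifier and dropping any custom-code auto_map entry:

```python
import json

# Hypothetical excerpt of the current hub config, which still uses the
# early naming scheme for model_type (other fields are illustrative).
old_config = {
    "model_type": "stablelm_epoch",
    "hidden_size": 2560,
    "num_hidden_layers": 32,
}

def migrate_model_type(config: dict, new_type: str = "stablelm") -> dict:
    """Return a copy of the config with model_type renamed and any
    custom-code auto_map entry dropped (the modeling code would now
    live in transformers itself)."""
    migrated = dict(config)
    migrated["model_type"] = new_type
    migrated.pop("auto_map", None)
    return migrated

new_config = migrate_model_type(old_config)
print(json.dumps(new_config, indent=2))
```

In practice this would be done as a PR against the hub repo's config.json, as discussed later in this thread.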

@ArthurZucker (Collaborator)

Hey! Thanks for contributing!
We usually recommend starting by adding the model on the Hub, to allow for quick distribution!
See the tutorial here

jon-tow (Author) commented Feb 1, 2024

Hello!

We usually recommend starting by adding the model on the Hub, to allow for quick distribution!

The model is already on the hub here but uses custom modeling code. Is your suggestion to simply rename the model_type in the config.json and remove the custom implementation? Sorry if I'm misinterpreting this!

@ArthurZucker (Collaborator)

No, what I mean is that I think it's fine to keep it on the Hub! 🤗
We usually go for an integration when it's really asked for by the community (lots of activity on the repo, lots of activity on the issue for adding it here, etc.)!
Though it's really great that you want to contribute!
If you still want to add it, I'd recommend making it as close as possible to other modeling files like Llama or Persimmon; otherwise, good that you created a dev repo 👍🏻

@jon-tow jon-tow marked this pull request as ready for review February 6, 2024 23:05
@ArthurZucker (Collaborator) left a comment


Looks already mergeable! Good work there 😉

README.md Outdated Show resolved Hide resolved
src/transformers/models/stablelm/__init__.py Outdated Show resolved Hide resolved
src/transformers/models/stablelm/__init__.py Outdated Show resolved Hide resolved
src/transformers/models/stablelm/modeling_stablelm.py Outdated Show resolved Hide resolved
jon-tow (Author) commented Feb 8, 2024

Hi, @ArthurZucker; thanks for the quick review! I'd like to point out that the recent commit 097272f removes a copied-from comment from `StableLmModel`, because `PersimmonModel` does not yet support flash-attn and the added `_attn_implementation` field breaks the `make repo-consistency` check. Let me know if you can suggest a workaround 🙏
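For readers unfamiliar with the check: `make repo-consistency` verifies that code marked with a `# Copied from` comment stays identical to its reference. The toy sketch below (not the actual `utils/check_copies.py` logic; the function names and bodies are invented) shows why adding an extra code path to the copy trips the check:

```python
def persimmon_forward(x):
    # Reference implementation: a single plain code path.
    return x * 2

# "Copied from" persimmon_forward, then modified to add an extra branch
# (standing in for the flash-attn / _attn_implementation handling);
# this divergence is what the consistency check reports.
def stablelm_forward(x):
    if x < 0:  # hypothetical extra branch the reference lacks
        return 0
    return x * 2

def bodies_match(ref, copy) -> bool:
    """Compare compiled code objects as a cheap stand-in for the textual
    comparison the real consistency check performs."""
    return (ref.__code__.co_code == copy.__code__.co_code
            and ref.__code__.co_consts == copy.__code__.co_consts)

print(bodies_match(persimmon_forward, stablelm_forward))  # prints False
```

The usual workarounds are to drop the copied-from marker (as this commit does) or to upstream the new feature to the reference model first.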

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment


Great work there!
Looks good to me; would you just mind me merging #27931 first?

This would mean you might have to use copied-from Mistral instead of Llama. Otherwise I'll merge this one and rebase on my side!

src/transformers/models/auto/configuration_auto.py Outdated Show resolved Hide resolved
logger = logging.get_logger(__name__)

STABLELM_PRETRAINED_CONFIG_ARCHIVE_MAP = {
"jon-tow/stablelm-3b-4e1t-dev": "https://huggingface.co/jon-tow/stablelm-3b-4e1t-dev/resolve/main/config.json",
Collaborator


let's not forget to use the original repo here! (opening a PR to the repo to upload the new config, etc.)

jon-tow (Author) commented Feb 8, 2024


Thanks for the reminder! I've opened draft PRs for the base models:

  1. https://huggingface.co/stabilityai/stablelm-3b-4e1t/discussions/10
  2. https://huggingface.co/stabilityai/stablelm-2-1_6b/discussions/6

At what point should these be merged? I assume after the next release of transformers?

Collaborator


Yes!

src/transformers/models/stablelm/configuration_stablelm.py Outdated Show resolved Hide resolved
src/transformers/models/stablelm/configuration_stablelm.py Outdated Show resolved Hide resolved
src/transformers/models/stablelm/modeling_stablelm.py Outdated Show resolved Hide resolved
@ArthurZucker (Collaborator)

Could you rebase on main and make sure CIs are all green? 🤗 I can help if you can't finish all of them
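The requested workflow, sketched in a throwaway repository (the file names and commit messages are illustrative); on the real fork one would instead `git fetch upstream`, `git rebase upstream/main`, and `git push --force-with-lease`:

```shell
# Simulate "main moved ahead while the feature branch was in flight",
# then replay the feature commit on top of the new main.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email "dev@example.com"   # throwaway identity for the demo
git config user.name  "dev"
echo base > model.py
git add model.py && git commit -qm "initial commit"
git checkout -qb add-stablelm             # feature branch
echo "class StableLmModel: pass" >> model.py
git commit -qam "Add StableLM"
git checkout -q main                      # meanwhile, main moves ahead
echo "# attention refactor" > other.py
git add other.py && git commit -qm "refactor attention"
git checkout -q add-stablelm
git rebase -q main                        # feature commit now sits above main
git log --oneline
```

After the rebase, the feature branch contains main's new commit plus the replayed feature commit, which is what lets the CI run against the current main.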

jon-tow (Author) commented Feb 12, 2024

Can you please help with the test_hubs workflow? It errors with `FAILED tests/trainer/test_trainer.py::TrainerIntegrationWithHubTester::test_push_to_hub_with_saves_each_epoch - AssertionError: 'Training in progress, epoch 1' not found in ['Training in progress, epoch 3', 'Training in progress, epoch 2', 'initial commit']` (here). Not sure how to fix this one up 😅 Thanks!

@jon-tow jon-tow changed the title from "[WIP] Add StableLM" to "Add StableLM" on Feb 13, 2024
@jon-tow jon-tow force-pushed the add-stablelm branch 2 times, most recently from 9986ad1 to 6a6a0ca, on February 13, 2024 04:34
@ArthurZucker (Collaborator) left a comment


Double checked, LGTM!

@ArthurZucker ArthurZucker merged commit de6029a into huggingface:main Feb 14, 2024
21 of 22 checks passed
sbucaille pushed a commit to sbucaille/transformers that referenced this pull request Feb 14, 2024
* Add `StableLM`

* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`

* fix: re-add changes to address comments

* fix(readme): add links to paper

* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref

* fix(tests): re-add `@slow` decorator to integration tests

* fix(tests): import slow...

* fix(readme_hd): remove whitespace edit

* fix(tokenizer): auto tokenizer tuple

* skip doctests for `modeling_stablelm`
itazap pushed a commit that referenced this pull request May 14, 2024
3 participants