
Mllama: update docs #34334

Merged: 4 commits into huggingface:main on Oct 30, 2024
Conversation

zucchini-nlp (Member) commented on Oct 23, 2024:

What does this PR do?

Fixes #34304 and adds info about lm-head resizing. Maybe also fixes #33819?

Comment on lines 41 to 55
```python
# `torch` and a loaded Mllama `model` are assumed from earlier in the doc.
# Fit a multivariate normal to the existing lm-head rows (mean + scaled covariance).
pre_expansion_embeddings = model.language_model.lm_head.weight.data
mu = torch.mean(pre_expansion_embeddings, dim=0).float()
n = pre_expansion_embeddings.size()[0]
sigma = ((pre_expansion_embeddings - mu).T @ (pre_expansion_embeddings - mu)) / n
dist = torch.distributions.multivariate_normal.MultivariateNormal(mu, covariance_matrix=1e-5 * sigma)

num_new_tokens = 1  # 1 for the `"<|image|>"` token
lm_head_weights = model.language_model.lm_head.weight

# Sample one row per new token from the fitted distribution and append it to the lm head.
new_token_embedding = torch.stack(tuple(dist.sample() for _ in range(num_new_tokens)), dim=0).to(device=lm_head_weights.device, dtype=lm_head_weights.dtype)
lm_head_weights.data = torch.cat([lm_head_weights.data, new_token_embedding], dim=0)
model.language_model.lm_head.out_features = lm_head_weights.data.shape[0]  # keep the module's metadata in sync
```
Collaborator commented:

This should already be done internally if you use the correct flag for resize_token_embeddings.
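
As a hedged illustration of that suggestion: on recent transformers versions, resize_token_embeddings accepts a mean_resizing flag that initializes new rows from the mean and covariance of the existing embeddings. The flag's availability and the checkpoint id below are assumptions for illustration, not something stated in this thread.

```python
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Checkpoint id is illustrative.
model_id = "meta-llama/Llama-3.2-11B-Vision"
model = MllamaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Resize the token embeddings; with mean_resizing=True (assumed available in
# your transformers version) the newly added rows are drawn from a distribution
# fitted to the existing weights, so no manual sampling is needed.
model.resize_token_embeddings(len(processor.tokenizer), mean_resizing=True)
```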

zucchini-nlp (author) commented:

From what I see, we don't let users specify which embeddings to resize and use the input embeddings by default. In case the weights are tied (not the case for Mllama), we also resize the output embeddings.

Or do you mean there is another method similar to resize_token_embeddings? I might have overlooked that.
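
A rough sketch of the behaviour described above, simplified and not the actual transformers internals; the helper name sketch_resize is made up for illustration.

```python
import torch.nn as nn

def sketch_resize(model, new_num_tokens: int) -> None:
    # The input embeddings are always the ones being resized.
    old = model.get_input_embeddings()
    new = nn.Embedding(
        new_num_tokens, old.embedding_dim,
        device=old.weight.device, dtype=old.weight.dtype,
    )
    num_to_copy = min(old.num_embeddings, new_num_tokens)
    new.weight.data[:num_to_copy] = old.weight.data[:num_to_copy]
    model.set_input_embeddings(new)

    # The lm head is only affected indirectly: if the weights are tied,
    # re-tying makes it follow the new input embeddings. There is no
    # user-facing switch to resize only the lm head.
    if getattr(model.config, "tie_word_embeddings", False):
        model.tie_weights()
```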

Collaborator commented:

We use the output of get_input_embeddings, which should always be the input embedding, and by default the output embedding from get_output_embeddings is resized when the weights are tied. But you are right, you can't resize only the lm head.

Though there might be some util function you can re-use, no? 🤗 Feel free to merge!

zucchini-nlp (author) commented:

Right, there was a way to hide all the ugly code in private methods.
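
For reference, one way the "ugly code" could be hidden is by reusing a private helper such as PreTrainedModel._get_resized_lm_head. Since it is a private API, its presence and exact signature are an assumption and may differ across transformers versions.

```python
num_new_tokens = 1  # e.g. for the `"<|image|>"` token

# Assumes `model` is an already loaded MllamaForConditionalGeneration and that
# the private `_get_resized_lm_head` helper exists on its language model.
old_lm_head = model.language_model.get_output_embeddings()
new_lm_head = model.language_model._get_resized_lm_head(
    old_lm_head,
    new_num_tokens=old_lm_head.weight.shape[0] + num_new_tokens,
)
model.language_model.set_output_embeddings(new_lm_head)
```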

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp merged commit 0f764a5 into huggingface:main on Oct 30, 2024.
9 checks passed