
Config: unified logic to retrieve text config #33219

Merged
merged 17 commits into huggingface:main from gante:unified_decoder_config
Sep 4, 2024

Conversation

gante
Member

@gante gante commented Aug 30, 2024

What does this PR do?

Expected in #32673, #33212

This PR unifies how we access the text config of composite models into a single function, instead of having every call site look for the right key. This avoids the recurring pattern of `foo if hasattr(...) else bar`.

[Previously discussed with @amyeroberts here]
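For context, a minimal sketch of the call-site pattern this replaces. The stand-in objects and the `text_config` attribute name are illustrative assumptions, not the PR's exact implementation:

```python
# Toy illustration of the pattern being removed; `model` is a stand-in, not a real checkpoint.
from types import SimpleNamespace

model = SimpleNamespace(config=SimpleNamespace(text_config=SimpleNamespace(vocab_size=32000)))

# Before: each call site guesses where the text config lives.
vocab_size = (
    model.config.text_config.vocab_size
    if hasattr(model.config, "text_config")
    else getattr(model.config, "vocab_size", None)
)

# After (with this PR): a single helper on the config resolves the right sub-config,
# e.g. `model.config.get_text_config().vocab_size`.
print(vocab_size)  # 32000
```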

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante force-pushed the unified_decoder_config branch from 9dd336d to df20b61 on August 30, 2024 17:34
@gante gante changed the title Config: unified logic to retrieve decoder config Config: unified logic to retrieve text config Aug 30, 2024
@gante gante marked this pull request as ready for review August 30, 2024 18:36
@gante gante force-pushed the unified_decoder_config branch from 807c582 to d456788 on August 30, 2024 18:54
@gante gante force-pushed the unified_decoder_config branch from 005297c to 5531106 on August 31, 2024 08:04
Comment on lines +1750 to +1751
if not self.test_resize_embeddings:
self.skipTest(reason="test_resize_embeddings is set to `False`")
gante (Member Author)

(moved the skip here: no point in spending compute if we are going to skip the test)

@@ -675,14 +675,12 @@ def validate_test_components(test_case, task, model, tokenizer, processor):
# Avoid `IndexError` in embedding layers
CONFIG_WITHOUT_VOCAB_SIZE = ["CanineConfig"]
if tokenizer is not None:
config_vocab_size = getattr(model.config, "vocab_size", None)
# Removing `decoder=True` in `get_text_config` can lead to conflicting values e.g. in MusicGen
gante (Member Author)

Alternatively, we could add a flag to either a) return the first valid match, or b) return all matches when there is more than one.
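A hypothetical sketch of what option (b) could look like — the function name, candidate attribute list, and return shape are assumptions, not anything proposed or merged in this PR:

```python
def get_all_text_configs(config):
    # Assumed candidate names -- not the exact list used by the PR.
    candidate_names = ("text_config", "text_encoder", "decoder", "text_decoder")
    matches = [getattr(config, name) for name in candidate_names if hasattr(config, name)]
    # A MusicGen-style composite config would return both its text encoder and decoder configs.
    return matches if matches else [config]
```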

@zucchini-nlp zucchini-nlp (Member) left a comment

Super cool, thanks for making the code cleaner! One tiny question about the `decoder` arg: it's not clear to me why it's needed.

Returns the config that is meant to be used with text IO. On most models, it is the original config instance
itself. On specific composite models, it is under a set of valid names.

If `decoder` is set to `True`, then only search for decoder config names.
Member

for my own understanding, what does it mean to search in "decoder config names"? Is it somehow related to a model being an encoder-decoder or decoder-only?

From what I see, `text_encoder` is used in MusicGen only, and we never use `decoder=False` in transformers.

gante (Member Author)

It is indeed mostly for musicgen at the moment. Needing/wanting to use the encoder text config is uncommon, but there is at least one pair of use cases:

  1. resize_token_embeddings (has decoder=False)
  2. The tests for resize_token_embeddings
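To illustrate the distinction with a self-contained sketch — the attribute names, the toy config, and the fallback behavior are assumptions; the merged helper may differ:

```python
# Toy sketch of the `decoder` flag (not the PR's exact implementation).
from types import SimpleNamespace

def get_text_config_sketch(config, decoder=False):
    # Assumed candidate names and precedence.
    decoder_names = ("text_config", "decoder", "text_decoder")
    names = decoder_names if decoder else decoder_names + ("text_encoder",)
    matches = [getattr(config, name) for name in names if hasattr(config, name)]
    if len(matches) == 1:
        return matches[0]
    return config  # simplified: the real helper handles the ambiguous case explicitly

# A MusicGen-like composite: a text encoder feeding a language-model decoder.
musicgen_like = SimpleNamespace(
    text_encoder=SimpleNamespace(vocab_size=32128),  # e.g. a T5-style text encoder
    decoder=SimpleNamespace(vocab_size=2048),        # the LM decoder used for generation
)

# decoder=True only searches decoder-side names, so generation code gets the LM config:
print(get_text_config_sketch(musicgen_like, decoder=True).vocab_size)  # 2048

# decoder=False also considers `text_encoder` (what resize_token_embeddings and its tests
# need); here both names match, which is exactly the ambiguity discussed in this thread,
# and the simplified sketch just falls back to the parent config.
print(get_text_config_sketch(musicgen_like) is musicgen_like)  # True
```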

@LysandreJik LysandreJik (Member) left a comment

Thanks for the PR @gante

Comment on lines 1082 to 1083
f"Multiple valid text configs were found in the model config: {valid_text_config_names}. "
"Either don't use `get_text_config`, as it is ambiguous -- access the text configs directly."
Member

I don't entirely understand what this comment hints at doing 😅

gante (Member Author)

Indeed, I've updated the code but left this exception halfway 😅

@gante gante merged commit d750b50 into huggingface:main Sep 4, 2024
23 checks passed
@gante gante deleted the unified_decoder_config branch September 4, 2024 11:03
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024