add CFG for .generate() #24654
Conversation
Force-pushed d3d6a59 to 8141f46 (compare)
LGTM 👍
There are three missing bits before we can move forward:
1 - (I can take this one) Adapt MusicGen and merge its PR before this one.
2 - Add a doc test using generate in UnbatchedClassifierFreeGuidanceLogitsProcessor (example).
3 - Add a simple unit test (example; you may need to load a dummy model, example here).
Let me know if you'd like further pointers 🙌
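For readers following along: the core operation classifier-free guidance applies to the logits is an interpolation between conditional and unconditional scores. A minimal, framework-free sketch (the function name and list-based types are illustrative, not the PR's actual tensor implementation):

```python
def cfg_combine(cond_logits, uncond_logits, guidance_scale):
    """Classifier-free guidance: push scores away from the unconditional
    distribution by a factor of `guidance_scale`.
    A scale of 1.0 reduces to the plain conditional logits."""
    return [
        u + guidance_scale * (c - u)
        for c, u in zip(cond_logits, uncond_logits)
    ]
```

With `guidance_scale > 1`, tokens the conditional model prefers over the unconditional model get boosted further, which is why the docstring later in this thread requires `cfg_scale > 1` when a negative prompt is set.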
Looks cool already @Vermeille! Left some questions about the implementation and default values below
self.out = self.model(
    input_ids[:, -1:],
    attention_mask=torch.ones_like(input_ids[:, -1:]),
    use_cache=True,
Pinning use_cache=True could be unexpected for the user: if they have set model.config.use_cache=False, the model would still silently use the k/v cache in the forward pass for the unconditional logits (higher memory usage, and so potentially silent OOMs).
Ditto. Any proposed way to fix it?
Possibly we could restrict the CFG logits processor to only run if use_cache is set to True? In a similar way to how we do for assisted_generation:
transformers/src/transformers/generation/utils.py, lines 1494 to 1495 in 4957294:
if not model_kwargs["use_cache"]:
    raise ValueError("assisted generate requires `use_cache=True`")
This is not ideal though, since users should have the option of running the model with/without cache as they desire.
Perhaps instead we could add an attribute self.use_cache to the init of the logits processor, and set this based on whether the user has set use_cache to True / False? Along the lines of:
class CFGLogitsProcessor(LogitsProcessor):
def __init__(..., use_cache):
...
self.use_cache = use_cache
-> we'll set use_cache=model_kwargs["use_cache"] to get the correct argument here when we instantiate the logits processor.
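Sketching the two options discussed above in one place (the class name and validation behaviour here are hypothetical, not the merged implementation): the processor receives use_cache at init, taken from model_kwargs["use_cache"], and fails loudly rather than silently overriding the user's setting.

```python
class CFGLogitsProcessorSketch:
    """Hypothetical sketch: `use_cache` would be populated from
    model_kwargs["use_cache"] when generate() builds the processor list."""

    def __init__(self, guidance_scale, model, use_cache=True):
        if guidance_scale <= 1:
            # CFG only makes sense when the conditional logits are boosted
            raise ValueError("`guidance_scale` must be > 1 for CFG")
        if not use_cache:
            # mirror the assisted-generation pattern: raise instead of
            # silently enabling the k/v cache behind the user's back
            raise ValueError("CFG requires `use_cache=True`")
        self.guidance_scale = guidance_scale
        self.model = model
        self.use_cache = use_cache
```

This matches the compromise reached later in the thread: require the cache for now, and revert if needed.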
WDYT @Vermeille @gante?
I agree with requiring the use of cache to alleviate the implementation complexity -- we can revert it in the future if needed.
In practical terms, all modern models use a cache anyways ;)
@Vermeille -- @sanchit-gandhi raises good points about the attention mask and taking the first item of the batch in the unconditional logits. As it stands, it will only work with batch size = 1, and our logits processors should be flexible wrt batch size :)
Good catch with the batch size! As for the attention mask, could you guide me to a working solution? I'm quite unfamiliar with the Hugging Face internals, tbh.
Force-pushed 6d89135 to 8e92e30 (compare)
Tests are on the way.
All right, we only need to address use_cache / attention_mask. Basically, I think .generate() had to answer the same questions, so you guys will be able to answer them quite easily. I will also need your guidance on the API design to integrate this seamlessly.
@gante I think we're good. The failure looks totally unrelated.
Indeed, no more failing tests.
As I've replied in the dedicated thread, don't worry about the uncached case :) Make sure an exception is thrown, though!
That's a non-issue:
negative_prompt (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
    The negative prompt to use for CFG. Will throw an error if set but `cfg_scale` <= 1 or None. The batch size
    must match the input batch size. If unset, it defaults to the last input token.
negative_prompt_attention_mask (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
    Attention mask for `negative_prompt`.
@Vermeille these two arguments are not part of the method configuration (similarly to the main prompt and its corresponding attention mask), so they have to be part of the signature of generate and passed into _get_logits_processor.
Should be an easy change 🤗
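To illustrate the "defaults to the last input token" behaviour from the docstring above, a list-based sketch (the helper name is made up for illustration; the real code operates on torch tensors, along the lines of input_ids[:, -1:]):

```python
def default_negative_prompt(input_ids):
    """If no negative prompt is given, fall back to just the last token
    of each input row, preserving the batch dimension."""
    return [row[-1:] for row in input_ids]
```

Keeping the per-row slice (rather than indexing the first batch item) is exactly the batch-size flexibility discussed earlier in the thread.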
The documentation is not available anymore as the PR was closed or merged.
Force-pushed 3668958 to ccf4a19 (compare)
Force-pushed 7c749bd to 0d9c1d9 (compare)
@gante looks like we're good now :)
LGTM
(the music gen change has to be merged before this PR goes in, I'm working on it)
(@sgugger this one possibly did not get through your notifications, gently pinging :) )
Thanks for your PR! I agree with @sanchit-gandhi's comment on model_kwargs, but this can be merged and iterated upon if you prefer it that way @gante.
@Vermeille would you be able to retouch the tests? We can merge right after that change :)
I'm currently on vacation. What's the problem with the tests?
@Vermeille there are a few patterns in the tests that we usually avoid in our codebase (like thin wrappers). However, it's a minor issue, and this feature is being requested by the community, so I'm favoring merging it now. I understand the review process is long and somewhat tedious on your end. We err on the strict side, as we bear the cost of future maintenance. Thank you for collaborating with us, and looking forward to future contributions 🤗
Next steps: we will be communicating the feature on our end. Amplification and/or communication on your end will help bring awareness to the feature! 🔥
Thanks for your contribution @Vermeille and congrats on the PR!
ones * Add a missing comma Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Make args description more compact Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Remove extra text after making description more compact Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix linter --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772) fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor." Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com> * update `use_auth_token` -> `token` (#25083) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fix past CI after #24334 (#25113) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Move common image processing methods to BaseImageProcessor (#25089) Move out common methods * Fix ViT docstring regarding default dropout values. (#25118) Fix docstring for dropout. 
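Several of the commits above migrate the `use_auth_token` argument to `token`. As a rough, hypothetical sketch of the rename-with-backward-compatibility pattern such a migration typically follows (illustrative function name and behavior, not the actual transformers implementation):

```python
import warnings


def from_pretrained(name, token=None, use_auth_token=None):
    """Hypothetical sketch: rename `use_auth_token` to `token` while
    keeping the old keyword working, with a deprecation warning."""
    if use_auth_token is not None:
        warnings.warn(
            "`use_auth_token` is deprecated; please use `token` instead.",
            FutureWarning,
        )
        # The new argument wins when both are passed.
        if token is None:
            token = use_auth_token
    # ... would download `name` using `token` for authentication ...
    return token


# Old call sites keep working; new ones take precedence.
assert from_pretrained("gpt2", use_auth_token="abc") == "abc"
assert from_pretrained("gpt2", token="new", use_auth_token="old") == "new"
```

The `FutureWarning` gives downstream users a release cycle to update their call sites before the old keyword is removed.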
* MaskFormer - enable return_dict in order to compile (#25052) * Enable return_dict in order to compile * Update tests * Move center_crop to BaseImageProcessor (#25122) * fix deepspeed load best model at end when the model gets sharded (#25057) * fix delete all checkpoints when save_total_limit is set to 1 (#25136) * [`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131) default legacy to None * Clarify 4/8 bit loading log message (#25134) * clarify 4/8 bit loading log message * make style * 🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109) * Change defaults * Sylvain's comments * [`MptConfig`] support from pretrained args (#25116) * support from pretrained args * draft addition of tests * update test * use parent assert true * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add offload support to Bark (#25037) * initial Bark offload proposal * use hooks instead of manually offloading * add test of bark offload to cpu feature * Apply nit suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docstrings of offload Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove unnecessary set_seed in Bark tests --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * More `token` things (#25146) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add bloom flax (#25094) * First commit * step 1 working * add alibi * placeholder for `scan` * add matrix mult alibi * beta scaling factor for bmm * working v1 - simple forward pass * move layer_number from attribute to arg in call * partial functioning scan * hacky working
scan * add more modifs * add test * update scan for new kwarg order * fix position_ids problem * fix bug in attention layer * small fix - do the alibi broadcasting only once * prelim refactor * finish refactor * alibi shifting * incorporate dropout_add to attention module * make style * make padding work again * update * remove bogus file * up * get generation to work * clean code a bit * added small tests * adding alibi test * make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work * fix few nits * fix nit onnx * fix onnx nit * add missing dtype args to nn.Modules * remove debugging statements * fix scan generate * Update modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * fix small test issue + make style * clean up * Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fix function name * small fix test * forward contrib credits from PR17761 * Fix failing test * fix small typo documentation * fix non passing test - remove device from build alibi * refactor call - refactor `FlaxBloomBlockCollection` module * make style * upcast to fp32 * cleaner way to upcast * remove unused args * remove layer number * fix scan test * make style * fix i4 casting * fix slow test * Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove `layer_past` * refactor a bit * fix `scan` slow test * remove useless import * major changes - remove unused code - refactor a bit - revert import `torch` * major refactoring - change build alibi * remove scan * fix tests * make style * clean-up alibi * add integration tests * up * fix batch norm conversion * style * style * update pt-fx cross tests * update copyright * Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * per-weight check * style * line formats --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add new model in doc table of content (#25148) * Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120) * Fix .push_to_hub and cleanup get_full_repo_name usage * Do not rely on Python bool conversion magic * request changes * Add test when downloading from gated repo (#25039) * override .cuda() to check if model is already quantized (#25166) * Represent query_length in a different way to solve jit issue (#25164) Fix jit trace * make run_generation more generic for other devices (#25133) * make run_generation more generic for other devices * use Accelerate to support any device type it supports. 
* make style * fix error usage of accelerator.prepare_model * use `PartialState` to make sure everything is running on the right device --------- Co-authored-by: statelesshz <jihuazhong1@huawei.com> * added compiled model support for inference (#25124) * added compiled model support for inference * linter * Fix tests * linter * linter * remove inference mode from pipelines * Linter --------- Co-authored-by: amarkov <alexander@inworld.ai> * Update `use_auth_token` -> `token` in example scripts (#25167) * pytorch examples * tensorflow examples * flax examples --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [`Mpt`] Fix mpt slow test (#25170) fix mpt slow test * [`InstructBlip`] Fix instructblip slow test (#25171) * fix instruct blip slow test * Update tests/models/instructblip/test_modeling_instructblip.py * 🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881) * docs: ko: transformers_agents.md * docs: ko: transformers_agents.md * feat: deepl draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com> Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com> --------- Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com> Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com> * Fix beam search to sample at least 1 non eos token (#25103) (#25115) * [MusicGen] Fix integration tests (#25169) * move to device * update with cuda values * fix fp16 * more rigorous * 🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174) * Fix rescaling bug * Add tests * Update integration tests * Fix up * Update src/transformers/image_transforms.py * Update test - new possible order in list * Musicgen: CFG is manually added (#25173) * Better error message in `_prepare_output_docstrings` (#25202) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206) wrap `cuda` and `to` method correctly 
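Two of the entries above touch how `PreTrainedModel` guards `.cuda()` and `.to()` once a model has been quantized. A minimal, self-contained sketch of that method-wrapping idea (all names here are illustrative stand-ins, not the actual transformers implementation):

```python
import functools


class TinyModel:
    """Stand-in for a model; only the pieces needed for the sketch."""
    is_quantized = False

    def cuda(self):
        return "on cuda"

    def to(self, device):
        return f"on {device}"


def guard_device_moves(model):
    """Wrap `cuda` and `to` so they refuse to run on a quantized model."""
    for name in ("cuda", "to"):
        original = getattr(model, name)

        # Bind the current method via a default argument so each wrapper
        # keeps its own `original` (avoids the late-binding closure pitfall).
        @functools.wraps(original)
        def wrapper(*args, _fn=original, **kwargs):
            if model.is_quantized:
                raise ValueError(
                    "`.to`/`.cuda` is not supported for quantized models"
                )
            return _fn(*args, **kwargs)

        setattr(model, name, wrapper)
    return model


model = guard_device_moves(TinyModel())
print(model.to("cpu"))  # "on cpu" while the model is not quantized
model.is_quantized = True
# Any further model.cuda() / model.to(...) call now raises ValueError.
```

Wrapping at the instance level like this keeps the original methods intact on the class while still giving the user an explicit error instead of a silently broken device move.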
* Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [quantization.md] fix (#25190) Update quantization.md * [`pipeline`] revisit device check for pipeline (#25207) * revisit device check for pipeline * let's raise an error. * Update tiny model info. and pipeline testing (#25213) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fix docker image build failure (#25214) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193) make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * [`Pix2Struct`] Fix pix2struct cross attention (#25200) * fix pix2struct cross attention * fix torchscript slow test * [`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216) * clearer explanation on how things works under the hood. 
* Update docs/source/en/main_classes/quantization.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `load_in_4bit` in `from_pretrained` --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well * [`Detr`] Fix detr BatchNorm replacement issue (#25230) * fix detr weird issue * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * fix copies --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Move rescale dtype recasting to match torchvision ToTensor (#25229) Move dtype recasting to match torchvision ToTensor * Fix set of model parallel in the Trainer when no GPUs are available (#25239) * fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105) * add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add revision to * add pathname and line number to logging formatter in debug mode (#25203) * add pathname and lineno to logging formatter in debug mode * use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno * Add `token` argument in example scripts (#25172) * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * resolving zero3 init when using accelerate config with Trainer (#25227) * resolving zero3 init when using accelerate config with Trainer * refactor * fix * fix import * Update rescale tests - cast to float after rescaling to reflect #25229 (#25259) Rescale tests -
cast to float after rescaling to reflect #25229 * Fix some bugs for two stage training of deformable detr (#25045) * Update modeling_deformable_detr.py Fix bugs for two stage training * Update modeling_deformable_detr.py * Add test_two_stage_training to DeformableDetrModelTest --------- Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai> * [DOCS] Add example and modified docs of EtaLogitsWarper (#25125) * added example and modified docs for EtaLogitsWarper * make style * fixed styling issue on 544 * removed error info and added set_seed * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updated the results --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix return_dict_in_generate bug in InstructBlip generate function (#25246) Fix bug in InstructBlip generate function Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`). This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object. 
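The InstructBlip fix described just above boils down to branching on the return type of the inner `generate()` call. A simplified, framework-free sketch of that pattern (plain lists stand in for tensors here; the real code checks for a `torch.Tensor`, and the names below are illustrative only):

```python
from dataclasses import dataclass


@dataclass
class GenerateOutput:
    """Stand-in for a ModelOutput returned when return_dict_in_generate=True."""
    sequences: list


def postprocess(outputs, bos_token_id=2):
    """Prepend a BOS token, whether `generate()` returned raw sequences
    or an output object carrying them in `.sequences`."""
    # Real code would check isinstance(outputs, torch.Tensor) instead.
    is_raw = not hasattr(outputs, "sequences")
    sequences = outputs if is_raw else outputs.sequences
    fixed = [[bos_token_id] + list(seq) for seq in sequences]
    if is_raw:
        return fixed
    outputs.sequences = fixed
    return outputs


print(postprocess([[5, 6]]))                  # raw path -> [[2, 5, 6]]
print(postprocess(GenerateOutput([[5, 6]])))  # dict path: sequences == [[2, 5, 6]]
```

The point of the fix is exactly this branch: postprocessing must not assume `return_dict_in_generate == False`, so both return shapes are handled.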
* Remove `pytest_options={"rA": None}` in CI (#25263) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * 🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943) * doc: ko: perf_infer_gpu_many.mdx * feat: chatgpt draft * fix: manual edits * Update docs/source/ko/perf_infer_gpu_many.md Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * recommend DeepSpeed's Argument Parsing documentation (#25268) * [MMS] Fix mms (#25267) * [MMS] Fix mms * [MMS] Fix mms * fix mms loading * Apply suggestions from code review * make style * Update tests/models/wav2vec2/test_modeling_wav2vec2.py * CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266) * CI with layers=2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * CI with `pytest_num_workers=8` for torch/tf jobs (#25274) n8 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Docs: Update list of `report_to` logging integrations in docstring (#25281) * Update list of logging integrations in docstring Also update type hint * Also add 'flyte' to report_to callback list * Revert 'report_to' type hint update Due to CLI breaking * Update InstructBLIP & Align values after rescale update (#25209) * Update InstructBLIP values Note: the tests are not independent. Running the test independently produces different logits compared to running all the integration tests * Update test values after rescale update * Remove left over commented out code * Revert to previous rescaling logic * Update rescale tests * Docs: separate generate section (#25235) Separate generate doc section * Update bark doc (#25234) * add mention to optimization in Bark docs * add offload mention in docs * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update bark docs.
* Update bark.md --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add generate method to SpeechT5ForTextToSpeech (#25233) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add timeout parameter to load_image function (#25184) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add parameter to docs. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [JAX] Bump min version (#25286) * [JAX] Bump min version * make fixup * [small] llama2.md typo (#25295) `groupe` -> `grouped` * Fix typo: Roberta -> RoBERTa (#25302) * Move usage of deprecated logging.warn to logging.warning (#25310) The former spelling is deprecated and has been discouraged for a while. The latter spelling seems to be more common in this project anyway, so this change ought to be safe. 
Fixes https://github.com/huggingface/transformers/issues/25283 * Give more memory in test_disk_offload (#25315) * Generate: get generation mode as an enum (#25292) * Add offline mode for agents (#25226) * Add offline mode for agents * Disable second check too * Deal with nested configs better in base class (#25237) * Deal better with nested configs * Fixes * More fixes * Fix last test * Clean up existing configs * Remove hack in MPT Config * Update src/transformers/configuration_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix setting a nested config via dict in the kwargs * Adapt common test * Add test for nested config load with dict --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Document check copies (#25291) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes * Make `bark` could have tiny model (#25290) * temp * update * update * update * small dim * small dim * small dim * fix * update * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Document toc check and doctest check scripts (#25319) * Clean doc toc check and make doctest list better * Add to Makefile * [Whisper] Better error message for outdated generation config (#25298) * Remove jnp.DeviceArray since it is deprecated. (#24875) * Remove jnp.DeviceArray since it is deprecated. 
* Replace all instances of jnp.DeviceArray with jax.Array * Update src/transformers/models/bert/modeling_flax_bert.py --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add CFG for .generate() (#24654) * 🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978) * docs: ko: perf_infer_gpu_one * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: TaeYupNoh <107118671+TaeYupNoh@users.noreply.github.com> * fix: resolve suggestions * fix: resolve suggestions Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: TaeYupNoh <107118671+TaeYupNoh@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update TF pin in docker image (#25343) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generalize CFG to allow for positive prompts (#25339) * Generalize CFG to allow for positive prompts * Add documentation, fix the correct class * Loosen output shape restrictions on GPT-style models (#25188) * Loosen output shape restrictions on GPT-style models * Use more self-explanatory variables * Revert "Use more self-explanatory variables" This reverts commit 5fd9ab39119558b7e750f61aa4a19014dccc5ed5. * Allow `trust_remote_code` in example scripts (#25248) * pytorch examples * pytorch mim no trainer * cookiecutter * flax examples * missed line in pytorch run_glue * tensorflow examples * tensorflow run_clip * tensorflow run_mlm * tensorflow run_ner * tensorflow run_clm * pytorch example from_configs * pytorch no trainer examples * Revert "tensorflow run_clip" This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5. 
* fix: duplicated argument * Generate: remove Marian hack (#25294) Remove Marian hack * Fix more offload edge cases (#25342) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Migrate Trainer from `Repository` to `upload_folder` (#25095) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com> * Adding more information in help parser on train_file and validation_file (#25324) chore: adding new doc on train and val * [DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class (#25186) * Add Description And Example to Docstring * make style corrections * make style * Doc Style Consistent With HF * Apply make style * Modify Docstring * Edit Type in Docstring * Feedback Incorporated * Edit Docstring * make style * Post Review Changes * Review Feedback Incorporated * Styling * Formatting * make style * pep8 * Docs: Added benchmarks for `torch.compile()` for vision models (#24748) * added benchmarks for compile * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * added more models * added more models fr * added visualizations * minor fix * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added links to models and put charts side by side * Added batch comparisons * Added more comparisons * Fix table * Added link to wheel * Update perf_torch_compile.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add mask2former fp16 support (#25093) * Add mask2former fp16 support * Clear consistency/quality issues * Fix consistency/quality (2) * Add integration test for mask2former (fp16 case) * Fix code quality * Add integration test for maskformer (fp16 case) * Add integration test for oneformer (fp16 case) * Remove slow decorator from fp16 tests * Fix lint * Remove usage of full inference and value checks for fp16 * Temporarily comment slow for {mask, mask2, one}former * Add fp16 support to oneformer * Revert "Temporarily comment slow for {mask, mask2, one}former" This reverts commit e5371edabd301cf56079def0421a0a87df307cb0. 
* Remove dtype conversion noop * [DOCS] Add descriptive docstring to MinNewTokensLength (#25196) * Add descriptive docstring to MinNewTokensLength It addresses https://github.com/huggingface/transformers/issues/24783 * Refine the differences between `min_length` and `min_new_tokens` * Remove extra line * Remove extra arguments in generate * Add a missing space Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Run the linter * Add clarification comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization * Fix `test_model_parallelism` (#25359) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add warning for missing attention mask when pad tokens are detected (#25345) * Add attention mask and pad token warning to many of the models * Remove changes under examples/research_projects These files are not maintained by HF. * Skip the warning check during torch.fx or JIT tracing * Switch ordering for the warning and input shape assignment This ordering is a little cleaner for some of the cases. * Add missing line break in one of the files * [ASR Pipeline] Clarify return timestamps (#25344) * [ASR Pipeline] Clarify return timestamps * fix indentation * fix ctc check * fix ctc error message!
* fix test * fix other test * add new tests * final comment * MaskFormer, Mask2Former - replace einsum for tracing (#25297) * Replace einsum with ops for tracing * Fix comment * Load state in else (#25318) * Load else * New approach * Propagate * Fix `token` in example template (#25351) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Enable tests to run on third-party devices (#25327) * enable unit tests to run on third-party devices other than CUDA and CPU. * remove the modification that enabled ut on MPS * control test on third-party device by env variable * update --------- Co-authored-by: statelesshz <jihuazhong1@huawei.com> * 🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017) * docs: ko: add_tensorflow_model.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions * fix: manual edits * Fix `torch_job` worker(s) crashing (#25374) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generate: add config-level validation (#25381) * Fix missing usage of `token` (#25382) * add missing tokens * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Use small config for `OneFormerModelTest.test_model_with_labels` (#25383) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add copied from for image processor methods (#25121) * Add copied from statements for image processors * Move out rescale and normalize to base image processor * Remove rescale and normalize from vit (post rebase) * Update docstrings and tidy up * PR comments * change version (#25387) * [DOCS] Add example for `TopPLogitsWarper` (#25361) * [DOCS] Add example for `TopPLogitsWarper` * fix typo * address review feedback * address review nits * 🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923) * docs: ko: perf_train_cpu_many.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> ---------
* make style * fix error usage of accelerator.prepare_model * use `PartialState` to make sure everything is running on the right device --------- Co-authored-by: statelesshz <jihuazhong1@huawei.com> * added compiled model support for inference (#25124) * added compiled model support for inference * linter * Fix tests * linter * linter * remove inference mode from pipelines * Linter --------- Co-authored-by: amarkov <alexander@inworld.ai> * Update `use_auth_token` -> `token` in example scripts (#25167) * pytorch examples * tensorflow examples * flax examples --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [`Mpt`] Fix mpt slow test (#25170) fix mpt slow test * [`InstructBlip`] Fix instructblip slow test (#25171) * fix instruct blip slow test * Update tests/models/instructblip/test_modeling_instructblip.py * 🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881) * docs: ko: transformers_agents.md * docs: ko: transformers_agents.md * feat: deepl draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com> Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com> --------- Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com> Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com> * Fix beam search to sample at least 1 non eos token (#25103) (#25115) * [MusicGen] Fix integration tests (#25169) * move to device * update with cuda values * fix fp16 * more rigorous * 🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174) * Fix rescaling bug * Add tests * Update integration tests * Fix up * Update src/transformers/image_transforms.py * Update test - new possible order in list * Musicgen: CFG is manually added (#25173) * Better error message in `_prepare_output_docstrings` (#25202) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206) wrap `cuda` and `to` method correctly 
* Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [quantization.md] fix (#25190) Update quantization.md * [`pipeline`] revisit device check for pipeline (#25207) * revisit device check for pipeline * let's raise an error. * Update tiny model info. and pipeline testing (#25213) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fix docker image build failure (#25214) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193) make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * [`Pix2Struct`] Fix pix2struct cross attention (#25200) * fix pix2struct cross attention * fix torchscript slow test * [`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216) * clearer explanation on how things works under the hood. 
* Update docs/source/en/main_classes/quantization.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `load_in_4bit` in `from_pretrained` --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well * [`Detr`] Fix detr BatchNorm replacement issue (#25230) * fix detr weird issue * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * fix copies --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Move rescale dtype recasting to match torchvision ToTensor (#25229) Move dtype recasting to match torchvision ToTensor * Fix set of model parallel in the Trainer when no GPUs are available (#25239) * fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105) * add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add reivsion to * add pathname and line number to logging formatter in debug mode (#25203) * add pathname and lineno to logging formatter in debug mode * use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno * Add `token` arugment in example scripts (#25172) * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * resolving zero3 init when using accelerate config with Trainer (#25227) * resolving zero3 init when using accelerate config with Trainer * refactor * fix * fix import * Update rescale tests - cast to float after rescaling to reflect #25229 (#25259) Rescale tests - 
cast to float after rescaling to reflect #25229 * Fix some bugs for two stage training of deformable detr (#25045) * Update modeling_deformable_detr.py Fix bugs for two stage training * Update modeling_deformable_detr.py * Add test_two_stage_training to DeformableDetrModelTest --------- Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai> * [DOCS] Add example and modified docs of EtaLogitsWarper (#25125) * added example and modified docs for EtaLogitsWarper * make style * fixed styling issue on 544 * removed error info and added set_seed * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updated the results --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix return_dict_in_generate bug in InstructBlip generate function (#25246) Fix bug in InstructBlip generate function Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`). This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object. 
* Remove `pytest_options={"rA": None}` in CI (#25263) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * 🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943) * doc: ko: perf_infer_gpu_many.mdx * feat: chatgpt draft * fix: manual edits * Update docs/source/ko/perf_infer_gpu_many.md Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * recommend DeepSpeed's Argument Parsing documentation (#25268) * [MMS] Fix mms (#25267) * [MMS] Fix mms * [MMS] Fix mms * fix mms loading * Apply suggestions from code review * make style * Update tests/models/wav2vec2/test_modeling_wav2vec2.py * CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266) * CI with layers=2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * CI with `pytest_num_workers=8` for torch/tf jobs (#25274) n8 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Docs: Update list of `report_to` logging integrations in docstring (#25281) * Update list of logging integrations in docstring Also update type hint * Also add 'flyte' to report_to callback list * Revert 'report_to' type hint update Due to CLI breaking * Update InstructBLIP & Align values after rescale update (#25209) * Update InstructBLIP values Note: the tests are not independent. Running the test independentely produces different logits compared to running all the integration tests * Update test values after rescale update * Remove left over commented out code * Revert to previous rescaling logic * Update rescale tests * Docs: separate generate section (#25235) Separate generate doc section * Update bark doc (#25234) * add mention to optimization in Bark docs * add offload mention in docs * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update bark docs. 
* Update bark.md --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add generate method to SpeechT5ForTextToSpeech (#25233) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add timeout parameter to load_image function (#25184) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add parameter to docs. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [JAX] Bump min version (#25286) * [JAX] Bump min version * make fixup * [small] llama2.md typo (#25295) `groupe` -> `grouped` * Fix typo: Roberta -> RoBERTa (#25302) * Move usage of deprecated logging.warn to logging.warning (#25310) The former spelling is deprecated and has been discouraged for a while. The latter spelling seems to be more common in this project anyway, so this change ought to be safe. 
Fixes https://github.com/huggingface/transformers/issues/25283 * Give more memory in test_disk_offload (#25315) * Generate: get generation mode as an enum (#25292) * Add offline mode for agents (#25226) * Add offline mode for agents * Disable second check too * Deal with nested configs better in base class (#25237) * Deal better with nested configs * Fixes * More fixes * Fix last test * Clean up existing configs * Remove hack in MPT Config * Update src/transformers/configuration_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix setting a nested config via dict in the kwargs * Adapt common test * Add test for nested config load with dict --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Document check copies (#25291) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes * Make `bark` could have tiny model (#25290) * temp * update * update * update * small dim * small dim * small dim * fix * update * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Document toc check and doctest check scripts (#25319) * Clean doc toc check and make doctest list better * Add to Makefile * [Whisper] Better error message for outdated generation config (#25298) * Remove jnp.DeviceArray since it is deprecated. (#24875) * Remove jnp.DeviceArray since it is deprecated. 
* Replace all instances of jnp.DeviceArray with jax.Array * Update src/transformers/models/bert/modeling_flax_bert.py --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add CFG for .generate() (#24654) * 🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978) * docs: ko: perf_infer_gpu_one * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: TaeYupNoh <107118671+TaeYupNoh@users.noreply.github.com> * fix: resolve suggestions * fix: resolve suggestions Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: TaeYupNoh <107118671+TaeYupNoh@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update TF pin in docker image (#25343) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generalize CFG to allow for positive prompts (#25339) * Generalize CFG to allow for positive prompts * Add documentation, fix the correct class * Loosen output shape restrictions on GPT-style models (#25188) * Loosen output shape restrictions on GPT-style models * Use more self-explanatory variables * Revert "Use more self-explanatory variables" This reverts commit 5fd9ab39119558b7e750f61aa4a19014dccc5ed5. * Allow `trust_remote_code` in example scripts (#25248) * pytorch examples * pytorch mim no trainer * cookiecutter * flax examples * missed line in pytorch run_glue * tensorflow examples * tensorflow run_clip * tensorflow run_mlm * tensorflow run_ner * tensorflow run_clm * pytorch example from_configs * pytorch no trainer examples * Revert "tensorflow run_clip" This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5. 
* fix: duplicated argument * Generate: remove Marian hack (#25294) Remove Marian hack * Fix more offload edge cases (#25342) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Migrate Trainer from `Repository` to `upload_folder` (#25095) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com> * Adding more information in help parser on train_file and validation_file (#25324) chorse: adding new doc on train and val * [DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class (#25186) * Add Description And Example to Docstring * make style corrections * make style * Doc Style Consistent With HF * Apply make style * Modify Docstring * Edit Type in Docstring * Feedback Incorporated * Edit Docstring * make style * Post Review Changes * Review Feedback Incorporated * Styling * Formatting * make style * pep8 * Docs: Added benchmarks for `torch.compile()` for vision models (#24748) * added benchmarks for compile * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: 
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * added more models * added more models fr * added visualizations * minor fix * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added links to models and put charts side by side * Added batch comparisons * Added more comparisons * Fix table * Added link to wheel * Update perf_torch_compile.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add mask2former fp16 support (#25093) * Add mask2former fp16 support * Clear consistency/quality issues * Fix consistency/quality (2) * Add integration test for mask2former (fp16 case) * Fix code quality * Add integration test for maskformer (fp16 case) * Add integration test for oneformer (fp16 case) * Remove slow decorator from fp16 tests * Fix lint * Remove usage of full inference and value checks for fp16 * Temporarily comment slow for {mask, mask2, one}former * Add fp16 support to oneformer * Revert "Temporarily comment slow for {mask, mask2, one}former" This reverts commit e5371edabd301cf56079def0421a0a87df307cb0. 
* Remove dtype conversion noop * [DOCS] Add descriptive docstring to MinNewTokensLength (#25196) * Add descriptive docstring to MinNewTokensLength It addresses https://github.com/huggingface/transformers/issues/24783 * Refine the differences between `min_length` and `min_new_tokens` * Remove extra line * Remove extra arguments in generate * Add a missing space Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Run the linter * Add clarification comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization * Fix `test_model_parallelism` (#25359) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add warning for missing attention mask when pad tokens are detected (#25345) * Add attention mask and pad token warning to many of the models * Remove changes under examples/research_projects These files are not maintained by HG. * Skip the warning check during torch.fx or JIT tracing * Switch ordering for the warning and input shape assignment This ordering is a little cleaner for some of the cases. * Add missing line break in one of the files * [ASR Pipeline] Clarify return timestamps (#25344) * [ASR Pipeline] Clarify return timestamps * fix indentation * fix ctc check * fix ctc error message! 
* fix test * fix other test * add new tests * final comment * MaskFormer, Mask2Former - replace einsum for tracing (#25297) * Replace einsum with ops for tracing * Fix comment * Load state in else (#25318) * Load else * New approach * Propagate * Fix `token` in example template (#25351) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Enable tests to run on third-party devcies (#25327) * enable unit tests to run on third-party devcies other than CUDA and CPU. * remove the modification that enabled ut on MPS * control test on third-party device by env variable * update --------- Co-authored-by: statelesshz <jihuazhong1@huawei.com> * 🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017) * docs: ko: add_tensorflow_model.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions * fix: manual edits * Fix `torch_job` worker(s) crashing (#25374) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generate: add config-level validation (#25381) * Fix missing usage of `token` (#25382) * add missing tokens * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Use small config for `OneFormerModelTest.test_model_with_labels` (#25383) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add copied from for image processor methods (#25121) * Add copied from statements for image processors * Move out rescale and normalize to base image processor * Remove rescale and normalize from vit (post rebase) * Update docstrings and tidy up * PR comments * change version (#25387) * [DOCS] Add example for `TopPLogitsWarper` (#25361) * [DOCS] Add example for `TopPLogitsWarper` * fix typo * address review feedback * address review nits * 🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923) * docs: ko: perf_train_cpu_many.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- 
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * 16059 - Add missing type hints for ASTModel (#25364) * 16059 - Add missing type hints for ASTModel * Add an additional type hint Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * rm useless condition since the previous condition contains it. (#25403) * Fix path for dynamic module creation (#25402) * YOLOS - Revert default return_pixel_mask value (#25404) Revert default return_pixel_mask value * Docs: introduction to generation with LLMs (#25240) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Generate: length validation (#25384) * Improve training args (#25401) * enhanced tips for some training args * make style * Generate: generation config validation fixes in docs (#25405) * 16059 - Add extra type hints for AltCLIPModel (#25399) * Generate: lower severity of parameterization checks (#25407) * VQA task guide (#25244) * initial commit * semi-finished task guide draft * image link * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/visual_question_answering.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * feedback addressed * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits addressed --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * 🌐 [i18n-KO] Translated `add_new_model.md` to Korean (#24957) * docs: ko: add_new_model.md * feat: chatgpt draft * fix: manual edits * fix: change document title * fix: edit with reviewers Co-authored-by: Jungnerd 
<46880056+jungnerd@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> * fix: edit with reviewers Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * fix: add anchor to header * Update docs/source/ko/add_new_model.md Co-authored-by: 이서정 <97655267+sjlee-wise@users.noreply.github.com> * Update docs/source/ko/add_new_model.md Co-authored-by: 이서정 <97655267+sjlee-wise@users.noreply.github.com> * Update docs/source/ko/add_new_model.md Co-authored-by: 이서정 <97655267+sjlee-wise@users.noreply.github.com> * fix: edit with reviews * feat: edit toctree --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> Co-authored-by: 이서정 <97655267+sjlee-wise@users.noreply.github.com> * 🌐 [i18n-KO] Translated `model_summary.md` to Korean (#24625) * docs: ko: model_summary.md * feat: nmt and manual edit model_summary.mdx * fix: resolve suggestions Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * fix: resolve suggestions2 Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> --------- Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update Bark generation configs and tests (#25409) * update bark 
generation configs for more coherent parameter * make style * update bark hub repo * aligned sample_beam output selection with beam_search (#25375) * aligned sample_beam specs with beam_search * pull origin main * Revert "pull origin main" This reverts commit 06d356f1137bb52272e120a03636598c44449cf3. * update test_utils.py * fix format * remove comment --------- Co-authored-by: Shogo Fujita <shogo.fujita@legalontech.jp> * Enable passing number of channels when inferring data format (#25412) * Bark: flexible generation config overload (#25414) * [DINOv2] Update pooler output (#25392) Update pooler output * 🌐 [i18n-KO] Translated `philosophy.md` to Korean (#25010) * docs: ko: philosophy.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions * Doc checks (#25408) * Document check_dummies * Type hints and doc in other files * Document check inits * Add documentation to * Address review comments * Generation: strict generation config validation at save time (#25411) * strict gen config save; Add tests * add note that the warning will be an exception in v4.34 * [WavLM] Fix Arxiv link and authors (#25415) * [WavLM] Fix Arxiv link and authors * make style * Generate: Load generation config when `device_map` is passed (#25413) * Fix rendering for `torch.compile()` docs (#25432) fix rendering * Add `examples` to tests to run when `setup.py` is modified (#25437) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fix issue with ratio evaluation steps and auto find batch size (#25436) * Fully rebased solution * 500 * docs: add LLaMA-Efficient-Tuning to awesome-transformers (#25441) Co-authored-by: statelesshz <jihuazhong1@huawei.com> * GPTQ integration (#25062) * GTPQ integration * Add tests for gptq * support for more quantization model * fix style * typo * fix method * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add dataclass and fix quantization_method * fix doc * Update 
tests/quantization/gptq/test_gptq.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * modify dataclass * add gtpqconfig import * fix typo * fix tests * remove dataset as req arg * remove tokenizer import * add offload cpu quantization test * fix check dataset * modify dockerfile * protect trainer * style * test for config * add more log * overwrite torch_dtype * draft doc * modify quantization_config docstring * fix class name in docstring * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * more warning * fix 8bit kwargs tests * peft compatibility * remove var * fix is_gptq_quantized * remove is_gptq_quantized * fix wrap * Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add exllama * skip test * overwrite float16 * style * fix skip test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docsting formatting * add doc * better test --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix for #25437 (#25454) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * not debugged code * reference code so nothing is lost * novelty * added docstrings * fixed some relative import errors * fixed small bugs * added linear layers to bloom * removed impossible embedding method * Update src/transformers/models/bloom/desequence_graph_ids.py Co-au…
This commit implements CFG (classifier-free guidance) for `.generate()`.
Fixes #24536. (I did not touch MusicGen.)
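For context, CFG reshapes the next-token distribution by contrasting the logits from a conditional (prompted) forward pass with those from an unconditional (or negative-prompt) pass. A minimal sketch of that mixing rule in plain Python (illustrative only — the function name here is made up; in the PR the logic lives in `UnbatchedClassifierFreeGuidanceLogitsProcessor`, which also runs the second forward pass):

```python
def cfg_mix(cond_logits, uncond_logits, guidance_scale):
    """Classifier-free guidance: push the conditional logits away from
    the unconditional ones by a factor of guidance_scale.

    guidance_scale == 1.0 reproduces ordinary conditional sampling;
    values > 1.0 strengthen adherence to the prompt.
    """
    return [
        u + guidance_scale * (c - u)
        for c, u in zip(cond_logits, uncond_logits)
    ]

# guidance_scale = 1.0 leaves the conditional logits untouched
print(cfg_mix([2.0, -1.0], [0.5, 0.0], 1.0))  # -> [2.0, -1.0]
# guidance_scale = 1.5 exaggerates the conditional/unconditional gap
print(cfg_mix([2.0, -1.0], [0.5, 0.0], 1.5))  # -> [2.75, -1.5]
```

At the user level this surfaces as the `guidance_scale` argument to `model.generate()`; a value above 1 enables CFG and triggers the extra unconditional forward pass.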
Hope you enjoy it!
@sanchit-gandhi
@gante