Changes from all commits (1106 commits)
0863eef
[tests] remove `pt_tf` equivalence tests (#36253)
gante Feb 19, 2025
60226c6
TP initialization module-by-module (#35996)
Cyrilvallez Feb 19, 2025
fa8cdcc
[tests] deflake dither test (#36284)
gante Feb 19, 2025
99adc74
[tests] remove flax-pt equivalence and cross tests (#36283)
gante Feb 19, 2025
e3d99ec
[tests] make `test_from_pretrained_low_cpu_mem_usage_equal` less flak…
gante Feb 19, 2025
e5cea20
Add Example for Custom quantization (#36286)
MekkCyber Feb 19, 2025
78d6484
docs: Update README_zh-hans.md (#36269)
hyjbrave Feb 19, 2025
31bb662
Fix callback handler reference (#36250)
SunMarc Feb 19, 2025
5e2183f
Make cache traceable (#35873)
IlyasMoutawwakil Feb 20, 2025
e8531a0
Fix broken CI on release branch due to missing conversion files (#36…
ydshieh Feb 20, 2025
f2ab182
Ignore conversion files in test fetcher (#36251)
ydshieh Feb 20, 2025
4397dfc
SmolVLM2 (#36126)
orrzohar Feb 20, 2025
5412ff1
Fix typo in Pixtral example (#36302)
12v Feb 20, 2025
effaef3
fix: prevent second save in the end of training if last step was save…
NosimusAI Feb 20, 2025
27d1707
[smolvlm] make CI green (#36306)
gante Feb 20, 2025
e18f233
Fix default attention mask of generate in MoshiForConditionalGenerati…
cyan-channel-io Feb 20, 2025
14552cb
VLMs: even more clean-up (#36249)
zucchini-nlp Feb 21, 2025
a957b79
Add SigLIP 2 (#36323)
qubvel Feb 21, 2025
678885b
[CI] Check test if the `GenerationTesterMixin` inheritance is correct…
gante Feb 21, 2025
7c5bd24
[tests] make quanto tests device-agnostic (#36328)
faaany Feb 21, 2025
547911e
Uses Collection in transformers.image_transforms.normalize (#36301)
CalOmnie Feb 21, 2025
92c5ca9
Fix exploitable regexes in Nougat and GPTSan/GPTJNeoXJapanese (#36121)
Rocketknight1 Feb 21, 2025
4dbf17c
[tests] enable bnb tests on xpu (#36233)
faaany Feb 24, 2025
884a8ea
Improve model loading for compressed tensor models (#36152)
rahul-tuli Feb 24, 2025
977a61f
Change slack channel for mi250 CI to amd-hf-ci (#36346)
ivarflakstad Feb 24, 2025
2af272c
Add autoquant support for torchao quantizer (#35503)
jerryzh168 Feb 24, 2025
f4684a6
Update amd pytorch index to match base image (#36347)
ivarflakstad Feb 24, 2025
18276b0
fix(type): padding_side type should be Optional[str] (#36326)
shenxiangzhuang Feb 24, 2025
05dfed0
[Modeling] Reduce runtime when loading missing keys (#36312)
kylesayrs Feb 24, 2025
2ab7bdc
notify new model merged to `main` (#36375)
ydshieh Feb 24, 2025
931e5f4
Update modeling_llava_onevision.py (#36391)
yinsong1986 Feb 25, 2025
4b5cf54
Load models much faster on accelerator devices!! (#36380)
Cyrilvallez Feb 25, 2025
bc65f3f
[modular] Do not track imports in functions (#36279)
Cyrilvallez Feb 25, 2025
401543a
Fix `is_causal` fail with compile (#36374)
Cyrilvallez Feb 25, 2025
9d6abf9
enable torchao quantization on CPU (#36146)
jiqing-feng Feb 25, 2025
92abc0d
Update _get_eval_sampler to reflect Trainer.tokenizer is deprecation …
yukiman76 Feb 25, 2025
da4ab2a
Fix doc formatting in forward passes & modular (#36243)
Cyrilvallez Feb 25, 2025
3a02fe5
Added handling for length <2 of suppress_tokens for whisper (#36336)
andreystarenky Feb 25, 2025
d80d52b
addressing the issue #34611 to make FlaxDinov2 compatible with any ba…
MHRDYN7 Feb 25, 2025
b4b9da6
tests: revert change of torch_require_multi_gpu to be device agnostic…
dvrogozh Feb 25, 2025
c3700b0
[tests] enable autoawq tests on XPU (#36327)
faaany Feb 25, 2025
7c8916d
fix audio classification pipeline fp16 test on cuda (#36359)
jiqing-feng Feb 25, 2025
ca6ebcb
chore: fix function argument descriptions (#36392)
threewebcode Feb 25, 2025
fb83bef
Fix pytorch integration tests for SAM (#36397)
qubvel Feb 25, 2025
e1ce948
[CLI] add import guards (#36376)
gante Feb 25, 2025
88d1051
Fix convert_to_rgb for SAM ImageProcessor (#36369)
MSt-10 Feb 25, 2025
cbe0ea5
Security fix for `benchmark.yml` (#36402)
ydshieh Feb 25, 2025
9ebfda3
Fixed VitDet for non-squre Images (#35969)
cjfghk5697 Feb 25, 2025
41925e4
Add retry hf hub decorator (#35213)
muellerzr Feb 25, 2025
9a217fc
Deprecate transformers.agents (#36415)
aymeric-roucher Feb 26, 2025
b4965ce
Fixing the docs corresponding to the breaking change in torch 2.6. (#…
Narsil Feb 26, 2025
6513e5e
add recommendations for NPU using flash_attn (#36383)
zheliuyu Feb 26, 2025
082834d
fix: prevent model access error during Optuna hyperparameter tuning (…
emapco Feb 26, 2025
d18d9c3
Universal Speculative Decoding `CandidateGenerator` (#35029)
keyboardAnt Feb 26, 2025
981c276
Fix compressed tensors config (#36421)
MekkCyber Feb 26, 2025
1603018
Update form pretrained to make TP a first class citizen (#36335)
ArthurZucker Feb 26, 2025
a7fbab3
Fix Expected output for compressed-tensors tests (#36425)
MekkCyber Feb 26, 2025
8ede897
restrict cache allocator to non quantized model (#36428)
SunMarc Feb 26, 2025
d0727d9
Change PR to draft when it is (re)opened (#36417)
ydshieh Feb 27, 2025
a8e4fe4
Fix permission (#36443)
ydshieh Feb 27, 2025
549db24
Fix another permission (#36444)
ydshieh Feb 27, 2025
2d6cc0d
Add `contents: write` (#36445)
ydshieh Feb 27, 2025
1779255
[save_pretrained ] Skip collecting duplicated weight (#36409)
wejoncy Feb 27, 2025
8aed019
[generate] `torch.distributed`-compatible `DynamicCache` (#36373)
gante Feb 27, 2025
6a87646
Lazy import libraries in `src/transformers/image_utils.py` (#36435)
hmellor Feb 27, 2025
482d17b
Fix `hub_retry` (#36449)
ydshieh Feb 27, 2025
222505c
[GroundingDino] Fix grounding dino loss 🚨 (#31828)
EduardoPach Feb 27, 2025
02776d2
Fix loading models with mismatched sizes (#36463)
qubvel Feb 28, 2025
51083d1
[docs] fix bug in deepspeed config (#36081)
faaany Feb 28, 2025
2c5d038
Add Got-OCR 2 Fast image processor and refactor slow one (#36185)
yonigozlan Mar 1, 2025
a40f1ac
Fix couples of issues from #36335 (#36453)
SunMarc Mar 1, 2025
dcbdf7e
Fix _load_state_dict_into_meta_model with device_map=None (#36488)
hlky Mar 2, 2025
4d8259d
Fix loading zero3 weights (#36455)
muellerzr Mar 3, 2025
9e3a107
Check `TRUST_REMOTE_CODE` for `RealmRetriever` for security (#36511)
ydshieh Mar 3, 2025
3e83ee7
Fix kwargs UserWarning in SamImageProcessor (#36479)
MSt-10 Mar 3, 2025
0463901
fix torch_dtype, contiguous, and load_state_dict regression (#36512)
SunMarc Mar 3, 2025
acb8586
Fix some typos in docs (#36502)
co63oc Mar 3, 2025
28159ae
chore: fix message descriptions in arguments and comments (#36504)
threewebcode Mar 3, 2025
2aff938
Fix pipeline+peft interaction (#36480)
Rocketknight1 Mar 3, 2025
1975be4
Fix edge case for continue_final_message (#36404)
Rocketknight1 Mar 3, 2025
9fe8279
[Style] fix E721 warnings (#36474)
kashif Mar 3, 2025
6aa9888
Remove unused code (#36459)
Rocketknight1 Mar 3, 2025
c0f8d05
[docs] Redesign (#31757)
stevhliu Mar 3, 2025
84f0186
Add aya (#36521)
ArthurZucker Mar 4, 2025
3750881
chore: Fix typos in docs and examples (#36524)
co63oc Mar 4, 2025
c0c5acf
Fix bamba tests amd (#36535)
ivarflakstad Mar 4, 2025
89d27fa
Fix links in quantization doc (#36528)
MekkCyber Mar 4, 2025
66f29aa
chore: enhance messages in docstrings (#36525)
threewebcode Mar 4, 2025
752ef3f
guard torch version for uint16 (#36520)
SunMarc Mar 5, 2025
996f512
Fix typos in tests (#36547)
co63oc Mar 5, 2025
6966fa1
Fix typos . (#36551)
zhanluxianshen Mar 6, 2025
9e84b38
chore: enhance message descriptions in parameters,comments,logs and d…
threewebcode Mar 6, 2025
acc49e3
Bump transformers from 4.38.0 to 4.48.0 in /examples/research_project…
dependabot[bot] Mar 6, 2025
9e38510
Delete redundancy if case in model_utils (#36559)
zhanluxianshen Mar 6, 2025
bc30dd1
Modular Conversion --fix_and_overwrite on Windows (#36583)
hlky Mar 6, 2025
0440dbc
Integrate SwanLab for offline/online experiment tracking and local vi…
ShaohonChen Mar 6, 2025
c1b24c0
[bark] fix loading of generation config (#36587)
gante Mar 6, 2025
5275ef6
[XGLM] tag tests as slow (#36592)
gante Mar 6, 2025
159445d
fix: argument (#36558)
ariG23498 Mar 6, 2025
51ed61e
Mention UltraScale Playbook 🌌 in docs (#36589)
NouamaneTazi Mar 6, 2025
6f77597
avoid errors when the size of `input_ids` passed to `PrefixConstraine…
HiDolen Mar 7, 2025
8a16edc
Export base streamer. (#36500)
AndreasAbdi Mar 7, 2025
f2e197c
Github action for auto-assigning reviewers (#35846)
Rocketknight1 Mar 7, 2025
1b9978c
Update chat_extras.md with content correction (#36599)
krishkkk Mar 7, 2025
f2fb419
Update "who to tag" / "who can review" (#36394)
gante Mar 7, 2025
4fce7a0
Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/decisi…
dependabot[bot] Mar 7, 2025
a1cf9f3
Fixed datatype related issues in `DataCollatorForLanguageModeling` (#…
capemox Mar 7, 2025
94ae1ba
Fix check for XPU. PyTorch >= 2.6 no longer needs ipex. (#36593)
tripzero Mar 7, 2025
8585450
[`HybridCache`] disable automatic compilation (#36620)
gante Mar 10, 2025
a929c46
Fix auto-assign reviewers (#36631)
Rocketknight1 Mar 10, 2025
af9b2ea
chore: fix typos in language models (#36586)
threewebcode Mar 10, 2025
e9756cd
[docs] Serving LLMs (#36522)
stevhliu Mar 10, 2025
1c4b62b
Refactor some core stuff (#36539)
ArthurZucker Mar 11, 2025
d8663cb
Fix bugs in mllama image processing (#36156)
tjohnson31415 Mar 11, 2025
d126f35
Proper_flex (#36643)
ArthurZucker Mar 11, 2025
b1a51ea
Fix AriaForConditionalGeneration flex attn test (#36604)
ivarflakstad Mar 11, 2025
556d2c2
Remove remote code warning (#36285)
Rocketknight1 Mar 11, 2025
b80b3ec
Stop warnings from unnecessary torch.tensor() overuse (#36538)
Rocketknight1 Mar 11, 2025
ed1807b
[docs] Update docs dependency (#36635)
stevhliu Mar 11, 2025
1e4286f
Remove research projects (#36645)
Rocketknight1 Mar 11, 2025
cb384dc
Fix gguf docs (#36601)
SunMarc Mar 11, 2025
81aa9b2
fix typos in the docs directory (#36639)
threewebcode Mar 11, 2025
50d3530
Gemma3 (#36658)
RyanMullins Mar 12, 2025
89f6956
HPU support (#36424)
IlyasMoutawwakil Mar 12, 2025
2829013
fix block mask typing (#36661)
ArthurZucker Mar 12, 2025
994cad2
[CI] gemma 3 `make fix-copies` (#36664)
gante Mar 12, 2025
7652804
Fix bnb regression due to empty state dict (#36663)
SunMarc Mar 12, 2025
071a161
[core] Large/full refactor of `from_pretrained` (#36033)
Cyrilvallez Mar 12, 2025
c7eb955
Don't accidentally mutate the base_model_tp_plan (#36677)
Rocketknight1 Mar 12, 2025
0013ba6
Fix Failing GPTQ tests (#36666)
MekkCyber Mar 12, 2025
bc3253f
Remove hardcoded slow image processor class in processors supporting …
yonigozlan Mar 12, 2025
cc3a361
[quants] refactor logic for modules_to_not_convert (#36672)
SunMarc Mar 12, 2025
ea219ed
Remove differences between init and preprocess kwargs for fast image …
yonigozlan Mar 12, 2025
48292a9
Refactor siglip2 fast image processor (#36406)
yonigozlan Mar 13, 2025
79254c9
Fix rescale normalize inconsistencies in fast image processors (#36388)
yonigozlan Mar 13, 2025
c416123
[Cache] Don't initialize the cache on `meta` device (#36543)
gante Mar 13, 2025
fbb18ce
Update config.torch_dtype correctly (#36679)
SunMarc Mar 13, 2025
bc3d578
Fix slicing for 0-dim param (#36580)
SunMarc Mar 13, 2025
47cc4da
Changing the test model in Quanto kv cache (#36670)
MekkCyber Mar 13, 2025
87b30c3
fix wandb hp search unable to resume from sweep_id (#35883)
bd793fcb Mar 13, 2025
65b8e38
Upgrading torch version and cuda version in quantization docker (#36264)
MekkCyber Mar 13, 2025
1c287ae
Change Qwen2_VL image processors to have init and call accept the sam…
yonigozlan Mar 13, 2025
bb965d8
fix type annotation for ALL_ATTENTION_FUNCTIONS (#36690)
WineChord Mar 13, 2025
32c95bd
Fix dtype for params without tp_plan (#36681)
Cyrilvallez Mar 13, 2025
d845693
chore: fix typos in utils module (#36668)
threewebcode Mar 13, 2025
a3201ce
[CI] Automatic rerun of certain test failures (#36694)
gante Mar 13, 2025
2a004f9
Add loading speed test (#36671)
Cyrilvallez Mar 13, 2025
09a309d
fix: fsdp sharded state dict wont work for save_only_model knob (#36627)
kmehant Mar 13, 2025
4a60bae
Handling an exception related to HQQ quantization in modeling (#36702)
MekkCyber Mar 13, 2025
b070025
Add GGUF support to T5-Encoder (#36700)
Isotr0py Mar 13, 2025
48ef468
Final CI cleanup (#36703)
Rocketknight1 Mar 13, 2025
69bc848
Add support for fast image processors in add-new-model-like CLI (#36313)
yonigozlan Mar 13, 2025
53742b1
Gemma3 processor typo (#36710)
Kuangdd01 Mar 14, 2025
72861e1
Make the flaky list a little more general (#36704)
Rocketknight1 Mar 14, 2025
8cb522b
Cleanup the regex used for doc preprocessing (#36648)
Rocketknight1 Mar 14, 2025
3bd1a0d
[model loading] don't `gc.collect()` if only 1 shard is used (#36721)
gante Mar 14, 2025
691d1b5
Fix/best model checkpoint fix (#35885)
seanswyi Mar 14, 2025
9215cc6
Try working around the processor registration bugs (#36184)
Rocketknight1 Mar 14, 2025
42ebb6c
[tests] Parameterized `test_eager_matches_sdpa_inference` (#36650)
gante Mar 14, 2025
25992b4
🌐 [i18n-KO] Translated codegen.md to Korean (#36698)
maximizemaxwell Mar 14, 2025
2c2495c
Fix post_init() code duplication (#36727)
Cyrilvallez Mar 14, 2025
6f3e0b6
Fix grad accum arbitrary value (#36691)
IlyasMoutawwakil Mar 14, 2025
f263e88
Update self-push-caller.yml
glegendre01 Mar 15, 2025
fc8764c
[Generation, Gemma 3] When passing a custom `generation_config`, over…
gante Mar 15, 2025
c53d53d
🚨🚨🚨 Fix sdpa in SAM and refactor relative position embeddings (#36422)
geetu040 Mar 17, 2025
9e94801
enable/disable compile for quants methods (#36519)
SunMarc Mar 17, 2025
2256875
fix can_generate (#36570)
jiqing-feng Mar 17, 2025
da7d64f
Allow ray datasets to be used with trainer (#36699)
FredrikNoren Mar 17, 2025
27361bd
fix xpu tests (#36656)
jiqing-feng Mar 17, 2025
8e67230
Fix test isolation for clear_import_cache utility (#36345)
sambhavnoobcoder Mar 17, 2025
c8a2b25
Fix `TrainingArguments.torch_empty_cache_steps` post_init check (#36734)
pkuderov Mar 17, 2025
e3af4fe
[MINOR:TYPO] Update hubert.md (#36733)
cakiki Mar 17, 2025
cff4caa
[CI] remove redundant checks in `test_eager_matches_sdpa_inference` (…
gante Mar 17, 2025
ac1a1b6
[docs] Update README (#36265)
stevhliu Mar 17, 2025
cbfb8d7
doc: Clarify `is_decoder` usage in PretrainedConfig documentation (#3…
d-kleine Mar 17, 2025
7f5077e
fix typos in the tests directory (#36717)
threewebcode Mar 17, 2025
19b9d8a
chore: fix typos in tests directory (#36785)
threewebcode Mar 18, 2025
7426d02
Fixing typo in gemma3 image_processor_fast and adding a small test (#…
Zebz13 Mar 18, 2025
bd92073
Fix gemma3_text tokenizer in mapping (#36793)
LysandreJik Mar 18, 2025
e959530
Add Mistral3 (#36790)
Cyrilvallez Mar 18, 2025
3017536
fix hqq due to recent modeling changes (#36771)
SunMarc Mar 18, 2025
7baf000
Update SHA for `tj-actions/changed-files` (#36795)
ydshieh Mar 18, 2025
db1d4c5
Loading optimizations (#36742)
Cyrilvallez Mar 18, 2025
30580f0
Fix Mistral3 tests (#36797)
yonigozlan Mar 18, 2025
14b597f
Fix casting dtype for qunatization (#36799)
SunMarc Mar 18, 2025
00915d3
Fix chameleon's TypeError because inputs_embeds may None (#36673)
YenFuLin Mar 18, 2025
12f2ebe
Support custom dosctrings in modular (#36726)
yonigozlan Mar 18, 2025
179d02f
[generate] ✨ vectorized beam search ✨ (#35802)
gante Mar 18, 2025
706703b
Expectations test utils (#36569)
ivarflakstad Mar 18, 2025
4fa91b1
fix "Cannot copy out of meta tensor; no data!" issue for BartForCondi…
yao-matrix Mar 19, 2025
b9374a0
Remove `dist": "loadfile"` for `pytest` in CircleCI jobs (#36811)
ydshieh Mar 19, 2025
a861db0
Fix Device map for bitsandbytes tests (#36800)
MekkCyber Mar 19, 2025
0fe0bae
[Generation] remove leftover code from end-to-end compilation (#36685)
gante Mar 19, 2025
fef8b7f
Add attention visualization tool (#36630)
ArthurZucker Mar 19, 2025
e8d9603
Add option for ao base configs (#36526)
drisspg Mar 19, 2025
b11050d
enable OffloadedCache on XPU from PyTorch 2.7 (#36654)
yao-matrix Mar 19, 2025
7c23398
[gemma 3] multimodal checkpoints + AutoModelForCausalLM (#36741)
gante Mar 19, 2025
63c3116
One more fix for reviewer assignment (#36829)
Rocketknight1 Mar 19, 2025
f39f496
Support tracable dynamicKVcache (#36311)
tugsbayasgalan Mar 19, 2025
258dd9c
Add Space to Bitsandbytes doc (#36834)
MekkCyber Mar 19, 2025
107fedc
quick fix fast_image_processor register error (#36716)
JJJYmmm Mar 19, 2025
51bd0ce
Update configuration_qwen2.py (#36735)
michaelfeil Mar 19, 2025
9be4728
Just import torch AdamW instead (#36177)
Rocketknight1 Mar 19, 2025
b815fae
Move the warning to the documentation for DataCollatorWithFlattening …
qgallouedec Mar 20, 2025
8733297
Fix swanlab global step (#36728)
Zeyi-Lin Mar 20, 2025
9455543
Disable inductor config setter by default (#36608)
HDCharles Mar 20, 2025
8f64b17
[ForCausalLMLoss] allow users to pass shifted labels (#36607)
stas00 Mar 20, 2025
3f03c37
fix tiktoken convert to pass AddedToken to Tokenizer (#36566)
itazap Mar 20, 2025
8b479e3
Saving `Trainer.collator.tokenizer` in when `Trainer.processing_class…
Mar 20, 2025
e7337ee
Pass num_items_in_batch directly to loss computation (#36753)
eljandoubi Mar 20, 2025
1ddb649
Fix fp16 ONNX export for RT-DETR and RT-DETRv2 (#36460)
qubvel Mar 20, 2025
f0d5b2f
Update deprecated Jax calls (#35919)
rasmi Mar 20, 2025
957b05b
[qwen2 audio] remove redundant code and update docs (#36282)
gante Mar 20, 2025
63380b7
Pass state dict (#35234)
phos-phophy Mar 20, 2025
8e97b44
[modular] Sort modular skips (#36304)
gante Mar 20, 2025
b47d9b2
[generate] clarify docstrings: when to inherit `GenerationMixin` (#36…
gante Mar 20, 2025
388e665
Update min safetensors bis (#36823)
SunMarc Mar 20, 2025
cf8091c
Fix import for torch 2.0, 2.1 - guard typehint for "device_mesh" (#3…
qubvel Mar 20, 2025
8124a23
Gemma 3: Adding explicit GenerationConfig and refactoring conversion …
RyanMullins Mar 20, 2025
a63e92e
Fix: remove the redundant snippet of _whole_word_mask (#36759)
HuangBugWei Mar 20, 2025
487dab1
Shieldgemma2 (#36678)
RyanMullins Mar 20, 2025
055afdb
Fix ONNX export for sequence classification head (#36332)
echarlaix Mar 20, 2025
3e8f0fb
Fix hqq skipped modules and dynamic quant (#36821)
mobicham Mar 20, 2025
ce091b1
Use pyupgrade --py39-plus to improve code (#36843)
cyyever Mar 20, 2025
1a37479
Support loading Quark quantized models in Transformers (#36372)
fxmarty-amd Mar 20, 2025
730d2a5
DeepSpeed tensor parallel+ZeRO (#36825)
inkcherry Mar 20, 2025
6629177
Refactor Attention implementation for ViT-based models (#36545)
qubvel Mar 20, 2025
6515c25
Add Prompt Depth Anything Model (#35401)
haotongl Mar 20, 2025
1d3f35f
Add model visual debugger (#36798)
molbap Mar 20, 2025
068b663
[torchao] revert to get_apply_tensor_subclass (#36849)
SunMarc Mar 20, 2025
42c489f
Gemma3: fix test (#36820)
zucchini-nlp Mar 20, 2025
ecd60d0
[CI] fix update metadata job (#36850)
gante Mar 20, 2025
9e771bf
Add support for seed in `DataCollatorForLanguageModeling` (#36497)
capemox Mar 20, 2025
6a26279
Refactor Aya Vision with modular (#36688)
yonigozlan Mar 20, 2025
97d2f9d
Mllama: raise better error (#35934)
zucchini-nlp Mar 21, 2025
949cca4
[CI] doc builder without custom image (#36862)
gante Mar 21, 2025
6bb8565
FIX FSDP plugin update for QLoRA (#36720)
BenjaminBossan Mar 21, 2025
0adbc87
Remove call to `.item` in `get_batch_samples` (#36861)
regisss Mar 21, 2025
26c8349
chore: fix typos in the tests directory (#36813)
threewebcode Mar 21, 2025
62116c9
Make ViTPooler configurable (#36517)
sebbaur Mar 21, 2025
f19d018
Revert "Update deprecated Jax calls (#35919)" (#36880)
ArthurZucker Mar 21, 2025
94f4876
[generate] model defaults being inherited only happens for newer mode…
gante Mar 21, 2025
6321876
add eustlb as an actor
ArthurZucker Mar 21, 2025
b8aadc3
:red_circle: :red_circle: :red_circle: supersede paligemma forward to…
molbap Mar 21, 2025
2638d54
Gemma 3 tests expect greedy decoding (#36882)
molbap Mar 21, 2025
f94b0c5
Use `deformable_detr` kernel from the Hub (#36853)
danieldk Mar 21, 2025
3f9ff19
Minor Gemma 3 fixes (#36884)
molbap Mar 21, 2025
523f6e7
Fix: dtype cannot be str (#36262)
zucchini-nlp Mar 21, 2025
26fbd69
v 4.50.0
ArthurZucker Mar 20, 2025
0b057e6
fix import issue
ArthurZucker Mar 21, 2025
The diff you're trying to view is too large. We only load the first 3000 changed files.
19 changes: 13 additions & 6 deletions .circleci/config.yml
@@ -13,6 +13,7 @@ jobs:
check_circleci_user:
docker:
- image: python:3.10-slim
resource_class: small
parallelism: 1
steps:
- run: echo $CIRCLE_PROJECT_USERNAME
@@ -30,6 +31,14 @@ jobs:
parallelism: 1
steps:
- checkout
- run: if [[ "$CIRCLE_PULL_REQUEST" == "" && "$CIRCLE_BRANCH" != "main" && "$CIRCLE_BRANCH" != *-release ]]; then echo "Not a PR, not the main branch and not a release branch, skip test!"; circleci-agent step halt; fi
- run: 'curl -L -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" https://api.github.com/repos/$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME/pulls/${CIRCLE_PULL_REQUEST##*/} >> github.txt'
- run: cat github.txt
- run: (python3 -c 'import json; from datetime import datetime; fp = open("github.txt"); data = json.load(fp); fp.close(); f = "%Y-%m-%dT%H:%M:%SZ"; created = datetime.strptime(data["created_at"], f); updated = datetime.strptime(data["updated_at"], f); s = (updated - created).total_seconds(); print(int(s))' || true) > elapsed.txt
- run: if [ "$(cat elapsed.txt)" == "" ]; then echo 60 > elapsed.txt; fi
- run: cat elapsed.txt
- run: if [ "$(cat elapsed.txt)" -lt "30" ]; then echo "PR is just opened, wait some actions from GitHub"; sleep 30; fi
- run: 'if grep -q "\"draft\": true," github.txt; then echo "draft mode, skip test!"; circleci-agent step halt; fi'
- run: uv pip install -U -e .
- run: echo 'export "GIT_COMMIT_MESSAGE=$(git show -s --format=%s)"' >> "$BASH_ENV" && source "$BASH_ENV"
- run: mkdir -p test_preparation
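The inline `python3 -c` one-liner in the step above is dense; unrolled, its logic is: parse the GitHub PR payload, compute the seconds between `created_at` and `updated_at`, and fall back to 60 when the payload cannot be parsed. A readable sketch (the helper name `pr_elapsed_seconds` is hypothetical, not part of the repo):

```python
import json
from datetime import datetime

# Readable sketch of the inline one-liner above: parse the GitHub PR payload
# and return how many seconds elapsed between "created_at" and "updated_at",
# defaulting to 60 when parsing fails, mirroring the `|| true` / empty-file
# fallback in the subsequent step.
def pr_elapsed_seconds(payload: str, default: int = 60) -> int:
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    try:
        data = json.loads(payload)
        created = datetime.strptime(data["created_at"], fmt)
        updated = datetime.strptime(data["updated_at"], fmt)
        return int((updated - created).total_seconds())
    except (ValueError, KeyError, TypeError):
        return default

sample = '{"created_at": "2025-03-20T10:00:00Z", "updated_at": "2025-03-20T10:01:40Z"}'
print(pr_elapsed_seconds(sample))      # 100
print(pr_elapsed_seconds("not json"))  # 60 (fallback)
```

When the result is under 30 seconds, the step sleeps to give GitHub time to mark a freshly opened PR as a draft before the `grep -q "\"draft\": true,"` check runs.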
@@ -57,15 +66,15 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py

# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
# We used:

# https://circleci.com/docs/api/v2/index.html#operation/getJobArtifacts : to get the job artifacts
# We could not pass a nested dict, which is why we create the test_file_... parameters for every single job

- store_artifacts:
path: test_preparation/transformed_artifacts.json
- store_artifacts:
@@ -109,7 +118,7 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py

# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
@@ -170,7 +179,6 @@ jobs:
path: ~/transformers/installed.txt
- run: python utils/check_copies.py
- run: python utils/check_modular_conversion.py
- run: python utils/check_table.py
- run: python utils/check_dummies.py
- run: python utils/check_repo.py
- run: python utils/check_inits.py
@@ -180,7 +188,6 @@ jobs:
- run: make deps_table_check_updated
- run: python utils/update_metadata.py --check-only
- run: python utils/check_docstrings.py
- run: python utils/check_support_list.py

workflows:
version: 2
108 changes: 66 additions & 42 deletions .circleci/create_circleci_config.py
@@ -28,21 +28,52 @@
"TRANSFORMERS_IS_CI": True,
"PYTEST_TIMEOUT": 120,
"RUN_PIPELINE_TESTS": False,
"RUN_PT_TF_CROSS_TESTS": False,
"RUN_PT_FLAX_CROSS_TESTS": False,
}
# Disable the use of {"s": None} as the output is way too long, making navigation on CircleCI impractical
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "dist": "loadfile", "vvv": None, "rsf":None}
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "vvv": None, "rsfE":None}
DEFAULT_DOCKER_IMAGE = [{"image": "cimg/python:3.8.12"}]

# Strings that commonly appear in the output of flaky tests when they fail. These are used with `pytest-rerunfailures`
# to rerun the tests that match these patterns.
FLAKY_TEST_FAILURE_PATTERNS = [
"OSError", # Machine/connection transient error
"Timeout", # Machine/connection transient error
"ConnectionError", # Connection transient error
"FileNotFoundError", # Raised by `datasets` on Hub failures
"PIL.UnidentifiedImageError", # Raised by `PIL.Image.open` on connection issues
"HTTPError", # Also catches HfHubHTTPError
"AssertionError: Tensor-likes are not close!", # `torch.testing.assert_close`, we might have unlucky random values
# TODO: error downloading tokenizer's `merged.txt` from hub can cause all the exceptions below. Throw and handle
# them under a single message.
"TypeError: expected str, bytes or os.PathLike object, not NoneType",
"TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType",
"Converting from Tiktoken failed",
"KeyError: <class ",
"TypeError: not a string",
]
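These substrings are OR-ed into a single regex and handed to `pytest-rerunfailures` via `--only-rerun` when `repeat_on_failure_flags` is assembled in `to_dict`. A minimal, self-contained sketch of that joining step, with a shortened pattern list for brevity:

```python
import re

# Shortened stand-in for the full FLAKY_TEST_FAILURE_PATTERNS list above.
patterns = ["OSError", "Timeout", "ConnectionError"]

# The patterns are OR-ed into one alternation and quoted for pytest:
# any failure whose message matches gets rerun.
joined = "|".join(patterns)
rerun_flags = f"--reruns 5 --reruns-delay 2 --only-rerun '({joined})'"
print(rerun_flags)

# The same alternation can be checked locally against a failure message:
assert re.search(joined, "ConnectionError: HTTPSConnectionPool timed out")
```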


class EmptyJob:
job_name = "empty"

def to_dict(self):
steps = [{"run": 'ls -la'}]
if self.job_name == "collection_job":
steps.extend(
[
"checkout",
{"run": "pip install requests || true"},
{"run": """while [[ $(curl --location --request GET "https://circleci.com/api/v2/workflow/$CIRCLE_WORKFLOW_ID/job" --header "Circle-Token: $CCI_TOKEN"| jq -r '.items[]|select(.name != "collection_job")|.status' | grep -c "running") -gt 0 ]]; do sleep 5; done || true"""},
{"run": 'python utils/process_circleci_workflow_test_reports.py --workflow_id $CIRCLE_WORKFLOW_ID || true'},
{"store_artifacts": {"path": "outputs"}},
{"run": 'echo "All required jobs have now completed"'},
]
)

return {
"docker": copy.deepcopy(DEFAULT_DOCKER_IMAGE),
"steps":["checkout"],
"resource_class": "small",
"steps": steps,
}
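The shell loop in the `collection_job` steps polls the CircleCI v2 workflow API until no job other than the collection job itself reports `running`. A pure-Python sketch of that filter (the helper name is an assumption; the `name` and `status` fields follow the jq expression in the step above):

```python
# Sketch of the filter inside the collection job's polling loop: list every
# job in a workflow snapshot that is still "running", excluding the
# collection job itself. While this list is non-empty, the loop sleeps
# and polls again.
def running_jobs(items, self_name="collection_job"):
    return [job["name"] for job in items
            if job["name"] != self_name and job["status"] == "running"]

snapshot = [
    {"name": "collection_job", "status": "running"},
    {"name": "torch", "status": "success"},
    {"name": "generate", "status": "running"},
]
print(running_jobs(snapshot))  # ['generate'] -> keep sleeping and re-polling
```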


@@ -54,9 +85,9 @@ class CircleCIJob:
install_steps: List[str] = None
marker: Optional[str] = None
parallelism: Optional[int] = 0
pytest_num_workers: int = 12
pytest_num_workers: int = 8
pytest_options: Dict[str, Any] = None
resource_class: Optional[str] = "2xlarge"
resource_class: Optional[str] = "xlarge"
tests_to_run: Optional[List[str]] = None
num_test_files_per_worker: Optional[int] = 10
# This should be only used for doctest job!
@@ -112,7 +143,9 @@ def to_dict(self):
# Examples special case: we need to download NLTK files in advance to avoid concurrency issues
timeout_cmd = f"timeout {self.command_timeout} " if self.command_timeout else ""
marker_cmd = f"-m '{self.marker}'" if self.marker is not None else ""
additional_flags = f" -p no:warning -o junit_family=xunit1 --junitxml=test-results/junit.xml"
junit_flags = f" -p no:warning -o junit_family=xunit1 --junitxml=test-results/junit.xml"
joined_flaky_patterns = "|".join(FLAKY_TEST_FAILURE_PATTERNS)
repeat_on_failure_flags = f"--reruns 5 --reruns-delay 2 --only-rerun '({joined_flaky_patterns})'"
parallel = f' << pipeline.parameters.{self.job_name}_parallelism >> '
steps = [
"checkout",
@@ -133,14 +166,14 @@
"command": """dpkg-query --show --showformat='${Installed-Size}\t${Package}\n' | sort -rh | head -25 | sort -h | awk '{ package=$2; sub(".*/", "", package); printf("%.5f GB %s\n", $1/1024/1024, package)}' || true"""}
},
{"run": {"name": "Create `test-results` directory", "command": "mkdir test-results"}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>>' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>> --header "Circle-Token: $CIRCLE_TOKEN"' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Split tests across parallel nodes: show current parallel tests",
"command": f"TESTS=$(circleci tests split --split-by=timings {self.job_name}_test_list.txt) && echo $TESTS > splitted_tests.txt && echo $TESTS | tr ' ' '\n'" if self.parallelism else f"awk '{{printf \"%s \", $0}}' {self.job_name}_test_list.txt > splitted_tests.txt"
}
},
{"run": {
"name": "Run tests",
"command": f"({timeout_cmd} python3 -m pytest {marker_cmd} -n {self.pytest_num_workers} {additional_flags} {' '.join(pytest_flags)} $(cat splitted_tests.txt) | tee tests_output.txt)"}
"command": f"({timeout_cmd} python3 -m pytest {marker_cmd} -n {self.pytest_num_workers} {junit_flags} {repeat_on_failure_flags} {' '.join(pytest_flags)} $(cat splitted_tests.txt) | tee tests_output.txt)"}
},
{"run": {"name": "Expand to show skipped tests", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --skip"}},
{"run": {"name": "Failed tests: show reasons", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --fail"}},
@@ -163,66 +196,45 @@ def job_name(self):


# JOBS
torch_and_tf_job = CircleCIJob(
"torch_and_tf",
docker_image=[{"image":"huggingface/transformers-torch-tf-light"}],
additional_env={"RUN_PT_TF_CROSS_TESTS": True},
marker="is_pt_tf_cross_test",
pytest_options={"rA": None, "durations": 0},
)


torch_and_flax_job = CircleCIJob(
"torch_and_flax",
additional_env={"RUN_PT_FLAX_CROSS_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-jax-light"}],
marker="is_pt_flax_cross_test",
pytest_options={"rA": None, "durations": 0},
)

torch_job = CircleCIJob(
"torch",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="not generate",
parallelism=6,
pytest_num_workers=8
)

generate_job = CircleCIJob(
"generate",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="generate",
parallelism=6,
pytest_num_workers=8
)

tokenization_job = CircleCIJob(
"tokenization",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=16
)

processor_job = CircleCIJob(
"processors",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=6
)

tf_job = CircleCIJob(
"tf",
docker_image=[{"image":"huggingface/transformers-tf-light"}],
parallelism=6,
pytest_num_workers=16,
)


flax_job = CircleCIJob(
"flax",
docker_image=[{"image":"huggingface/transformers-jax-light"}],
parallelism=6,
pytest_num_workers=16
pytest_num_workers=16,
resource_class="2xlarge",
)


@@ -231,7 +243,7 @@ def job_name(self):
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)


@@ -240,7 +252,7 @@ def job_name(self):
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-tf-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)


@@ -257,15 +269,15 @@ def job_name(self):
docker_image=[{"image":"huggingface/transformers-examples-torch"}],
# TODO @ArthurZucker remove this once docker is easier to build
install_steps=["uv venv && uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
pytest_num_workers=8,
pytest_num_workers=4,
)


examples_tensorflow_job = CircleCIJob(
"examples_tensorflow",
additional_env={"OMP_NUM_THREADS": 8},
docker_image=[{"image":"huggingface/transformers-examples-tf"}],
pytest_num_workers=16,
pytest_num_workers=2,
)


@@ -280,6 +292,7 @@ def job_name(self):
],
marker="is_staging_test",
pytest_num_workers=2,
resource_class="medium",
)


@@ -292,13 +305,13 @@ def job_name(self):
],
pytest_options={"k onnx": None},
pytest_num_workers=1,
resource_class="small",
)


exotic_models_job = CircleCIJob(
"exotic_models",
docker_image=[{"image":"huggingface/transformers-exotic-models"}],
parallelism=4,
pytest_options={"durations": 100},
)
@@ -317,7 +330,6 @@ def job_name(self):
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="not generate",
parallelism=6,
)


@@ -345,13 +357,14 @@ def job_name(self):
pytest_num_workers=1,
)

REGULAR_TESTS = [torch_job, tf_job, flax_job, hub_job, onnx_job, tokenization_job, processor_job, generate_job, non_model_job] # fmt: skip
EXAMPLES_TESTS = [examples_torch_job, examples_tensorflow_job]
PIPELINE_TESTS = [pipelines_torch_job, pipelines_tf_job]
REPO_UTIL_TESTS = [repo_utils_job]
DOC_TESTS = [doc_test_job]
ALL_TESTS = REGULAR_TESTS + EXAMPLES_TESTS + PIPELINE_TESTS + REPO_UTIL_TESTS + DOC_TESTS + [custom_tokenizers_job] + [exotic_models_job] # fmt: skip
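These job lists are what `create_circleci_config` below turns into per-job pipeline parameters: a `<name>_test_list` string and a `<name>_parallelism` integer for each job. A minimal, self-contained sketch of that derivation, using a hypothetical `Job` stand-in for `CircleCIJob` (only the `job_name` attribute matters for this step):

```python
# Hypothetical stand-in for CircleCIJob; the real class lives earlier in this script.
class Job:
    def __init__(self, job_name):
        self.job_name = job_name

jobs = [Job("generate"), Job("tokenization")]

# Mirrors the dict comprehensions in create_circleci_config: each job
# contributes a string parameter for its test list and an integer
# parameter for its parallelism, all merged into one parameters dict.
parameters = {
    **{j.job_name + "_test_list": {"type": "string", "default": ""} for j in jobs},
    **{j.job_name + "_parallelism": {"type": "integer", "default": 1} for j in jobs},
}
```

With two jobs this yields four parameters, which CircleCI exposes as pipeline inputs that downstream workflows can override per run.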


def create_circleci_config(folder=None):
if folder is None:
folder = os.getcwd()
@@ -361,7 +374,13 @@ def create_circleci_config(folder=None):

if len(jobs) == 0:
jobs = [EmptyJob()]
else:
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
# Add a job waiting all the test jobs and aggregate their test summary files at the end
collection_job = EmptyJob()
collection_job.job_name = "collection_job"
jobs = [collection_job] + jobs

config = {
"version": "2.1",
"parameters": {
@@ -371,9 +390,14 @@ def job_name(self):
**{j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs},
**{j.job_name + "_parallelism":{"type":"integer", "default":1} for j in jobs},
},
"jobs": {j.job_name: j.to_dict() for j in jobs}
}
if "CIRCLE_TOKEN" in os.environ:
# For private forked repo. (e.g. new model addition)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [{j.job_name: {"context": ["TRANSFORMERS_CONTEXT"]}} for j in jobs]}}
else:
# For public repo. (e.g. `transformers`)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
with open(os.path.join(folder, "generated_config.yml"), "w") as f:
f.write(yaml.dump(config, sort_keys=False, default_flow_style=False).replace("' << pipeline", " << pipeline").replace(">> '", " >>"))
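The `.replace(...)` calls on the dumped YAML exist because `yaml.dump` single-quotes scalar values that begin or end with a space, which would turn CircleCI's `<< pipeline.parameters.x >>` substitutions into literal strings. A small sketch of the unquoting step (the `dumped` line below is an assumed shape of PyYAML's output, not captured from a real run):

```python
# A line as yaml.dump would emit it: the value starts and ends with a
# space, so PyYAML wraps the scalar in single quotes.
dumped = "parallelism: ' << pipeline.parameters.generate_parallelism >> '\n"

# The same two replacements used above strip the quotes, so CircleCI
# evaluates the parameter instead of receiving a quoted literal.
cleaned = dumped.replace("' << pipeline", " << pipeline").replace(">> '", " >>")
```

After cleaning, no quote characters remain and the `<< … >>` expression is exposed directly to CircleCI's parameter substitution.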

7 changes: 4 additions & 3 deletions .github/ISSUE_TEMPLATE/bug-report.yml
@@ -38,12 +38,12 @@ body:

- text models: @ArthurZucker
- vision models: @amyeroberts, @qubvel
- speech models: @eustlb
- graph models: @clefourrier

Library:

- flax: @gante and @Rocketknight1
- generate: @zucchini-nlp (visual-language models) or @gante (all others)
- pipelines: @Rocketknight1
- tensorflow: @gante and @Rocketknight1
@@ -72,7 +72,7 @@ body:

Maintained examples (not research project or legacy):

- Flax: @Rocketknight1
- PyTorch: See Models above and tag the person corresponding to the modality of the example.
- TensorFlow: @Rocketknight1

@@ -106,6 +106,7 @@ body:
label: Reproduction
description: |
Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
Please include relevant config information with your code, for example your Trainers, TRL, Peft, and DeepSpeed configs.
If you have code snippets, error messages, stack traces please provide them here as well.
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Do not use screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.