huggingface / transformers Public

Pull requests: huggingface/transformers

115 Open 12,250 Closed

Pull requests list

Add Qwen2VLImageProcessorFast into Qwen2VLProcessor

#35987 opened Jan 31, 2025 by yeliudev

1 of 5 tasks

Display warning for unknown quants config instead of an error

#35963 opened Jan 29, 2025 by SunMarc

Chat template: update for processor

#35953 opened Jan 29, 2025 by zucchini-nlp

Add support for partial rotary embeddings in Phi3 model

#35947 opened Jan 28, 2025 by garg-amit

1 of 5 tasks

layernorm_decay_fix

#35927 opened Jan 28, 2025 by Ryoo72

3 of 5 tasks

Fix usage of unpad_input function

#35925 opened Jan 28, 2025 by pavelgein

1 of 5 tasks

Fix how we compute the final non-padding token for ForSequenceClassification models

#35911 opened Jan 27, 2025 by Rocketknight1

Fix Gradient Checkpointing for Deberta & Deberta-V2 using PEFT / Adapters

#35898 opened Jan 26, 2025 by lenglaender

1 of 5 tasks

Fix XGLM loss computation (PyTorch and TensorFlow)

#35878 opened Jan 24, 2025 by damianoamatruda

Fix model kwargs

#35875 opened Jan 24, 2025 by muellerzr

5 tasks

Make cache traceable

#35873 opened Jan 24, 2025 by IlyasMoutawwakil

5 tasks

[docs] fix bugs in the bitsandbytes documentation

#35868 opened Jan 24, 2025 by faaany

[docs] no hard coding cuda as bnb has multi-backend support

#35867 opened Jan 24, 2025 by faaany

Fix device mismatch error in Whisper model during feature extraction

#35866 opened Jan 24, 2025 by thedebugger

Update doc re list of models supporting TP

#35864 opened Jan 23, 2025 by kwen2501

1 task done

Fix PaliGemma Pad Token Masking During Training #35855

#35859 opened Jan 23, 2025 by sambhavnoobcoder

Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks (labels: Multimodal, optimization)

#35837 opened Jan 22, 2025 by li-plus

1 of 5 tasks

Remove cache migration script

#35810 opened Jan 21, 2025 by Wauplin

Remove head mask in generative models

#35786 opened Jan 20, 2025 by zucchini-nlp

VLM: enable skipped tests

#35746 opened Jan 17, 2025 by zucchini-nlp

Fix multi gpu loss sync condition, add doc and test

#35743 opened Jan 17, 2025 by techkang

2 of 5 tasks

Make output_dir Optional in TrainingArguments #27866

#35735 opened Jan 16, 2025 by sambhavnoobcoder

VLM: compile compatibility

#35724 opened Jan 16, 2025 by zucchini-nlp

tests: revert change of torch_require_multi_gpu to be device agnostic

#35721 opened Jan 16, 2025 by dvrogozh

Added support for GraniteForSequenceClassification

#35720 opened Jan 15, 2025 by berserkr
