ValueError: OLMoForCausalLM does not support Flash Attention 2.0 yet #29145 #1

Merged
merged 3,084 commits into from
Feb 20, 2024
Changes from 1 commit
d6ffe74
Add qwen2 (#28436)
JustinLin610 Jan 17, 2024
2c1eebc
Fix SDPA tests (#28552)
fxmarty Jan 17, 2024
fa6d12f
Allow to train dinov2 with different dtypes like bf16 (#28504)
StarCycle Jan 17, 2024
98dda8e
Fix Switch Transformers When sparse_step = 1 (#28564)
agemagician Jan 17, 2024
3005f96
Save `Processor` (#27761)
ydshieh Jan 18, 2024
a1668cc
Use `weights_only` only if torch >= 1.13 (#28506)
ydshieh Jan 18, 2024
8189977
[`Core Tokenization`] Support a fix for spm fast models (#26678)
ArthurZucker Jan 18, 2024
5d8eb93
chore: Fix multiple typos (#28574)
hugo-syn Jan 18, 2024
d2cdefb
Add new meta w2v2-conformer BERT-like model (#28165)
ylacombe Jan 18, 2024
0754217
Use `LoggingLevel` context manager in 3 tests (#28575)
ydshieh Jan 18, 2024
c662c78
Fix the documentation checkpoint for xlm-roberta-xl (#28567)
jeremyfowers Jan 18, 2024
0eaa5ea
[ASR Pipe] Update init to set model type and subsequently call parent…
sanchit-gandhi Jan 18, 2024
619ecfe
[Whisper Tok] Move token ids to CPU when computing offsets (#28485)
sanchit-gandhi Jan 18, 2024
186aa6b
[Whisper] Fix audio classification with weighted layer sum (#28563)
sanchit-gandhi Jan 18, 2024
772307b
Making CTC training example more general (#28582)
ylacombe Jan 18, 2024
db9a7e9
Don't save `processor_config.json` if a processor has no extra attrib…
ydshieh Jan 19, 2024
b2748a6
v4.38.dev.0
amyeroberts Jan 19, 2024
268fc1f
Add w2v2bert to pipeline (#28585)
ylacombe Jan 19, 2024
d4fc1eb
feat: Sequential beam search (#26304)
Saibo-creator Jan 19, 2024
690fe73
[Whisper] Finalize batched SOTA long-form generation (#27658)
patrickvonplaten Jan 19, 2024
8db6436
Fix wrong xpu device in DistributedType.MULTI_XPU mode (#28386)
faaany Jan 19, 2024
faf0354
[SigLIP] Don't pad by default (#28578)
NielsRogge Jan 19, 2024
5b7f4bc
[`Llava`] Fix convert_llava_weights_to_hf.py script (#28570)
isaac-vidas Jan 19, 2024
d157815
Allow add_tokens for ESM (#28535)
Rocketknight1 Jan 19, 2024
9efec11
Fix `_speculative_sampling` implementation (#28508)
ofirzaf Jan 19, 2024
948ffff
RWKV: raise informative exception when attempting to manipulate `past…
gante Jan 19, 2024
3f69f41
Fix auxiliary loss related code in transformers (#28406)
SangbumChoi Jan 19, 2024
83f9196
[`GPTNeoX`] Fix BC issue with 4.36 (#28602)
ArthurZucker Jan 21, 2024
f0acf7b
Fix id2label assignment in run_classification.py (#28590)
jheitmann Jan 22, 2024
bf67415
Add missing key to TFLayoutLM signature (#28640)
Rocketknight1 Jan 22, 2024
d336c56
Avoid root logger's level being changed (#28638)
ydshieh Jan 22, 2024
692c3c6
Add config tip to custom model docs (#28601)
Rocketknight1 Jan 22, 2024
deb2b59
Fix lr_scheduler in no_trainer training scripts (#27872)
bofenghuang Jan 22, 2024
dafd595
[`Llava`] Update convert_llava_weights_to_hf.py script (#28617)
isaac-vidas Jan 22, 2024
e201864
[`GPTNeoX`] Fix GPTNeoX + Flash Attention 2 issue (#28645)
younesbelkada Jan 22, 2024
a35ea57
Update image_processing_deformable_detr.py (#28561)
sounakdey Jan 22, 2024
590be77
[`SigLIP`] Only import tokenizer if sentencepiece available (#28636)
amyeroberts Jan 22, 2024
e547458
Fix phi model doc checkpoint (#28581)
amyeroberts Jan 22, 2024
1fc1296
get default device through `PartialState().default_device` as it has …
ji-huazhong Jan 23, 2024
0398660
integrations: fix DVCLiveCallback model logging (#28653)
Jan 23, 2024
008a6a2
Enable safetensors conversion from PyTorch to other frameworks withou…
LysandreJik Jan 23, 2024
27c79a0
Enable instantiating model with pretrained backbone weights (#28214)
amyeroberts Jan 23, 2024
c475eca
`tensor_size` - fix copy/paste error msg typo (#28660)
scruel Jan 23, 2024
582d104
Fix windows err with checkpoint race conditions (#28637)
muellerzr Jan 23, 2024
5b5e71d
add dataloader prefetch factor in training args and trainer (#28498)
qmeeus Jan 23, 2024
9a4521d
Support single token decode for `CodeGenTokenizer` (#28628)
cmathw Jan 23, 2024
ebc8f47
Remove deprecated eager_serving fn (#28665)
Rocketknight1 Jan 23, 2024
39c3c0a
fix a hidden bug of `GenerationConfig`, now the `generation_config.js…
ParadoxZW Jan 23, 2024
5f81266
Update README_es.md (#28612)
vladydev3 Jan 23, 2024
c5c6909
Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#…
khaimt Jan 24, 2024
0549000
Use save_safetensor to disable safe serialization for XLA (#28669)
jeffhataws Jan 24, 2024
bb6aa8b
Add back in generation types (#28681)
amyeroberts Jan 24, 2024
738ec75
[docs] DeepSpeed (#28542)
stevhliu Jan 24, 2024
5d29530
Improved type hinting for all attention parameters (#28479)
nakranivaibhav Jan 24, 2024
8278b15
improve efficient training on CPU documentation (#28646)
faaany Jan 24, 2024
f40b87d
[docs] Fix doc format (#28684)
stevhliu Jan 24, 2024
963db81
Add Depth Anything (#28654)
NielsRogge Jan 25, 2024
7fa4b36
[`chore`] Add missing space in warning (#28695)
tomaarsen Jan 25, 2024
2000095
Improve Backbone API docs (#28666)
merveenoyan Jan 25, 2024
24f1a00
Update question_answering.md (#28694)
Jan 25, 2024
4cbd876
[`Vilt`] align input and model dtype in the ViltPatchEmbeddings forwa…
faaany Jan 25, 2024
2875195
[`docs`] Improve visualization for vertical parallelism (#28583)
petergtz Jan 25, 2024
142ce68
Don't fail when `LocalEntryNotFoundError` during `processor_config.js…
ydshieh Jan 26, 2024
8eb74c1
Fix duplicate & unnecessary flash attention warnings (#28557)
fxmarty Jan 26, 2024
bbe30c6
support PeftMixedModel signature inspect (#28321)
Facico Jan 26, 2024
1f47a24
fix: corrected misleading log message in save_pretrained function (#2…
mturetskii Jan 26, 2024
3a46e30
[`docs`] Update preprocessing.md (#28719)
velaia Jan 26, 2024
d6ac8f4
Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled(…
ShukantPal Jan 26, 2024
a638de1
Fix `weights_only` (#28725)
ydshieh Jan 26, 2024
708b19e
Stop confusing the TF compiler with ModelOutput objects (#28712)
Rocketknight1 Jan 26, 2024
3aea38c
fix: suppress `GatedRepoError` to use cache file (fix #28558). (#28566)
scruel Jan 26, 2024
f8b7c43
Unpin pydantic (#28728)
ydshieh Jan 26, 2024
abe0289
[docs] Fix datasets in guides (#28715)
stevhliu Jan 26, 2024
de13a95
[Flax] Update no init test for Flax v0.7.1 (#28735)
sanchit-gandhi Jan 26, 2024
a28a769
Falcon: removed unused function (#28605)
gante Jan 27, 2024
03cc177
Generate: deprecate old src imports (#28607)
gante Jan 27, 2024
f1cc615
[`Siglip`] protect from imports if sentencepiece not installed (#28737)
amyeroberts Jan 28, 2024
243e186
Add serialization logic to pytree types (#27871)
angelayi Jan 29, 2024
5649c0c
Fix `DepthEstimationPipeline`'s docstring (#28733)
ydshieh Jan 29, 2024
39fa400
Fix input data file extension in examples (#28741)
khipp Jan 29, 2024
3a08cc4
[Docs] Fix Typo in English & Japanese CLIP Model Documentation (TMBD …
Vinyzu Jan 29, 2024
f72c7c2
PatchtTST and PatchTSMixer fixes (#28083)
wgifford Jan 29, 2024
0548af5
Enable Gradient Checkpointing in Deformable DETR (#28686)
FoamoftheSea Jan 29, 2024
26aa03a
small doc update for CamemBERT (#28644)
julien-c Jan 29, 2024
0f8d015
Pin pytest version <8.0.0 (#28758)
amyeroberts Jan 29, 2024
9e8f35f
Mark test_constrained_beam_search_generate as flaky (#28757)
amyeroberts Jan 29, 2024
e694e98
Fix typo of `Block`. (#28727)
xkszltl Jan 29, 2024
da3c79b
[Whisper] Make tokenizer normalization public (#28136)
sanchit-gandhi Jan 29, 2024
a055d09
Support saving only PEFT adapter in checkpoints when using PEFT + FSD…
AjayP13 Jan 29, 2024
cd2eb8c
Add French translation: french README.md (#28696)
ThibaultLengagne Jan 29, 2024
a989c6c
Don't allow passing `load_in_8bit` and `load_in_4bit` at the same tim…
osanseviero Jan 30, 2024
1f5590d
Move CLIP _no_split_modules to CLIPPreTrainedModel (#27841)
lz1oceani Jan 30, 2024
d78e78a
`HfQuantizer` class for quantization-related stuff in `modeling_utils…
poedator Jan 30, 2024
866253f
[`HfQuantizer`] Move it to "Developper guides" (#28768)
younesbelkada Jan 30, 2024
5c8d941
Use Conv1d for TDNN (#25728)
gau-nernst Jan 30, 2024
6f7d5db
Fix transformers.utils.fx compatibility with torch<2.0 (#28774)
fxmarty Jan 30, 2024
c24c524
Further pin pytest version (in a temporary way) (#28780)
ydshieh Jan 30, 2024
2fa1c80
[`Backbone`] Use `load_backbone` instead of `AutoBackbone.from_config…
amyeroberts Jan 30, 2024
1d489b3
Task-specific pipeline init args (#28439)
amyeroberts Jan 30, 2024
415e9a0
Add tf_keras imports to prepare for Keras 3 (#28588)
Rocketknight1 Jan 30, 2024
74c9cfe
Pin Torch to <2.2.0 (#28785)
Rocketknight1 Jan 30, 2024
d703eaa
[`bnb`] Fix bnb slow tests (#28788)
younesbelkada Jan 31, 2024
a937425
Prevent MLflow exception from disrupting training (#28779)
codiceSpaghetti Jan 31, 2024
ae0c27a
don't initialize the output embeddings if we're going to tie them to …
tom-p-reichel Jan 31, 2024
f9f1f2a
[`HFQuantizer`] Remove `check_packages_compatibility` logic (#28789)
younesbelkada Jan 31, 2024
65a926e
[Whisper] Refactor forced_decoder_ids & prompt ids (#28687)
patrickvonplaten Jan 31, 2024
bebeeee
Resolve DeepSpeed cannot resume training with PeftModel (#28746)
lh0x00 Jan 31, 2024
721e2d9
canonical repos moves (#28795)
julien-c Jan 31, 2024
7a49610
Wrap Keras methods to support BatchEncoding (#28734)
Rocketknight1 Jan 31, 2024
f7076cd
Flax mistral (#26943)
kiansierra Jan 31, 2024
beb2a09
DeepSpeed: hardcode `torch.arange` dtype on `float` usage to avoid in…
gante Jan 31, 2024
95346e9
Add artifact name in job step to maintain job / artifact corresponden…
ydshieh Jan 31, 2024
4735866
Split daily CI using 2 level matrix (#28773)
ydshieh Jan 31, 2024
7b2bd1f
[docs] Correct the statement in the docstirng of compute_transition_s…
Ki-Seki Jan 31, 2024
0d26abd
Adding [T5/MT5/UMT5]ForTokenClassification (#28443)
hackyon Feb 1, 2024
eb8e7a0
Make `is_torch_bf16_available_on_device` more strict (#28796)
ydshieh Feb 1, 2024
709dc43
Fix symbolic_trace with kv cache (#28724)
fxmarty Feb 1, 2024
7bc6d76
Add tip on setting tokenizer attributes (#28764)
Rocketknight1 Feb 1, 2024
e19c12e
enable graident checkpointing in DetaObjectDetection and add tests in…
SangbumChoi Feb 1, 2024
d98591a
[docs] fix some bugs about parameter description (#28806)
zspo Feb 1, 2024
23ea674
Add models from deit (#28302)
rajveer43 Feb 1, 2024
abbffc4
[docs] Backbone (#28739)
stevhliu Feb 1, 2024
2418c64
[docs] HfQuantizer (#28820)
stevhliu Feb 2, 2024
721ee78
[Docs] Fix spelling and grammar mistakes (#28825)
khipp Feb 2, 2024
1efb21c
Explicitly check if token ID's are None in TFBertTokenizer constructo…
skumar951 Feb 2, 2024
ec29d25
Add missing None check for hf_quantizer (#28804)
jganitkevitch Feb 2, 2024
0e75aee
Fix issues caused by natten (#28834)
ydshieh Feb 2, 2024
a7cb92a
fix / skip (for now) some tests before switch to torch 2.2 (#28838)
ydshieh Feb 2, 2024
f497795
Use `-v` for `pytest` on CircleCI (#28840)
ydshieh Feb 2, 2024
80d5007
Reduce GPU memory usage when using FSDP+PEFT (#28830)
pacman100 Feb 2, 2024
3d2900e
Mark `test_encoder_decoder_model_generate` for `vision_encoder_deocde…
amyeroberts Feb 2, 2024
ca8944c
Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decisio…
dependabot[bot] Feb 5, 2024
7b70283
Support custom scheduler in deepspeed training (#26831)
VeryLazyBoy Feb 5, 2024
c430d6e
[Docs] Fix bad doc: replace save with logging (#28855)
chenzizhao Feb 5, 2024
0466fd5
Ability to override clean_code_for_run (#28783)
w4ffl35 Feb 5, 2024
2da28c4
[WIP] Hard error when ignoring tensors. (#27484)
Narsil Feb 5, 2024
3f9f749
[`Doc`] update contribution guidelines (#28858)
ArthurZucker Feb 5, 2024
7addc93
Correct wav2vec2-bert inputs_to_logits_ratio (#28821)
ylacombe Feb 5, 2024
ba3264b
Image Feature Extraction pipeline (#28216)
amyeroberts Feb 5, 2024
0690116
ClearMLCallback enhancements: support multiple runs and handle loggin…
eugen-ajechiloae-clearml Feb 5, 2024
ac51e59
Do not use mtime for checkpoint rotation. (#28862)
xkszltl Feb 6, 2024
2e7c942
Adds LlamaForQuestionAnswering class in modeling_llama.py along with …
nakranivaibhav Feb 6, 2024
e83227d
Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_project…
dependabot[bot] Feb 6, 2024
1ea0bbd
[Docs] Update project names and links in awesome-transformers (#28878)
khipp Feb 6, 2024
ee2a340
Fix LongT5ForConditionalGeneration initialization of lm_head (#28873)
eranhirs Feb 6, 2024
5346db1
Raise error when using `save_only_model` with `load_best_model_at_end…
pacman100 Feb 6, 2024
6529a5b
Fix `FastSpeech2ConformerModelTest` and skip it on CPU (#28888)
ydshieh Feb 6, 2024
76b4f66
Revert "[WIP] Hard error when ignoring tensors." (#28898)
ydshieh Feb 6, 2024
89439fe
unpin torch (#28892)
ydshieh Feb 6, 2024
a1afec9
Explicit server error on gated model (#28894)
Wauplin Feb 6, 2024
4830f26
[Docs] Fix backticks in inline code and documentation links (#28875)
khipp Feb 6, 2024
40658be
Hotfix - make `torchaudio` get the correct version in `torch_and_flax…
ydshieh Feb 6, 2024
1c31b7a
[Docs] Add missing language options and fix broken links (#28852)
khipp Feb 6, 2024
64d1518
fix: Fixed the documentation for `logging_first_step` by removing "ev…
Sai-Suraj-27 Feb 7, 2024
d9deddb
fix Starcoder FA2 implementation (#28891)
pacman100 Feb 7, 2024
349a6e8
Fix Keras scheduler import so it works for older versions of Keras (#…
Rocketknight1 Feb 7, 2024
abf8f54
⚠️ Raise `Exception` when trying to generate 0 tokens ⚠️ (#28621)
danielkorat Feb 7, 2024
308d2b9
Update the cache number (#28905)
ydshieh Feb 7, 2024
5f96855
Add npu device for pipeline (#28885)
ji-huazhong Feb 7, 2024
328ade8
[Docs] Fix placement of tilde character (#28913)
khipp Feb 8, 2024
33df036
[Docs] Revert translation of '@slow' decorator (#28912)
khipp Feb 8, 2024
4b236ae
Fix utf-8 yaml load for marian conversion to pytorch in Windows (#28618)
SystemPanic Feb 8, 2024
115ac94
[`Core generation`] Adds support for static KV cache (#27931)
ArthurZucker Feb 8, 2024
693667b
Remove dead TF loading code (#28926)
Rocketknight1 Feb 8, 2024
0b693e9
fix: torch.int32 instead of torch.torch.int32 (#28883)
vodkaslime Feb 8, 2024
cc309fd
pass kwargs in stopping criteria list (#28927)
zucchini-nlp Feb 8, 2024
d628664
Support batched input for decoder start ids (#28887)
zucchini-nlp Feb 8, 2024
2749e47
[Docs] Fix broken links and syntax issues (#28918)
khipp Feb 8, 2024
de11e65
Fix max_position_embeddings default value for llama2 to 4096 #28241 (…
karl-hajjar Feb 9, 2024
ebf3ea2
Fix a wrong link to CONTRIBUTING.md section in PR template (#28941)
B-Step62 Feb 9, 2024
d123e66
Fix type annotations on neftune_noise_alpha and fsdp_config TrainingA…
peblair Feb 9, 2024
58e3d23
[i18n-de] Translate README.md to German (#28933)
khipp Feb 9, 2024
f278ef2
[Nougat] Fix pipeline (#28242)
NielsRogge Feb 12, 2024
ef5ab72
[Docs] Update README and default pipelines (#28864)
NielsRogge Feb 12, 2024
cf4c20b
Convert `torch_dtype` as `str` to actual torch data type (i.e. "float…
KossaiSbai Feb 12, 2024
1709886
[`pipelines`] updated docstring with vqa alias (#28951)
cmahmut Feb 12, 2024
e30bbb2
Tests: tag `test_save_load_fast_init_from_base` as flaky (#28930)
gante Feb 12, 2024
792819f
Updated requirements for image-classification samples: datasets>=2.14…
alekseyfa Feb 12, 2024
136cd89
Always initialize tied output_embeddings if it has a bias term (#28947)
hackyon Feb 12, 2024
c617f98
Clean up staging tmp checkpoint directory (#28848)
woshiyyya Feb 12, 2024
fe3df9d
[Docs] Add language identifiers to fenced code blocks (#28955)
khipp Feb 12, 2024
78ba9f4
[Docs] Add video section (#28958)
NielsRogge Feb 12, 2024
d90acc1
[i18n-de] Translate CONTRIBUTING.md to German (#28954)
khipp Feb 12, 2024
b445675
[`NllbTokenizer`] refactor with added tokens decoder (#27717)
ArthurZucker Feb 13, 2024
da20209
Add sudachi_projection option to BertJapaneseTokenizer (#28503)
hiroshi-matsuda-rit Feb 13, 2024
3e70a20
Static Cache: load models with MQA or GQA (#28975)
gante Feb 13, 2024
3de6a6b
Update configuration_llama.py: fixed broken link (#28946)
AdityaKane2001 Feb 13, 2024
bd4b83e
[`DETR`] Update the processing to adapt masks & bboxes to reflect pad…
amyeroberts Feb 13, 2024
1d12b8b
ENH: Do not pass warning message in case `quantization_config` is in …
younesbelkada Feb 14, 2024
164bdef
ENH [`AutoQuantizer`]: enhance trainer + not supported quant methods …
younesbelkada Feb 14, 2024
de6029a
Add `StableLM` (#28810)
jon-tow Feb 14, 2024
63ffd56
Add SiglipForImageClassification and CLIPForImageClassification (#28952)
NielsRogge Feb 14, 2024
1ecf5f7
AQLM quantizer support (#28928)
Feb 14, 2024
7252e8d
[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixi…
amyeroberts Feb 14, 2024
69ca640
Set the dataset format used by `test_trainer` to float32 (#28920)
ji-huazhong Feb 14, 2024
0507e69
Introduce AcceleratorConfig dataclass (#28664)
muellerzr Feb 14, 2024
354775b
Fix flaky test vision encoder-decoder generate (#28923)
zucchini-nlp Feb 14, 2024
3f4e79d
Mask Generation Task Guide (#28897)
merveenoyan Feb 14, 2024
725f4ad
Add tie_weights() to LM heads and set bias in set_output_embeddings()…
hackyon Feb 14, 2024
0199a48
Backbone kwargs in config (#28784)
amyeroberts Feb 14, 2024
5f06053
[TPU] Support PyTorch/XLA FSDP via SPMD (#28949)
alanwaketan Feb 14, 2024
7a0fccc
FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"t…
younesbelkada Feb 14, 2024
609a176
[`CLeanup`] Revert SDPA attention changes that got in the static kv c…
ArthurZucker Feb 14, 2024
f3788b0
Fix static generation when compiling! (#28937)
ArthurZucker Feb 15, 2024
83e96dc
Add cuda_custom_kernel in DETA (#28989)
SangbumChoi Feb 15, 2024
5b6fa23
DeformableDetrModel support fp16 (#29013)
DonggeunYu Feb 15, 2024
8a0ed0a
Fix copies between DETR and DETA (#29037)
amyeroberts Feb 15, 2024
6d1f545
FIX: Fix error with `logger.warning` + inline with recent refactor (#…
younesbelkada Feb 15, 2024
4156f51
Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)
amyeroberts Feb 15, 2024
b0a7f44
Removed obsolete attribute setting for AQLM quantization. (#29034)
Feb 15, 2024
f3aa7db
Fix a tiny typo in `generation/utils.py::GenerateEncoderDecoderOutput…
sadra-barikbin Feb 15, 2024
1e402b9
add test marker to run all tests with @require_bitsandbytes (#28278)
Titus-von-Koeller Feb 16, 2024
f497f56
Update all references to canonical models (#29001)
LysandreJik Feb 16, 2024
8876ce8
Update important model list (#29019)
LysandreJik Feb 16, 2024
aee11fe
Fix max_length criteria when using inputs_embeds (#28994)
zucchini-nlp Feb 16, 2024
0eb4085
Support : Leverage Accelerate for object detection/segmentation model…
Tanmaypatil123 Feb 16, 2024
258da40
fix num_assistant_tokens with heuristic schedule (#28759)
jmamou Feb 16, 2024
b262808
fix failing trainer ds tests (#29057)
pacman100 Feb 16, 2024
4c18ddb
`auto_find_batch_size` isn't yet supported with DeepSpeed/FSDP. Raise…
pacman100 Feb 16, 2024
be42c24
Honor trust_remote_code for custom tokenizers (#28854)
rl337 Feb 16, 2024
161fe42
Feature: Option to set the tracking URI for MLflowCallback. (#29032)
seanswyi Feb 16, 2024
636b032
Fix trainer test wrt DeepSpeed + auto_find_bs (#29061)
muellerzr Feb 16, 2024
2f1003b
Add chat support to text generation pipeline (#28945)
Rocketknight1 Feb 16, 2024
ce4fff0
[Docs] Spanish translation of task_summary.md (#28844)
aaronjimv Feb 16, 2024
864c8e6
[`Awq`] Add peft support for AWQ (#28987)
younesbelkada Feb 19, 2024
a75a6c9
FIX [`bnb` / `tests`]: Fix currently failing bnb tests (#29092)
younesbelkada Feb 19, 2024
593230f
fix the post-processing link (#29091)
davies-w Feb 19, 2024
9830858
Fix the `bert-base-cased` tokenizer configuration test (#29105)
LysandreJik Feb 19, 2024
79132d4
Fix a typo in `examples/pytorch/text-classification/run_classificatio…
Ja1Zhou Feb 19, 2024
b2724d7
change version (#29097)
ArthurZucker Feb 19, 2024
07e3454
[Docs] Add resources (#28705)
NielsRogge Feb 19, 2024
08cd694
ENH: added new output_logits option to generate function (#28667)
mbaak Feb 19, 2024
5ce90f3
Bnb test fix for different hardwares (#29066)
Titus-von-Koeller Feb 19, 2024
a4851d9
Fix two tiny typos in `pipelines/base.py::Pipeline::_sanitize_paramet…
sadra-barikbin Feb 19, 2024
4f09d0f
storing & logging gradient norm in trainer (#27326)
shijie-wu Feb 19, 2024
49c0b29
Fixed nll with label_smoothing to just nll (#28708)
nileshkokane01 Feb 20, 2024
9094abe
[`gradient_checkpointing`] default to use it for torch 2.3 (#28538)
ArthurZucker Feb 20, 2024
a7ff2f2
Move misplaced line (#29117)
kno10 Feb 20, 2024
f7ef7ce
FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trai…
younesbelkada Feb 20, 2024
1c9134f
Abstract image processor arg checks. (#28843)
molbap Feb 20, 2024
ff76e7c
FIX [`bnb` / `tests`] Propagate the changes from #29092 to 4-bit test…
younesbelkada Feb 20, 2024
7d312ad
Llama: fix batched generation (#29109)
gante Feb 20, 2024
a7755d2
Generate: unset GenerationConfig parameters do not raise warning (#29…
gante Feb 20, 2024
5e95dca
[`cuda kernels`] only compile them when initializing (#29133)
ArthurZucker Feb 20, 2024
efdd436
FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled mod…
younesbelkada Feb 20, 2024
15cfe38
[`Core tokenization`] `add_dummy_prefix_space` option to help with l…
ArthurZucker Feb 20, 2024
0996a10
Revert low cpu mem tie weights (#29135)
amyeroberts Feb 20, 2024
ee3af60
Add support for fine-tuning CLIP-like models using contrastive-image-…
tjs-intel Feb 20, 2024
7688d8d
Save (circleci) cache at the end of a job (#29141)
ydshieh Feb 20, 2024
b8b1647
[Phi] Add support for sdpa (#29108)
hackyon Feb 20, 2024
[Flax] Update no init test for Flax v0.7.1 (huggingface#28735)
sanchit-gandhi authored Jan 26, 2024
commit de13a951b38b85195984164819f1ab05fe508677
2 changes: 1 addition & 1 deletion tests/test_modeling_flax_common.py
@@ -984,7 +984,7 @@ def test_no_automatic_init(self):

 # Check if we params can be properly initialized when calling init_weights
 params = model.init_weights(model.key, model.input_shape)
-self.assertIsInstance(params, FrozenDict)
+assert isinstance(params, (dict, FrozenDict)), f"params are not an instance of {FrozenDict}"
 # Check if all required parmas are initialized
 keys = set(flatten_dict(unfreeze(params)).keys())
 self.assertTrue(all(k in keys for k in model.required_params))
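
The relaxed check in this commit can be sketched in isolation as below. This is a minimal illustration, not the test suite's code: `check_params_container` is a hypothetical helper name, and the fallback `FrozenDict` class is a stand-in used only so the snippet runs without Flax installed. The commit title suggests the motivation is that `init_weights` may return a plain `dict` under Flax v0.7.1, which is an inference rather than something stated in the diff.

```python
from collections.abc import Mapping

try:
    # Real container, used when Flax is installed.
    from flax.core.frozen_dict import FrozenDict
except ImportError:
    # Stand-in for illustration only (assumption: NOT Flax's implementation).
    class FrozenDict(Mapping):
        def __init__(self, data):
            self._data = dict(data)

        def __getitem__(self, key):
            return self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)


def check_params_container(params):
    """Hypothetical helper mirroring the updated assertion:
    accept either a plain dict or a FrozenDict."""
    assert isinstance(params, (dict, FrozenDict)), (
        f"params are not an instance of {FrozenDict}"
    )
    return True


# Both container types pass the relaxed check.
check_params_container({"kernel": [0.0, 1.0]})
check_params_container(FrozenDict({"kernel": [0.0, 1.0]}))
```

The previous `assertIsInstance(params, FrozenDict)` would reject the first call, since a plain `dict` is not a `FrozenDict`; the tuple form accepts both.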