feat(dealing with multiple graphs / making compatible with current text2graph) #8

Merged: 942 commits, Nov 17, 2023

Changes from 250 commits

Commits
07360b6
[`Llama2`] Add support for Llama 2 (#24891)
ArthurZucker Jul 18, 2023
a982c02
Disable ipex env var if false (#24885)
muellerzr Jul 18, 2023
476be08
Check for accelerate env var when doing CPU only (#24890)
muellerzr Jul 18, 2023
129cb6d
Avoid some pipeline tasks to use `use_cache=True` (#24893)
ydshieh Jul 19, 2023
c035970
Update tested versions in READMEs (#24895)
EliahKagan Jul 19, 2023
243b2ea
Fix `test_model_parallelism` for `FalconModel` (#24914)
ydshieh Jul 19, 2023
aa4afa6
Fixed issue where ACCELERATE_USE_CPU="False" results in bool(True) (#…
madhavajay Jul 19, 2023
99c1268
fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST (#24902)
21jun Jul 19, 2023
3a43794
Fix minor llama2.md model doc typos (#24909)
tmc Jul 19, 2023
ee4250a
[`Llama2`] replace `self.pretraining_tp` with `self.config.pretrainin…
younesbelkada Jul 19, 2023
6112b1c
[doc] `image_processing_vilt.py` wrong default documented (#24931)
stas00 Jul 19, 2023
7381987
🌐 [i18n-KO] Translated`tasks/document_question_answering.md` to Korea…
jungnerd Jul 20, 2023
8fd8c8e
Add multi-label text classification support to pytorch example (#24770)
ranchlai Jul 20, 2023
79444f3
Deprecate unused OpenLlama architecture (#24922)
tomaarsen Jul 20, 2023
37d8611
replace no_cuda with use_cpu in test_pytorch_examples (#24944)
statelesshz Jul 20, 2023
89136ff
Generate: sequence bias can handle same terminations (#24822)
gante Jul 20, 2023
9859806
Bump pygments from 2.11.2 to 2.15.0 in /examples/research_projects/de…
dependabot[bot] Jul 20, 2023
85514c1
Update processing_vision_text_dual_encoder.py (#24950)
premsa Jul 20, 2023
35c0459
Fix `main_input_name` in `src/transformers/keras_callbacks.py` (#24916)
ydshieh Jul 20, 2023
0c41765
[DOCS] Example for `LogitsProcessor` class (#24848)
shauray8 Jul 20, 2023
e75cb0c
fix type annotations for arguments in training_args (#24550)
shauray8 Jul 20, 2023
9f912ef
Bump aiohttp from 3.8.1 to 3.8.5 in /examples/research_projects/decis…
dependabot[bot] Jul 20, 2023
89a1f34
[`RWKV`] Add Gradient Checkpointing support for RWKV (#24955)
younesbelkada Jul 20, 2023
aa1b09c
Change logic for logging in the examples (#24956)
muellerzr Jul 20, 2023
caf5e36
Contrastive Search peak memory reduction (#24120)
blbadger Jul 20, 2023
9ef5256
Fallback for missing attribute `Parameter.ds_numel` (#24942)
apoorvkh Jul 20, 2023
1c7e5e2
fix fsdp checkpointing issues (#24926)
pacman100 Jul 21, 2023
83f9314
fix: cast input pixels to appropriate dtype for image_to_text pipelin…
JimAllanson Jul 21, 2023
ec3dfe5
🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664)
wonhyeongseo Jul 21, 2023
f4eb459
fsdp fixes and enhancements (#24980)
pacman100 Jul 21, 2023
f74560d
Fix missing spaces in system prompt of Llama2 tokenizer (#24930)
chenjoya Jul 21, 2023
0511369
[`LlamaConfig`] Nit: pad token should be None by default (#24958)
ArthurZucker Jul 21, 2023
640e1b6
Remove tokenizers from the doc table (#24963)
sgugger Jul 21, 2023
5b7ffd5
Avoid importing all models when instantiating a pipeline (#24960)
sgugger Jul 21, 2023
a6484c8
Fix type annotation for deepspeed training arg (#24988)
sgugger Jul 21, 2023
a7d2131
Use main_input_name for include_inputs_for_metrics (#24993)
sgugger Jul 21, 2023
f1a1eb4
Fix `llama` tokenization doctest (#24990)
ydshieh Jul 21, 2023
d3ce048
[`bnb`] Add simple check for bnb import (#24995)
younesbelkada Jul 21, 2023
95f96b4
[`Llama`] remove persistent `inv_freq` tensor (#24998)
ArthurZucker Jul 21, 2023
ea41e18
improve from_pretrained for zero3 multi gpus mode (#24964)
1ytic Jul 21, 2023
87fba94
Move template doc file to md (#25004)
sgugger Jul 21, 2023
b257c46
🌐 [i18n-KO] Updated Korean `serialization.md` (#24686)
wonhyeongseo Jul 21, 2023
c9a82be
[check_config_docstrings.py] improve diagnostics (#25012)
stas00 Jul 24, 2023
0906d21
[`logging.py`] set default `stderr` path if `None` (#25033)
ArthurZucker Jul 24, 2023
54ba860
fix(integrations): store serialized `TrainingArgs` to `wandb.config` …
parambharat Jul 24, 2023
75317ae
[docs] Performance docs tidy up, part 1 (#23963)
MKhalusova Jul 24, 2023
6704923
Support GatedRepoError + use raise from (#25034)
Wauplin Jul 24, 2023
efb2ba6
Better handling missing SYS in llama conversation tokenizer (#24997)
ichernev Jul 24, 2023
383be1b
🌐[i18n-KO] Translated performance.md to Korean (#24883)
augustinLib Jul 24, 2023
9d2b983
🌐 [i18n-KO] Translated `testing.md` to Korean (#24900)
Sunmin0520 Jul 24, 2023
3b734f5
Add dispatch_batches to training arguments (#25038)
muellerzr Jul 24, 2023
8f1f0bf
Fix typo in LlamaTokenizerFast docstring example (#25018)
sbrunk Jul 24, 2023
42571f6
Make more test models smaller (#25005)
sgugger Jul 24, 2023
afe8bfc
Comment again print statement
sgugger Jul 24, 2023
a03d13c
Pvt model (#24720)
Xrenya Jul 24, 2023
3611fc9
compute_loss in trainer failing to label shift for PEFT model when la…
njbrake Jul 24, 2023
b08f41e
[`8bit`] Fix 8bit corner case with Blip2 8bit (#25047)
younesbelkada Jul 24, 2023
c0d1c33
🌐 [i18n-KO] Translated `perf_train_cpu.md` to Korean (#24911)
seank021 Jul 24, 2023
d229570
Better error message when signal is not supported on OS (#25049)
sgugger Jul 24, 2023
c53a6ea
[`RWKV`] Add note in doc on `RwkvStoppingCriteria` (#25055)
ArthurZucker Jul 25, 2023
c0742b1
Generate - add beam indices output in contrained beam search (#25042)
gante Jul 25, 2023
faf25c0
[Docs] fix rope_scaling doc string (#25072)
kashif Jul 25, 2023
f6fe1d5
🌐 [i18n-KO] Translated `<tf_xla>.md` to Korean (#24904)
54data Jul 25, 2023
ee1eb3b
🌐 [i18n-KO] Translated `perf_hardware.md` to Korean (#24966)
augustinLib Jul 25, 2023
f295fc8
Fix last models for common tests that are too big. (#25058)
sgugger Jul 25, 2023
5dba88b
fix: add TOC anchor link (#25066)
eenzeenee Jul 25, 2023
6bc61aa
Set `TF32` flag for PyTorch cuDNN backend (#25075)
XuehaiPan Jul 25, 2023
25e443c
Fix broken link in README_hd.md (#25067)
susnato Jul 25, 2023
c879318
replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size` i…
statelesshz Jul 25, 2023
f2c1df9
[`generate`] Only warn users if the `generation_config`'s `max_lengt…
ArthurZucker Jul 25, 2023
cb8abee
🌐 [i18n-KO] Translated `hpo_train.md` to Korean (#24968)
harheem Jul 25, 2023
1dbc144
Fix: repeat per sample for SAM image embeddings (#25074)
xk-huang Jul 25, 2023
dcb183f
[`MPT`] Add MosaicML's `MPT` model to transformers (#24629)
ArthurZucker Jul 25, 2023
b99f7bd
[DOCS] add example NoBadWordsLogitsProcessor (#25046)
SoyGema Jul 25, 2023
b51312e
🌐 [i18n-KO] Translated `perf_infer_cpu.md` to Korean (#24920)
junejae Jul 25, 2023
1e662f0
Allow generic composite models to pass more kwargs (#24927)
ydshieh Jul 25, 2023
f104522
[ `ForSequenceClassification`] Support `left` padding (#24979)
ArthurZucker Jul 25, 2023
2fac342
[`TF`] Also apply patch to support left padding (#25085)
ArthurZucker Jul 25, 2023
0779fc8
Edit err message and comment in `test_model_is_small` (#25087)
Jul 25, 2023
f9cc333
[ `PreTrainedTokenizerFast`] Keep properties from fast tokenizer (#25…
ArthurZucker Jul 25, 2023
21150cb
Hotfix for failing `MusicgenForConditionalGeneration` tests (#25091)
ydshieh Jul 25, 2023
8f36ab3
[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#…
sjrl Jul 25, 2023
da5ff18
Fix doctest (#25031)
ydshieh Jul 25, 2023
6b8dbc2
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projec…
dependabot[bot] Jul 25, 2023
45bde36
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projec…
dependabot[bot] Jul 25, 2023
f1deb21
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projec…
dependabot[bot] Jul 25, 2023
a5cc30d
fix tied_params for meta tensor (#25101)
SunMarc Jul 25, 2023
277d3ae
documentation for llama2 models (#25102)
shauray8 Jul 26, 2023
ee63520
🌐[i18n-KO] Translated pipeline_webserver.md to Korean (#24828)
kihoon71 Jul 26, 2023
31acba5
Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106)
ydshieh Jul 26, 2023
04a5c85
Add descriptive docstring to TemperatureLogitsWarper (#24892)
nablabits Jul 26, 2023
c53c8e4
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is …
liucw2012 Jul 26, 2023
224da5d
update `use_auth_token` -> `token` (#25083)
ydshieh Jul 26, 2023
d30cf3d
Fix past CI after #24334 (#25113)
ydshieh Jul 26, 2023
1486d2a
Move common image processing methods to BaseImageProcessor (#25089)
amyeroberts Jul 26, 2023
b914ec9
Fix ViT docstring regarding default dropout values. (#25118)
ebezzam Jul 26, 2023
659829b
MaskFormer - enable return_dict in order to compile (#25052)
amyeroberts Jul 26, 2023
1689aea
Move center_crop to BaseImageProcessor (#25122)
amyeroberts Jul 26, 2023
a004237
fix deepspeed load best model at end when the model gets sharded (#25…
pacman100 Jul 27, 2023
de9e3b5
fix delete all checkpoints when save_total_limit is set to 1 (#25136)
Pbihao Jul 27, 2023
9429642
[`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#2…
ArthurZucker Jul 27, 2023
9a220ce
Clarify 4/8 bit loading log message (#25134)
BramVanroy Jul 27, 2023
a1c4954
🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109)
muellerzr Jul 27, 2023
9cea3e7
[`MptConfig`] support from pretrained args (#25116)
ArthurZucker Jul 27, 2023
0b92ae3
Add offload support to Bark (#25037)
ylacombe Jul 27, 2023
0c790dd
More `token` things (#25146)
ydshieh Jul 27, 2023
e931036
Add bloom flax (#25094)
sanchit-gandhi Jul 27, 2023
400e76e
Add new model in doc table of content (#25148)
sgugger Jul 27, 2023
6232c38
Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120)
Wauplin Jul 28, 2023
c1dba11
Add test when downloading from gated repo (#25039)
Wauplin Jul 28, 2023
2a78720
override .cuda() to check if model is already quantized (#25166)
ranchlai Jul 28, 2023
d23d2c2
Represent query_length in a different way to solve jit issue (#25164)
jiqing-feng Jul 28, 2023
afa96ff
make run_generation more generic for other devices (#25133)
statelesshz Jul 28, 2023
3cbc560
added compiled model support for inference (#25124)
Jul 28, 2023
d53b8ad
Update `use_auth_token` -> `token` in example scripts (#25167)
ydshieh Jul 28, 2023
add0895
[`Mpt`] Fix mpt slow test (#25170)
younesbelkada Jul 28, 2023
dd9d45b
[`InstructBlip`] Fix instructblip slow test (#25171)
younesbelkada Jul 28, 2023
31f137c
🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881)
sim-so Jul 28, 2023
c90e14f
Fix beam search to sample at least 1 non eos token (#25103) (#25115)
yonigottesman Jul 28, 2023
03f98f9
[MusicGen] Fix integration tests (#25169)
sanchit-gandhi Jul 28, 2023
05cda5d
🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174)
amyeroberts Jul 28, 2023
4a56449
Musicgen: CFG is manually added (#25173)
gante Jul 31, 2023
67b85f2
Better error message in `_prepare_output_docstrings` (#25202)
ydshieh Jul 31, 2023
59dcea3
[`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206)
younesbelkada Jul 31, 2023
9ca3aa0
Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211)
ydshieh Jul 31, 2023
5220606
[quantization.md] fix (#25190)
stas00 Jul 31, 2023
e0c50b2
[`pipeline`] revisit device check for pipeline (#25207)
younesbelkada Jul 31, 2023
1b4f619
Update tiny model info. and pipeline testing (#25213)
ydshieh Jul 31, 2023
0fd8d2a
Fix docker image build failure (#25214)
ydshieh Jul 31, 2023
4033ea7
make build_mpt_alibi_tensor a method of MptModel so that deepspeed co…
sywangyi Aug 1, 2023
77c3973
[`Pix2Struct`] Fix pix2struct cross attention (#25200)
younesbelkada Aug 1, 2023
972fdcc
[`Docs`/`quantization`] Clearer explanation on how things works under…
younesbelkada Aug 1, 2023
05ebb02
[`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201)
younesbelkada Aug 1, 2023
3170af7
[`Detr`] Fix detr BatchNorm replacement issue (#25230)
younesbelkada Aug 1, 2023
d27e4c1
Move rescale dtype recasting to match torchvision ToTensor (#25229)
amyeroberts Aug 1, 2023
f6f567d
Fix set of model parallel in the Trainer when no GPUs are available (…
sgugger Aug 2, 2023
2230d14
fix get_keys_to_not_convert() to return correct modules for full prec…
ranchlai Aug 2, 2023
c6a8768
add pathname and line number to logging formatter in debug mode (#25203)
ranchlai Aug 2, 2023
149cb0c
Add `token` arugment in example scripts (#25172)
ydshieh Aug 2, 2023
904e7e0
resolving zero3 init when using accelerate config with Trainer (#25227)
pacman100 Aug 2, 2023
1b35409
Update rescale tests - cast to float after rescaling to reflect #2522…
amyeroberts Aug 2, 2023
8021c68
Fix some bugs for two stage training of deformable detr (#25045)
jypjypjypjyp Aug 2, 2023
eec0d84
[DOCS] Add example and modified docs of EtaLogitsWarper (#25125)
agno-nymous Aug 2, 2023
1baeed5
Fix return_dict_in_generate bug in InstructBlip generate function (#2…
euanong Aug 2, 2023
8edd0da
Remove `pytest_options={"rA": None}` in CI (#25263)
ydshieh Aug 2, 2023
bef02fd
🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943)
heuristicwave Aug 2, 2023
ad83215
recommend DeepSpeed's Argument Parsing documentation (#25268)
BurnzZ Aug 2, 2023
b28ebb2
[MMS] Fix mms (#25267)
patrickvonplaten Aug 2, 2023
bd90cda
CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266)
ydshieh Aug 2, 2023
2bd7a27
CI with `pytest_num_workers=8` for torch/tf jobs (#25274)
ydshieh Aug 2, 2023
15082a9
Docs: Update list of `report_to` logging integrations in docstring (#…
tomaarsen Aug 3, 2023
30409af
Update InstructBLIP & Align values after rescale update (#25209)
amyeroberts Aug 3, 2023
a881737
Docs: separate generate section (#25235)
gante Aug 3, 2023
8455346
Update bark doc (#25234)
ylacombe Aug 3, 2023
6d3f9c1
add generate method to SpeechT5ForTextToSpeech (#25233)
ylacombe Aug 3, 2023
d114a6b
Add timeout parameter to load_image function (#25184)
rolisz Aug 3, 2023
66c240f
[JAX] Bump min version (#25286)
sanchit-gandhi Aug 3, 2023
33da2db
[small] llama2.md typo (#25295)
H-Huang Aug 3, 2023
641adca
Fix typo: Roberta -> RoBERTa (#25302)
MrGeislinger Aug 3, 2023
6768309
Move usage of deprecated logging.warn to logging.warning (#25310)
PeterJCLaw Aug 4, 2023
fab1a0a
Give more memory in test_disk_offload (#25315)
sgugger Aug 4, 2023
bff4313
Generate: get generation mode as an enum (#25292)
gante Aug 4, 2023
aeb5a08
Add offline mode for agents (#25226)
sgugger Aug 4, 2023
29f0400
Deal with nested configs better in base class (#25237)
sgugger Aug 4, 2023
f0fd73a
Document check copies (#25291)
sgugger Aug 4, 2023
ce6d153
Make `bark` could have tiny model (#25290)
ydshieh Aug 4, 2023
fdaef33
Document toc check and doctest check scripts (#25319)
sgugger Aug 4, 2023
fdd81ae
[Whisper] Better error message for outdated generation config (#25298)
sanchit-gandhi Aug 4, 2023
a6e6b1c
Remove jnp.DeviceArray since it is deprecated. (#24875)
mariecwhite Aug 4, 2023
d533465
add CFG for .generate() (#24654)
Vermeille Aug 6, 2023
b9da44b
🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978)
eenzeenee Aug 7, 2023
b0f2303
Update TF pin in docker image (#25343)
ydshieh Aug 7, 2023
d6bfba7
Generalize CFG to allow for positive prompts (#25339)
oobabooga Aug 7, 2023
65001cb
Loosen output shape restrictions on GPT-style models (#25188)
calpt Aug 7, 2023
1451093
Allow `trust_remote_code` in example scripts (#25248)
Jackmin801 Aug 7, 2023
7d65697
Generate: remove Marian hack (#25294)
gante Aug 7, 2023
c177606
Fix more offload edge cases (#25342)
ydshieh Aug 7, 2023
baf1daa
Migrate Trainer from `Repository` to `upload_folder` (#25095)
sgugger Aug 7, 2023
5fe3697
Adding more information in help parser on train_file and validation_f…
pphuc25 Aug 7, 2023
676247f
[DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcesso…
Rishab26 Aug 7, 2023
5ee9693
Docs: Added benchmarks for `torch.compile()` for vision models (#24748)
merveenoyan Aug 7, 2023
080a971
Add mask2former fp16 support (#25093)
pedrohml Aug 7, 2023
a23ac36
[DOCS] Add descriptive docstring to MinNewTokensLength (#25196)
nablabits Aug 8, 2023
d4bd33c
Register ModelOutput subclasses as supported torch.utils._pytree node…
ringohoffman Aug 8, 2023
6ea3ee3
Fix `test_model_parallelism` (#25359)
ydshieh Aug 8, 2023
5ea2595
Add warning for missing attention mask when pad tokens are detected (…
hackyon Aug 8, 2023
dedd111
[ASR Pipeline] Clarify return timestamps (#25344)
sanchit-gandhi Aug 8, 2023
36d5b8b
MaskFormer, Mask2Former - replace einsum for tracing (#25297)
amyeroberts Aug 8, 2023
01ab39b
Load state in else (#25318)
muellerzr Aug 8, 2023
5744482
Fix `token` in example template (#25351)
ydshieh Aug 8, 2023
26ce4dd
Enable tests to run on third-party devcies (#25327)
statelesshz Aug 8, 2023
6247d1b
🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017)
keonju2 Aug 8, 2023
9e57e0c
Fix `torch_job` worker(s) crashing (#25374)
ydshieh Aug 8, 2023
5bd8c01
Generate: add config-level validation (#25381)
gante Aug 8, 2023
9c7b744
Fix missing usage of `token` (#25382)
ydshieh Aug 8, 2023
5b517e1
Use small config for `OneFormerModelTest.test_model_with_labels` (#25…
ydshieh Aug 8, 2023
e349010
Add copied from for image processor methods (#25121)
amyeroberts Aug 8, 2023
3a05e01
change version (#25387)
SunMarc Aug 8, 2023
41c5f45
[DOCS] Add example for `TopPLogitsWarper` (#25361)
chiral-carbon Aug 8, 2023
1367142
🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923)
nuatmochoi Aug 9, 2023
1564a81
16059 - Add missing type hints for ASTModel (#25364)
nablabits Aug 9, 2023
85447bb
rm useless condition since the previous condition contains it. (#25403)
jiqing-feng Aug 9, 2023
5993771
Fix path for dynamic module creation (#25402)
sgugger Aug 9, 2023
ea5dda2
YOLOS - Revert default return_pixel_mask value (#25404)
amyeroberts Aug 9, 2023
d59b872
Docs: introduction to generation with LLMs (#25240)
gante Aug 9, 2023
3deed1f
Generate: length validation (#25384)
gante Aug 9, 2023
00b93cd
Improve training args (#25401)
statelesshz Aug 9, 2023
f456b4d
Generate: generation config validation fixes in docs (#25405)
gante Aug 9, 2023
ef74da6
16059 - Add extra type hints for AltCLIPModel (#25399)
nablabits Aug 9, 2023
eb3ded1
Generate: lower severity of parameterization checks (#25407)
gante Aug 9, 2023
f2a43c7
VQA task guide (#25244)
MKhalusova Aug 9, 2023
133aac0
🌐 [i18n-KO] Translated `add_new_model.md` to Korean (#24957)
mjk0618 Aug 9, 2023
cf84738
🌐 [i18n-KO] Translated `model_summary.md` to Korean (#24625)
0525hhgus Aug 9, 2023
704bf59
Update Bark generation configs and tests (#25409)
ylacombe Aug 9, 2023
cb3c821
aligned sample_beam output selection with beam_search (#25375)
hukuda222 Aug 9, 2023
944ddce
Enable passing number of channels when inferring data format (#25412)
amyeroberts Aug 9, 2023
d0c1aeb
Bark: flexible generation config overload (#25414)
gante Aug 9, 2023
b175fc3
[DINOv2] Update pooler output (#25392)
NielsRogge Aug 10, 2023
b14d464
🌐 [i18n-KO] Translated `philosophy.md` to Korean (#25010)
TaeYupNoh Aug 10, 2023
16edf4d
Doc checks (#25408)
sgugger Aug 10, 2023
123ad53
Generation: strict generation config validation at save time (#25411)
gante Aug 10, 2023
d0839f1
[WavLM] Fix Arxiv link and authors (#25415)
sanchit-gandhi Aug 10, 2023
3e41cf1
Generate: Load generation config when `device_map` is passed (#25413)
gante Aug 10, 2023
e7b001d
Fix rendering for `torch.compile()` docs (#25432)
merveenoyan Aug 10, 2023
2d6839e
Add `examples` to tests to run when `setup.py` is modified (#25437)
ydshieh Aug 10, 2023
a7da299
Fix issue with ratio evaluation steps and auto find batch size (#25436)
muellerzr Aug 10, 2023
3470012
docs: add LLaMA-Efficient-Tuning to awesome-transformers (#25441)
statelesshz Aug 10, 2023
55db70c
GPTQ integration (#25062)
SunMarc Aug 10, 2023
454957c
Fix for #25437 (#25454)
ydshieh Aug 11, 2023
8db3720
not debugged code
zachares May 2, 2023
7fcb1ab
reference code so nothing is lost
zachares May 3, 2023
fe48c3f
novelty
zachares May 3, 2023
e47b6e8
added docstrings
zachares May 3, 2023
96d2814
fixed some relative import errors
zachares May 3, 2023
1090256
fixed small bugs
zachares May 3, 2023
5beaab7
added linear layers to bloom
zachares May 4, 2023
c7166f1
removed impossible embedding method
zachares May 4, 2023
d7052dd
Update src/transformers/models/bloom/desequence_graph_ids.py
zachares May 4, 2023
591cf9c
Update src/transformers/models/bloom/desequence_graph_ids.py
zachares May 4, 2023
9f156f0
feat(causal message passing) (#2)
zachares May 12, 2023
3c1ca82
Clearer code and simpler method for within LLM message passing (#3)
zachares May 12, 2023
13affe7
memory efficient message passing (#4)
zachares May 15, 2023
4eb3009
rebase from HF
vahanhov Aug 11, 2023
c5e232d
Merge branch 'main' into rebase-hf
zachares Aug 11, 2023
6f1bc64
dealing with multiple graphs
zachares Nov 16, 2023
8c47abc
Merge branch 'main' into multiple_graphs
zachares Nov 16, 2023
1c7d296
some testing / debugging
zachares Nov 17, 2023
a5c1f74
Update src/transformers/models/processing_graphs_within_model/causal_…
zachares Nov 17, 2023
8623c05
Update src/transformers/models/processing_graphs_within_model/causal_…
zachares Nov 17, 2023
6e48e39
Update src/transformers/models/processing_graphs_within_model/causal_…
zachares Nov 17, 2023
4 changes: 2 additions & 2 deletions src/transformers/models/bloom/modeling_bloom.py

@@ -682,7 +682,7 @@ def init_graph_information_passing(
         """ Initializes a set of message passing layers to perform message passing of between
         graph elements described in an input token id sequence
         """
-        assert element_type in ['nodes', 'edges'], 'unsupported message passing type'
+        assert element_type in ['node_correspondence', 'edge'], 'unsupported message passing type'
         self.message_passing_type = element_type
         self.graph_token_ids = graph_token_ids
         self.num_gnn_layers = (
@@ -781,7 +781,7 @@ def forward(
             extract_edge_sequence(t_ids.tolist(), self.graph_token_ids) for t_ids in input_ids
         ]
         if self.message_passing_type == 'nodes':
-            get_matrices = GatedCausalMessagePassingLayer.build_node_information_passing
+            get_matrices = GatedCausalMessagePassingLayer.build_node_correspondence_information_passing
         else:
             get_matrices = GatedCausalMessagePassingLayer.build_edge_information_passing
         message_passing_dicts = get_matrices(edge_sequences, self.device)
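The two hunks above couple a string-valued config (`element_type`, stored as `self.message_passing_type`) to a choice of builder classmethod. A minimal, hypothetical sketch of the same dispatch (only the two accepted values come from the PR; every other name here is illustrative), written as a lookup table so the validated values and the branch keys cannot drift apart:

```python
from typing import Callable, Dict, List

def build_node_correspondence_information_passing(edge_sequences: List) -> str:
    # stand-in for the real classmethod
    return "node_correspondence matrices"

def build_edge_information_passing(edge_sequences: List) -> str:
    # stand-in for the real classmethod
    return "edge matrices"

def get_matrix_builder(element_type: str) -> Callable[[List], str]:
    # one table serves both validation and dispatch, so renaming a mode
    # updates the assert and the branch in a single place
    builders: Dict[str, Callable[[List], str]] = {
        'node_correspondence': build_node_correspondence_information_passing,
        'edge': build_edge_information_passing,
    }
    assert element_type in builders, 'unsupported message passing type'
    return builders[element_type]
```

With a dict dispatch, a rename such as `'nodes'` to `'node_correspondence'` cannot leave a stale string comparison behind in `forward`.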

@@ -52,16 +52,15 @@ def forward(
             element_embeddings = t_embeddings[message_passing_dict['tokens2elements']]
             if message_passing_dict['edge_index'].numel() > 0:
                 element_embeddings = self.gnn_layer(
-                    element_embeddings,
-                    message_passing_dict['edge_index']
+                    element_embeddings, message_passing_dict['edge_index']
                 )
             new_t_embeddings[message_passing_dict['elements2tokens']] = element_embeddings
             new_t_embeddings = t_embeddings + torch.tanh(self.gating_message_passing) * new_t_embeddings
             new_token_embeddings.append(new_t_embeddings.unsqueeze(0))
         return torch.cat(new_token_embeddings, dim=0)

     @classmethod
-    def build_node_information_passing(
+    def build_node_correspondence_information_passing(
         cls,
         edge_sequences: List[List[Tuple[SequenceElement, Optional[SequenceElement], Optional[SequenceElement]]]],
         device: torch.device
@@ -77,7 +77,7 @@ def build_node_information_passing(
                 message_passing_dicts.append(cls.to_torch(dict(message_passing_dict), device))
                 continue
             add_node = partial(
-                cls.add_node,
+                cls.add_node_correspondence,
                 end_idx=cls.get_sequence_end(edge_sequence),
                 last_occurence_idx=defaultdict(lambda: -1),
                 message_passing_dict=message_passing_dict
@@ -149,7 +148,7 @@ def get_sequence_end(
         return end_idx

     @classmethod
-    def add_node(
+    def add_node_correspondence(
         cls,
         current_occurence: SequenceElement,
         end_idx: int,
@@ -172,12 +171,13 @@ def add_node(
             message_passing_dict=message_passing_dict
         )
         curr_length = len(message_passing_dict[f"tokens2elements"])
-        if last_occurence_idx[current_occurence.ids] != -1 and curr_length > prev_length:
+        full_ids = current_occurence.graph_id + ("--node_ids--",) + current_occurence.ids
+        if last_occurence_idx[full_ids] != -1 and curr_length > prev_length:
             current_idx = len(message_passing_dict["tokens2elements"]) - 1
             message_passing_dict['edge_index'].append(
-                [last_occurence_idx[current_occurence.ids], current_idx]
+                [last_occurence_idx[full_ids], current_idx]
             )
-            last_occurence_idx[current_occurence.ids] = current_idx
+            last_occurence_idx[full_ids] = current_idx
@@ -189,17 +189,19 @@ def add_edge(
     ):
         """ Adds an edge as element to pass information between in a serialized graph """
         pred_node, _, succ_node = sequenced_edge
-        prev_length = len(message_passing_dict[f"tokens2elements"])
+        prev_length = len(message_passing_dict["tokens2elements"])
         cls.add_element_for_information_passing(
             start_idx=succ_node.end_idx,
             end_idx=end_idx,
             message_passing_dict=message_passing_dict
         )
-        curr_length = len(message_passing_dict[f"tokens2elements"])
+        curr_length = len(message_passing_dict["tokens2elements"])
         if curr_length > prev_length:
             current_idx = len(message_passing_dict["tokens2elements"]) - 1
-            node2edge_idxs[pred_node.ids].append(current_idx)
-            node2edge_idxs[succ_node.ids].append(current_idx)
+            pred_ids = pred_node.graph_id + ("--node_ids--",) + pred_node.ids
+            node2edge_idxs[pred_ids].append(current_idx)
+            succ_ids = succ_node.graph_id + succ_node.ids
+            node2edge_idxs[succ_ids].append(current_idx)

     @staticmethod
     def add_element_for_information_passing(
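The `full_ids` keys introduced in `add_node_correspondence` and `add_edge` above prefix a node's token ids with its `graph_id`, so two graphs in the same sequence that happen to reuse identical node ids no longer share occurrence chains. A runnable, simplified sketch of that occurrence-chaining idea (the function name and input shape here are illustrative, not the PR's API; only the key construction mirrors the diff):

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def chain_occurrences(
    mentions: List[Tuple[Tuple[int, ...], Tuple[int, ...]]]
) -> List[List[int]]:
    """mentions: (graph_id, node_ids) pairs in token order.

    Returns edge_index pairs linking each mention to the previous mention
    of the same node *within the same graph*.
    """
    edge_index: List[List[int]] = []
    last_occurrence: Dict[Tuple, int] = defaultdict(lambda: -1)
    for idx, (graph_id, node_ids) in enumerate(mentions):
        # qualify the node ids with the graph id, as in the diff above,
        # so equal node ids from different graphs get distinct keys
        full_ids = graph_id + ("--node_ids--",) + node_ids
        if last_occurrence[full_ids] != -1:
            edge_index.append([last_occurrence[full_ids], idx])
        last_occurrence[full_ids] = idx
    return edge_index
```

With the unqualified keys of the old code, a mention of node `(5,)` in graph 1 would wrongly chain onto a mention of node `(5,)` in graph 0; the qualified keys keep the two chains separate.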

@@ -14,6 +14,7 @@ class SequenceElement:
     end_idx: int
     ids: Tuple[int]
     length: int
+    graph_id: Tuple[int]


 def extract_edge_sequence(
@@ -50,31 +51,65 @@ def _extract_graph_elements(
     if none is found, returns an empty list
     """
     sequence = []
-    prev_token_id, prev_idx, final_idx = None, -1, len(token_ids)
+    sog_idx, graph_id = None, None
+    prev_token_id, prev_idx, final_idx = None, -1, None
     for token_idx, token_id in enumerate(token_ids):
-        if token_id == graph_tokens['pred_node'] and prev_token_id is None:
+        if (
+            token_id == graph_tokens['sog']
+            and prev_token_id is None
+            and sog_idx is None
+        ):
+            sog_idx = token_idx
+        elif (
+            token_id == graph_tokens['pred_node']
+            and prev_token_id is None
+            and sog_idx is not None
+        ):
+            graph_id = tuple(token_ids[sog_idx:token_idx])[1:]
             prev_token_id, prev_idx = token_id, token_idx
+        elif (
+            token_id == graph_tokens['eog']
+            and prev_token_id is not None
+            and graph_id is not None
+        ):
+            sequence.append(SequenceElement(
+                token=prev_token_id,
+                start_idx=prev_idx,
+                end_idx=token_idx,
+                ids=tuple(token_ids[prev_idx:token_idx])[1:],
+                length=token_idx - prev_idx,
+                graph_id=graph_id
+            ))
+            sog_idx, graph_id = None, None
+            prev_token_id, prev_idx, final_idx = None, -1, len(token_ids)
         elif (
             token_id in [graph_tokens['pred_node'], graph_tokens['edge'], graph_tokens['succ_node']]
             and prev_token_id is not None
+            and graph_id is not None
         ):
             sequence.append(SequenceElement(
                 token=prev_token_id,
                 start_idx=prev_idx,
                 end_idx=token_idx,
                 ids=tuple(token_ids[prev_idx:token_idx])[1:],
-                length=token_idx - prev_idx
+                length=token_idx - prev_idx,
+                graph_id=graph_id
             ))
             prev_token_id, prev_idx = token_id, token_idx
-        elif token_id in [graph_tokens['eos'], graph_tokens['pad']] and prev_token_id is not None:
+        elif (
+            token_id in [graph_tokens['eos'], graph_tokens['pad']]
+            and prev_token_id is not None
+            and graph_id is not None
+        ):
             final_idx = token_idx
             break
-    if prev_token_id is not None:
+    if final_idx is not None:
         sequence.append(SequenceElement(
             token=prev_token_id,
             start_idx=prev_idx,
             end_idx=final_idx,
             ids=tuple(token_ids[prev_idx:final_idx])[1:],
-            length=final_idx - prev_idx
+            length=final_idx - prev_idx,
+            graph_id=graph_id
         ))
     return sequence
4 changes: 2 additions & 2 deletions src/transformers/models/t5/modeling_t5.py

@@ -986,7 +986,7 @@ def init_graph_information_passing(
         """ Initializes a set of message passing layers to perform message passing of between
         graph elements described in an input token id sequence
         """
-        assert element_type in ['nodes', 'edges'], 'unsupported message passing type'
+        assert element_type in ['node_correspondence', 'edge'], 'unsupported message passing type'
         self.message_passing_type = element_type
         self.graph_token_ids = graph_token_ids
         self.num_gnn_layers = (
@@ -1105,7 +1105,7 @@ def forward(
             extract_edge_sequence(t_ids.tolist(), self.graph_token_ids) for t_ids in input_ids
         ]
         if self.message_passing_type == 'nodes':
-            get_matrices = GatedCausalMessagePassingLayer.build_node_information_passing
+            get_matrices = GatedCausalMessagePassingLayer.build_node_correspondence_information_passing
         else:
             get_matrices = GatedCausalMessagePassingLayer.build_edge_information_passing
         message_passing_dicts = get_matrices(edge_sequences, self.device)