Add Idefics2 #30253

Merged Apr 15, 2024 (131 commits)
Showing changes from 1 commit

Commits
e536f6a
Merge pull request #9 from huggingface/update
molbap Mar 4, 2024
ef8c0fb
Merge branch 'main' of github.com:huggingface/new-model-addition
ArthurZucker Mar 30, 2024
79277fe
Initial add model additions
amyeroberts Feb 26, 2024
6c89a99
Test
amyeroberts Feb 26, 2024
2e1155b
All weights loading
amyeroberts Feb 27, 2024
661b794
Can perform full forward pass
amyeroberts Feb 27, 2024
dd0a3d2
Local and remote the same
amyeroberts Feb 27, 2024
d124863
Matching local and remote
amyeroberts Mar 1, 2024
7aff0b7
Fixup
amyeroberts Mar 1, 2024
799dc71
Idefics2Model importable; fixup docstrings
amyeroberts Mar 1, 2024
dbebbb1
Don't skip by default
amyeroberts Mar 1, 2024
465e3ed
Remove deprecated use_resampler arg
amyeroberts Mar 1, 2024
ae5b94d
Remove self.config
amyeroberts Mar 1, 2024
7983e93
DecoupledLinear takes config
amyeroberts Mar 1, 2024
0a00064
Tidy up
amyeroberts Mar 1, 2024
6e4ff1b
Enable eager attention and tidy up
amyeroberts Mar 1, 2024
1aa8f7a
Most tests passing
amyeroberts Mar 1, 2024
ea4bf34
Update for batch of processed images
amyeroberts Mar 4, 2024
b6a92da
Add image processor
amyeroberts Mar 4, 2024
0d09b95
Update doc pages
amyeroberts Mar 4, 2024
3c11158
Update conversion script
amyeroberts Mar 4, 2024
c6d4559
Remove erroneous breakpoint
amyeroberts Mar 5, 2024
c6275e9
Remove accidendtal spelling change
amyeroberts Mar 5, 2024
5dd0071
Update to reflect changes on hub - make generate work
amyeroberts Mar 5, 2024
015356b
Fix up
amyeroberts Mar 5, 2024
8c50169
Image processor tests
amyeroberts Mar 5, 2024
da389b8
Update tests
amyeroberts Mar 5, 2024
e8b131d
Add a processor
amyeroberts Mar 5, 2024
2fc3ff3
Add a processor
amyeroberts Mar 6, 2024
e06740c
Update convert script
amyeroberts Mar 6, 2024
083e82b
Update modeling file - remove fixmes
amyeroberts Mar 6, 2024
256fa30
Bug fix
amyeroberts Mar 7, 2024
0fd5400
Add processing test
amyeroberts Mar 7, 2024
f537f27
Use processor
amyeroberts Mar 7, 2024
d14485a
Fix up
amyeroberts Mar 7, 2024
02371e9
Update src/transformers/models/idefics2/modeling_idefics2.py
amyeroberts Mar 11, 2024
7fba70a
Update src/transformers/models/idefics2/modeling_idefics2.py
amyeroberts Mar 11, 2024
0987d15
Fix test
amyeroberts Mar 12, 2024
78ba577
Update config - PR comments and defaults align with checkpoint
amyeroberts Mar 12, 2024
971dd72
Reviewer comments
amyeroberts Mar 12, 2024
d7dfec9
Add copied froms for flahs attention
amyeroberts Mar 12, 2024
097f402
Update src/transformers/models/idefics2/modeling_idefics2.py
amyeroberts Mar 18, 2024
1370836
Apply suggestions from code review
amyeroberts Mar 21, 2024
9dff742
Remove qk_layer_norm and freeze_layers functionality
amyeroberts Mar 21, 2024
0e1be29
Fix
amyeroberts Mar 21, 2024
c334307
Remove freeze_layer options from config
amyeroberts Mar 21, 2024
e5b5bc4
Sync with upstream main
amyeroberts Mar 21, 2024
ec867d8
Fix attention shapes siglip
amyeroberts Mar 22, 2024
0019bf1
Remove Llava-next refs - TO REBASE
amyeroberts Mar 24, 2024
b0e4081
Use AutoModel for text model
amyeroberts Mar 24, 2024
863b2ee
Add comment to explain vision embeddings
amyeroberts Mar 24, 2024
68990f8
Fix issue with tie_word_embeddings
amyeroberts Mar 25, 2024
e1456a0
Address review comments
amyeroberts Mar 25, 2024
f4b45d3
Fix and fix up
amyeroberts Mar 25, 2024
ffb2de3
Chat templates for idefics
amyeroberts Mar 27, 2024
700119d
Fix copies
amyeroberts Mar 27, 2024
cefdd1d
Fix
amyeroberts Mar 27, 2024
4823ecd
Add layer norms to FA2
amyeroberts Mar 27, 2024
2de1098
Fix tests
amyeroberts Mar 27, 2024
5205bba
Apply suggestions from code review
amyeroberts Apr 2, 2024
7edaff5
Fix
amyeroberts Apr 2, 2024
a7a0a2c
Review comments
amyeroberts Apr 2, 2024
16f7666
Update src/transformers/models/idefics2/modeling_idefics2.py
amyeroberts Apr 2, 2024
e3a22e4
Update inputs merger
amyeroberts Apr 2, 2024
1c397b1
Merge weights in correct order
amyeroberts Apr 2, 2024
182ea5f
Update convert script
amyeroberts Apr 3, 2024
0ba4cc4
Update src/transformers/models/idefics2/processing_idefics2.py
amyeroberts Apr 3, 2024
65bf223
Update template
amyeroberts Apr 3, 2024
84ea6e8
Model code examples (fix idefics too)
amyeroberts Apr 3, 2024
ee548af
More review comments
amyeroberts Apr 3, 2024
649563b
Tidy up
amyeroberts Apr 3, 2024
4c4f315
Update processing
amyeroberts Apr 3, 2024
f95e76b
Fix attention mask preparation
amyeroberts Apr 3, 2024
eae3f08
Update inputs_merger inputs
amyeroberts Apr 3, 2024
3043e40
Vectorize inputs_merger
amyeroberts Apr 3, 2024
914fa74
Update src/transformers/models/idefics2/__init__.py
amyeroberts Apr 8, 2024
877109a
Update src/transformers/models/idefics2/modeling_idefics2.py
amyeroberts Apr 8, 2024
9cde5c2
Review comments
amyeroberts Apr 8, 2024
3307e6b
saying bye to the `qk_layer_norms`
VictorSanh Apr 7, 2024
ecaac39
Simplify
amyeroberts Apr 8, 2024
366d21d
Update latents
amyeroberts Apr 8, 2024
5312a80
Remove erroneuous readme changes
amyeroberts Apr 8, 2024
9d1078b
Return images when applying chat template
amyeroberts Apr 8, 2024
b1e2f42
Fix bug - prompt images are for a single sample
amyeroberts Apr 9, 2024
09796a3
Update src/transformers/models/idefics2/modeling_idefics2.py
VictorSanh Apr 10, 2024
eaff6e6
image splitting
VictorSanh Apr 8, 2024
0034f84
fix test
VictorSanh Apr 8, 2024
e2845b1
some more comment
VictorSanh Apr 8, 2024
3ae2a1b
some comment
VictorSanh Apr 8, 2024
833a802
Apply suggestions from code review
VictorSanh Apr 9, 2024
502c3dc
Update src/transformers/models/idefics2/image_processing_idefics2.py
VictorSanh Apr 11, 2024
e8ca7b3
Update processor
amyeroberts Apr 10, 2024
4bde406
Update model tests
amyeroberts Apr 10, 2024
fea200e
Update src/transformers/models/idefics2/processing_idefics2.py
amyeroberts Apr 10, 2024
33e51a6
Update src/transformers/models/idefics2/processing_idefics2.py
amyeroberts Apr 10, 2024
fcad4e4
Don't add BOS in template
amyeroberts Apr 10, 2024
1dc90f0
Update src/transformers/models/idefics2/processing_idefics2.py
amyeroberts Apr 10, 2024
0e16d4a
Remove index in examples
amyeroberts Apr 11, 2024
107693a
Update tests to reflect #13
amyeroberts Apr 11, 2024
cd4f76a
Update src/transformers/models/idefics2/processing_idefics2.py
amyeroberts Apr 11, 2024
31945ee
PR comment - consistent typing
amyeroberts Apr 12, 2024
4ab5e1d
Update readme and model doc
amyeroberts Apr 12, 2024
d8c5045
Update docs
amyeroberts Apr 12, 2024
e8b9751
Update checkpoint references
amyeroberts Apr 12, 2024
b5a7622
Update examples
amyeroberts Apr 12, 2024
7ee8681
Fix and update tests
amyeroberts Apr 12, 2024
31c6634
Small addition
amyeroberts Apr 12, 2024
75f59ef
Update tests - remove copied from as no ignore placement copy could b…
amyeroberts Apr 12, 2024
b5ad135
Update example
amyeroberts Apr 12, 2024
419fba2
small fixes
VictorSanh Apr 13, 2024
ea3838e
Update docs/source/en/model_doc/idefics2.md
amyeroberts Apr 14, 2024
301e1c5
Update docs/source/en/model_doc/idefics2.md
amyeroberts Apr 14, 2024
7b1c4dc
Update README.md
amyeroberts Apr 14, 2024
5be2feb
Connector model as bridge
amyeroberts Apr 12, 2024
34eb76b
Fix up
amyeroberts Apr 14, 2024
7c73ede
Fix up
amyeroberts Apr 14, 2024
455dccf
Don't pass model inputs for generation kwargs update
amyeroberts Apr 15, 2024
3bbd272
IDEFICS-2 -> Idefics2
VictorSanh Apr 15, 2024
b122cb2
Merge pull request #18 from huggingface/vs/name-change
VictorSanh Apr 15, 2024
5414a02
Remove config archive name
amyeroberts Apr 15, 2024
3d84654
IDEFICS-2 -> Idefics2
amyeroberts Apr 15, 2024
8739092
Add back llava-next
amyeroberts Apr 15, 2024
779a8f8
Update readmes
amyeroberts Apr 15, 2024
f8c5301
Add requirements for processor tester
amyeroberts Apr 15, 2024
661b93b
Use custom convert_to_rgb to avoid possible BC
amyeroberts Apr 15, 2024
0efb5e8
Fix doc example
amyeroberts Apr 15, 2024
fed24d1
Fix doc example
amyeroberts Apr 15, 2024
541ce14
Skip model doc tests - as model to large
amyeroberts Apr 15, 2024
2a563b2
More doc example - account for image splitting
amyeroberts Apr 15, 2024
26c8a55
Update src/transformers/image_transforms.py
amyeroberts Apr 15, 2024
16c8317
Fix config doctest
amyeroberts Apr 15, 2024
DecoupledLinear takes config
amyeroberts committed Apr 14, 2024
commit 7983e93679bc7519fbb7956de7a90400abfca01a
45 changes: 12 additions & 33 deletions src/transformers/models/idefics2/modeling_idefics2.py
@@ -2737,48 +2737,33 @@ def forward(
         )


-class Idefics2DecoupledLinear(nn.Linear):
     # Derived from https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html#Linear
+class Idefics2DecoupledLinear(nn.Module):
     """
     Implements a decoupling of parameters to allow freezing (or not) a subset of the parameters.
     In practise, the regular `weight` can be trained or frozen (i.e. `partially_freeze=True`), and if `out_additional_features` > 0, then it will create `out_additional_features * in_features` additional parameters that are always trained.
     If `out_additional_features=0`, then the module defaults back to the regular behavior of `nn.Linear`.
     """

-    def __init__(
-        self,
-        in_features: int,
-        out_features: int,
-        out_additional_features: int = 0,
-        bias: bool = True,
-        partially_freeze: bool = True,
-        device=None,
-        dtype=None,
-    ) -> None:
+    def __init__(self, config) -> None:
         """
         out_additional_features: int. Number of additional trainable dimensions. Only makes sense when `partially_freeze=True`.
         partially_freeze: bool. If True, the regular `weight` will be frozen and extra parameters (if any) will be trainable. If False, default to the regular behavior of nn.Linear.
         """
-        super().__init__(in_features, out_features, bias, device, dtype)
-        self.out_additional_features = out_additional_features
-        self.partially_freeze = partially_freeze
+        super().__init__()
+        self.in_features = config.hidden_size
+        self.out_features = config.vocab_size
+        self.out_additional_features = config.additional_vocab_size
+        self.partially_freeze = config.freeze_lm_head

-        self.in_features = in_features
-        self.out_features = out_features
+        self.linear = nn.Linear(in_features=self.in_features, out_features=self.out_features, bias=False)

         if partially_freeze:
-            self.weight.requires_grad_(False)
+            self.linear.weight.requires_grad_(False)
             if bias:
-                self.bias.requires_grad_(False)
+                self.linear.bias.requires_grad_(False)

         if out_additional_features > 0:
-            self.additional_fc = nn.Linear(
-                in_features=in_features,
-                out_features=out_additional_features,
-                bias=bias,
-                device=device,
-                dtype=dtype,
-            )
+            self.additional_fc = nn.Linear(in_features=self.in_features, out_features=self.out_additional_features, bias=bias)

     def forward(self, input: torch.Tensor) -> torch.Tensor:
         output = F.linear(input, self.weight, self.bias)
@@ -2797,13 +2782,7 @@ def __init__(self, config):
         super().__init__(config)
         self.model = Idefics2Model(config)
         self.image_token_id = self.config.image_token_id
-        self.lm_head = Idefics2DecoupledLinear(
-            in_features=config.hidden_size,
-            out_features=config.vocab_size,
-            out_additional_features=config.additional_vocab_size,
-            bias=False,
-            partially_freeze=config.freeze_lm_head,
-        )
+        self.lm_head = Idefics2DecoupledLinear(config)

         # Initialize weights and apply final processing
         self.post_init()
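
For readers following the refactor, here is a minimal, self-contained sketch of what a config-driven decoupled linear layer could look like once the change is carried through. It is not the code from this commit: as captured, the diff still refers to `bias`, `partially_freeze`, and `self.weight` after those names were removed from the signature and base class, so this looks like an intermediate state. The class name `DecoupledLinearSketch`, the `SimpleNamespace` stand-in for the model config, the bias-free layers, and the concatenation in `forward` (the rest of `forward` is not visible in the diff) are all assumptions made for illustration.

```python
from types import SimpleNamespace

import torch
from torch import nn


class DecoupledLinearSketch(nn.Module):
    """Hypothetical config-driven decoupled linear layer (illustration only)."""

    def __init__(self, config) -> None:
        super().__init__()
        self.in_features = config.hidden_size
        self.out_features = config.vocab_size
        self.out_additional_features = config.additional_vocab_size
        self.partially_freeze = config.freeze_lm_head

        # Base projection. Assumed bias-free, mirroring bias=False at the old call site.
        self.linear = nn.Linear(self.in_features, self.out_features, bias=False)
        if self.partially_freeze:
            # Freeze the regular weight; only the extra rows below stay trainable.
            self.linear.weight.requires_grad_(False)

        # Extra output features that are always trained.
        if self.out_additional_features > 0:
            self.additional_fc = nn.Linear(
                self.in_features, self.out_additional_features, bias=False
            )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        output = self.linear(hidden_states)
        if self.out_additional_features > 0:
            # Assumption: extra logits are appended along the last (vocabulary) dimension.
            extra = self.additional_fc(hidden_states)
            output = torch.cat((output, extra), dim=-1)
        return output


# Stand-in for the model config; the values are arbitrary small sizes.
config = SimpleNamespace(
    hidden_size=8, vocab_size=16, additional_vocab_size=4, freeze_lm_head=True
)
lm_head = DecoupledLinearSketch(config)
logits = lm_head(torch.randn(2, 3, config.hidden_size))
trainable = [name for name, p in lm_head.named_parameters() if p.requires_grad]

print(logits.shape)  # torch.Size([2, 3, 20])
print(trainable)     # ['additional_fc.weight']
```

Reading the four settings from `config` inside the module keeps the call site down to `Idefics2DecoupledLinear(config)`, at the cost of tying the layer to the config's attribute names.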