[Add Mamba] Adds support for the Mamba models #28094

Merged
merged 123 commits into from
Mar 5, 2024
Changes from 1 commit
81c642f
initial-commit
ArthurZucker Dec 16, 2023
c50602b
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Jan 31, 2024
00d3a6c
start cleaning
ArthurZucker Jan 31, 2024
921bb24
small nits
ArthurZucker Feb 1, 2024
b3f216d
small nits
ArthurZucker Feb 3, 2024
7235b57
current updates
ArthurZucker Feb 3, 2024
7a407a7
add kernels
ArthurZucker Feb 5, 2024
9f2a982
small refactoring little step
ArthurZucker Feb 5, 2024
04c991a
add comments
ArthurZucker Feb 5, 2024
aa7e8d2
styling
ArthurZucker Feb 5, 2024
26748c4
nit
ArthurZucker Feb 5, 2024
75e376a
nits
ArthurZucker Feb 14, 2024
1c104b5
Style
ArthurZucker Feb 14, 2024
0e90dae
Merge
ArthurZucker Feb 14, 2024
a804466
Small changes
ArthurZucker Feb 14, 2024
6b87ad2
Push dummy mambda simple slow
ArthurZucker Feb 14, 2024
a7ec8d6
nit
ArthurZucker Feb 14, 2024
5046451
Use original names
ArthurZucker Feb 14, 2024
b5831e3
Use original names and remove norm
ArthurZucker Feb 15, 2024
e9a80ad
Updates for inference params
ArthurZucker Feb 15, 2024
ee4a7ef
Style nd updates
ArthurZucker Feb 15, 2024
d8c195f
nits
ArthurZucker Feb 15, 2024
e64fedc
Match logits
ArthurZucker Feb 16, 2024
aee558f
Add a test
ArthurZucker Feb 16, 2024
eae5f45
Add expected generated text
ArthurZucker Feb 16, 2024
1f8e8d0
nits doc, imports and styling
ArthurZucker Feb 16, 2024
3cc06e5
style
ArthurZucker Feb 16, 2024
5a5324c
oups
ArthurZucker Feb 16, 2024
325b66b
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 16, 2024
81303f4
dont install kernels, invite users to install the required kernels
ArthurZucker Feb 19, 2024
1a10310
let use use the original packages
ArthurZucker Feb 19, 2024
89fb490
styling
ArthurZucker Feb 19, 2024
6cfe216
nits
ArthurZucker Feb 19, 2024
1ecbd22
fix some copieds
ArthurZucker Feb 19, 2024
b937122
update doc
ArthurZucker Feb 19, 2024
9752dd0
fix-copies
ArthurZucker Feb 19, 2024
a7881a3
styling done
ArthurZucker Feb 19, 2024
f445b0d
nits
ArthurZucker Feb 19, 2024
64ec8dd
fix import check
ArthurZucker Feb 19, 2024
e6e3ba8
run but wrong cuda ress
ArthurZucker Feb 19, 2024
ed4eb4c
mamba CUDA works :)
ArthurZucker Feb 19, 2024
4c8fc48
fix the fast path
ArthurZucker Feb 19, 2024
69e103f
config naming nits
ArthurZucker Feb 19, 2024
ba21ff2
conversion script is not required at this stage
ArthurZucker Feb 19, 2024
fe53728
finish fixing the fast path: generation make sense now!
ArthurZucker Feb 19, 2024
9411169
nit
ArthurZucker Feb 19, 2024
c2c7709
Let's start working on the CIs
ArthurZucker Feb 19, 2024
1e73ca9
style
ArthurZucker Feb 19, 2024
834f46f
git push Merge branch 'main' of github.com:huggingface/transformers i…
ArthurZucker Feb 19, 2024
a1a94f3
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 20, 2024
2213222
better style
ArthurZucker Feb 20, 2024
2a02006
more nits
ArthurZucker Feb 20, 2024
8b0412f
test nit
ArthurZucker Feb 20, 2024
fbd6a2c
quick fix for now
ArthurZucker Feb 20, 2024
823f11a
nits
ArthurZucker Feb 20, 2024
88896a9
nit
ArthurZucker Feb 20, 2024
7f72ee8
nit
ArthurZucker Feb 21, 2024
0555247
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 29, 2024
0072a6c
nit
ArthurZucker Feb 29, 2024
7f6c56f
nits
ArthurZucker Feb 29, 2024
f67c353
update test rest
ArthurZucker Feb 29, 2024
2ab5a86
fixup
ArthurZucker Feb 29, 2024
8920be3
update test
ArthurZucker Feb 29, 2024
87d0664
nit
ArthurZucker Feb 29, 2024
8b00d76
some fixes
ArthurZucker Feb 29, 2024
ca9835c
nits
ArthurZucker Feb 29, 2024
796ef3e
update test values
ArthurZucker Feb 29, 2024
170664a
fix styling
ArthurZucker Feb 29, 2024
92493a0
nit
ArthurZucker Feb 29, 2024
854ebad
support peft
ArthurZucker Feb 29, 2024
3bbd1b1
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 29, 2024
aa0e6bb
integrations tests require torchg
ArthurZucker Feb 29, 2024
3c1537e
also add slow markers
ArthurZucker Feb 29, 2024
d06421a
styling
ArthurZucker Feb 29, 2024
5fb8062
chose forward wisely
ArthurZucker Feb 29, 2024
edb4e91
nits
ArthurZucker Feb 29, 2024
eb1fb64
update tests
ArthurZucker Feb 29, 2024
de4fe46
fix gradient checkpointing
ArthurZucker Feb 29, 2024
54ffaa3
fixup
ArthurZucker Feb 29, 2024
977d34f
nit
ArthurZucker Feb 29, 2024
0928453
fix doc
ArthurZucker Feb 29, 2024
2c90536
check copies
ArthurZucker Feb 29, 2024
4ba9c79
fix the docstring
ArthurZucker Feb 29, 2024
3651dba
fix some more tests
ArthurZucker Feb 29, 2024
426e6f3
style
ArthurZucker Feb 29, 2024
951b1aa
fix beam search
ArthurZucker Mar 1, 2024
4101369
add init schene
ArthurZucker Mar 1, 2024
65db96b
update
ArthurZucker Mar 1, 2024
0f3dfc7
nit
ArthurZucker Mar 1, 2024
f8bd0aa
fix
ArthurZucker Mar 1, 2024
b2bd0c7
fixup the doc
ArthurZucker Mar 1, 2024
cf58529
fix the doc
ArthurZucker Mar 1, 2024
e9c3447
fixup
ArthurZucker Mar 1, 2024
1282a75
tentative update but slow is no longer good
ArthurZucker Mar 1, 2024
fa561b2
nit
ArthurZucker Mar 1, 2024
91b8106
should we always use float32?
ArthurZucker Mar 1, 2024
e8142ca
nits
ArthurZucker Mar 1, 2024
623b636
revert wrong changes
ArthurZucker Mar 1, 2024
566c799
res in float32
ArthurZucker Mar 1, 2024
5d637d9
cleanup
ArthurZucker Mar 2, 2024
648a292
skip fmt for now
ArthurZucker Mar 2, 2024
e306e89
update generation values
ArthurZucker Mar 2, 2024
057d7a3
update test values running original model
ArthurZucker Mar 2, 2024
72f8936
fixup
ArthurZucker Mar 2, 2024
f415081
update tests + rename inference_params to cache_params + make sure tr…
ArthurZucker Mar 4, 2024
6bb659a
small nits
ArthurZucker Mar 4, 2024
178fe76
more nits
ArthurZucker Mar 4, 2024
3a46724
fix final CIs
ArthurZucker Mar 4, 2024
13204e0
style
ArthurZucker Mar 4, 2024
1608a90
nit doc
ArthurZucker Mar 4, 2024
99119ba
I hope final doc nits
ArthurZucker Mar 4, 2024
d6fb1ef
nit
ArthurZucker Mar 4, 2024
844530f
🫠
ArthurZucker Mar 4, 2024
52be018
final touch!
ArthurZucker Mar 4, 2024
d03de1c
fix torch import
ArthurZucker Mar 4, 2024
c0672a8
Apply suggestions from code review
ArthurZucker Mar 5, 2024
dfc1212
Apply suggestions from code review
ArthurZucker Mar 5, 2024
acd4ccf
fix fix and fix
ArthurZucker Mar 5, 2024
2ddd9aa
fix base model prefix!
ArthurZucker Mar 5, 2024
0c5d7ed
nit
ArthurZucker Mar 5, 2024
28e5ef0
Update src/transformers/models/mamba/__init__.py
ArthurZucker Mar 5, 2024
f963e38
Update docs/source/en/model_doc/mamba.md
ArthurZucker Mar 5, 2024
095dabd
nit
ArthurZucker Mar 5, 2024
res in float32
ArthurZucker committed Mar 1, 2024
commit 566c799c28bf66eb01e5df5cb6258d2303e43fdd
2 changes: 1 addition & 1 deletion src/transformers/models/mamba/modeling_mamba.py
@@ -294,7 +294,7 @@ def __init__(self, config, layer_idx):
     def forward(self, hidden_states, inference_params=None):
         residual = hidden_states
         hidden_states = self.norm(hidden_states.to(dtype=self.norm.weight.dtype))
-        if self.residual_in_fp32 or True:
+        if self.residual_in_fp32:
             residual = residual.to(torch.float32)

         hidden_states = self.mixer(hidden_states, inference_params=inference_params)
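Note on the change above: in Python, `x or True` is always truthy, so the removed condition cast the residual to float32 whatever `residual_in_fp32` was set to; with the new condition the flag decides. The sketch below is not code from this PR. It is a stdlib-only illustration of why the residual dtype can matter, assuming a toy residual stream of scalar values. It uses `struct` to round to IEEE half precision (`"e"`) or single precision (`"f"`), and the names `round_to` and `run_blocks` are hypothetical.

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round a Python float to the nearest value representable in `fmt`.

    "e" is IEEE 754 half precision, "f" is single precision.
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def run_blocks(residual_in_fp32: bool, n_blocks: int = 10) -> float:
    """Toy residual stream (hypothetical): each block adds a small update.

    The running residual is stored in float32 when the flag is set,
    otherwise in float16, mimicking the branch touched by the diff.
    """
    fmt = "f" if residual_in_fp32 else "e"
    residual = 2048.0
    for _ in range(n_blocks):
        # Each "block" contributes 0.5; the sum is rounded to the storage dtype.
        residual = round_to(fmt, residual + 0.5)
    return residual


if __name__ == "__main__":
    print(run_blocks(residual_in_fp32=True))
    print(run_blocks(residual_in_fp32=False))
    # With `flag or True`, the float32 branch would be taken for any flag value.
    print(bool(False or True))
```

In this toy setup the half-precision residual never moves, because neighbouring float16 values near 2048 are 2 apart and each 0.5 update is rounded away, while the float32 residual accumulates every update.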