Bump transformers and torch #117
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
# Create a list of CustomKVCache instances, one per layer
self.kv_cache = torch.nn.ModuleList()
for _ in range(config.num_hidden_layers):
what happened here? like config doesn't exist anymore?
It still exists; it just feels more idiomatic to iterate over the actual layers.
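For illustration, a minimal sketch of the layer-based iteration being discussed. CustomKVCache is the cache class from this diff, while the helper name and the model.model.layers attribute path are assumptions (Llama-style models), not the PR's actual code:

```python
from torch import nn

# Sketch only: build one cache entry per decoder layer by iterating over the
# layers themselves instead of range(config.num_hidden_layers). The cache
# class is passed in so the snippet stays self-contained.
def build_kv_caches(model, custom_kv_cache_cls, max_batch_size, max_context_length):
    caches = nn.ModuleList()
    for _ in model.model.layers:  # assumed Llama-style decoder layout
        caches.append(
            custom_kv_cache_cls(
                n_heads=model.config.num_key_value_heads,
                head_dim=model.config.head_dim,
                max_batch_size=max_batch_size,
                max_context_length=max_context_length,
            )
        )
    return caches
```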
This reverts commit 99805f8.
self._temp_dir = None

def __del__(self):
    """Clean up temporary files when the model instance is destroyed."""
shouldn't this already happen automatically?
Yeah, probably, but I added it just to be extra sure it's cleaned up between tests.
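A minimal, self-contained sketch of the cleanup pattern being described, assuming _temp_dir holds a directory path; the class name is illustrative and not the PR's actual class:

```python
import shutil
import tempfile

class TempDirOwner:
    """Illustrative only: shows the explicit-cleanup-in-__del__ pattern."""

    def __init__(self):
        # Assumption: the real model creates its temp dir lazily; here we
        # create one up front so the example is runnable.
        self._temp_dir = tempfile.mkdtemp()

    def __del__(self):
        """Clean up temporary files when the instance is destroyed."""
        temp_dir = getattr(self, "_temp_dir", None)
        if temp_dir is not None:
            shutil.rmtree(temp_dir, ignore_errors=True)
            self._temp_dir = None
```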
n_heads=self.num_key_value_heads,
head_dim=self.head_dim,
max_batch_size=layer.max_batch_size,
max_context_length=layer.max_cache_len,
wait, what is happening here? is this the same as sliding_window_len?
Yeah, they removed sliding_window_len; it's now just max_cache_len:
https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L357
https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L265
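For reference, a hedged sketch of what the per-layer lookup implies; the layers attribute and max_cache_len follow the linked cache_utils.py lines, but treat them as assumptions if your transformers version differs:

```python
# Hedged sketch: after the refactor, cache geometry lives on each cache layer
# (e.g. layer.max_cache_len) rather than on a top-level sliding_window_len.
def per_layer_context_lengths(cache):
    # 'cache.layers' and 'max_cache_len' are taken from the linked
    # cache_utils.py lines; they are assumptions, not verified API here.
    return [layer.max_cache_len for layer in cache.layers]
```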
Summary
Pin bumps
- torch pin bumped to 20250601
- transformers bumped to 4.54.1
Code changes
Includes changes to absorb the huggingface/transformers#39106 kv cache refactor introduced by the transformers upgrade, which specifies kv cache attributes per layer.
cache_config is also no longer a CacheConfig instance but a dict after this PR, so we switch to using .get().
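A minimal sketch of the resulting access pattern, assuming cache_config is now a plain dict; the key names below are illustrative, not necessarily the exact keys used in this repo:

```python
def read_cache_geometry(cache_config: dict):
    # cache_config used to be a CacheConfig object (attribute access); after
    # this PR it is a plain dict, so lookups become .get() with defaults.
    return (
        cache_config.get("max_batch_size", 1),  # assumed key name
        cache_config.get("max_cache_len"),      # assumed key name
    )
```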
Infra changes
Remove Mac tests; see #122 for more details. This also lets us iterate more quickly by cutting down unnecessary CI, since there's no real need to run export tests on Mac when the Linux tests already cover that. In exchange, Mac tests with larger runners are enabled for major LLM models in ExecuTorch in pytorch/executorch#13400.
Known failures