@tengomucho
Collaborator

This is actually a ripoff of the work originally done as a contribution to transformers:

huggingface/transformers#31129

The original contribution has not been merged yet, but it shows lower memory usage and better performance on XLA, so I think it is worth adding here.

@tengomucho force-pushed the lower-memory-static-cache branch from 02a8556 to 9215b0d (July 8, 2024 12:46)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

This is actually a ripoff of the work originally done as a contribution
to transformers:

huggingface/transformers#31129

The original contribution has not been merged yet, but it shows lower
memory usage and better performance on XLA. So I think it's worth adding
it here, to be integrated on optimum-tpu.
@tengomucho force-pushed the lower-memory-static-cache branch from 9215b0d to 4ffdb17 (July 8, 2024 12:57)
@tengomucho marked this pull request as ready for review (July 8, 2024 14:00)
@tengomucho merged commit 77bebf8 into main (Jul 9, 2024)
@tengomucho deleted the lower-memory-static-cache branch (July 9, 2024 09:32)