kv-cache : relax SWA masking condition #14119

Merged
ggerganov merged 1 commit into master from gg/swa-relax-masking
Jun 11, 2025

Conversation

ggerganov
Member

fix #14111

Allow inserting the ubatch at any SWA-masked position, not only at the minimum.

Previously, a new ubatch could overwrite only the minimum sequence positions, which guaranteed that all tokens in the range [pos_min(s), pos_max(s)] were always present in the cache. Now the ubatch can be inserted at any position, as long as that position is SWA-masked. To preserve the invariant, we have to purge any tokens with positions smaller than those that were overwritten by the ubatch.
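As a rough illustration of the invariant described above, here is a minimal, self-contained sketch. It is not the code from this PR: `toy_cache`, `insert_token`, and `is_contiguous` are hypothetical names, the cache holds a single sequence, and the window rule in `is_masked_swa` (a position is masked once it lies `n_swa` or more positions behind the query) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy cache for a single sequence; pos == -1 marks an empty cell.
struct toy_cache {
    std::vector<int32_t> pos;
    int32_t n_swa;
};

// Assumed window rule: cached position p0 is masked for a query at p1
// when it lies n_swa or more positions behind it.
bool is_masked_swa(int32_t p0, int32_t p1, int32_t n_swa) {
    return p1 - p0 >= n_swa;
}

// Smallest cached position, or -1 if the cache is empty.
int32_t pos_min(const toy_cache & c) {
    int32_t res = -1;
    for (int32_t p : c.pos) {
        if (p >= 0 && (res < 0 || p < res)) res = p;
    }
    return res;
}

// Largest cached position, or -1 if the cache is empty.
int32_t pos_max(const toy_cache & c) {
    int32_t res = -1;
    for (int32_t p : c.pos) res = std::max(res, p);
    return res;
}

// Write a token with position p_new into cell idx.
// The cell may be reused only if it is empty or its old position is SWA-masked.
// After overwriting position p_old, every position smaller than p_old is purged,
// so the cached positions remain a gap-free range.
bool insert_token(toy_cache & c, std::size_t idx, int32_t p_new) {
    assert(idx < c.pos.size());
    const int32_t p_old = c.pos[idx];
    if (p_old >= 0 && !is_masked_swa(p_old, p_new, c.n_swa)) {
        return false; // still inside the attention window, must not be overwritten
    }
    c.pos[idx] = p_new;
    if (p_old >= 0) {
        for (int32_t & p : c.pos) {
            if (p >= 0 && p < p_old) p = -1; // purge older tokens
        }
    }
    return true;
}

// Invariant check: every position in [pos_min, pos_max] is present.
bool is_contiguous(const toy_cache & c) {
    const int32_t lo = pos_min(c);
    const int32_t hi = pos_max(c);
    if (lo < 0) return true; // an empty cache trivially satisfies the invariant
    for (int32_t p = lo; p <= hi; ++p) {
        if (std::find(c.pos.begin(), c.pos.end(), p) == c.pos.end()) return false;
    }
    return true;
}
```

For example, with cells holding positions 0 through 5 and `n_swa = 3`, writing position 6 over the cell that holds position 2 is allowed, and positions 0 and 1 are purged, leaving the gap-free range 3 through 6. Trying to reuse the cell that holds position 5 for position 7 is rejected, because that token is still inside the window.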

@ggerganov ggerganov force-pushed the gg/swa-relax-masking branch from c121b6e to 989b8c8 Compare June 11, 2025 12:51
@ggerganov ggerganov merged commit 89a184f into master Jun 11, 2025
43 of 47 checks passed
@ggerganov ggerganov deleted the gg/swa-relax-masking branch June 11, 2025 13:48
Development

Successfully merging this pull request may close these issues.

Misc. bug: 10 Image maximum?