@tohtana tohtana commented Jan 28, 2026

(Proposed as an alternative to #7821)
This PR adds a disable_in_eval parameter to UlyssesSPAttentionHF. When it is set, the sequence parallelism operations are bypassed during evaluation.
See huggingface/transformers#43517 and #7821 for the context.
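
To illustrate the idea, here is a minimal sketch of an eval-time bypass. It is not the code in this PR: the class EvalBypassAttention and the callables local_attn and sp_attn are hypothetical names, and the sketch assumes that "evaluation" is detected through a training flag like the one on torch.nn.Module. Only the parameter name disable_in_eval comes from the description above.

```python
from typing import Any, Callable


class EvalBypassAttention:
    """Toy stand-in for an attention wrapper; NOT the DeepSpeed implementation."""

    def __init__(
        self,
        local_attn: Callable[..., Any],  # hypothetical: plain attention, no sequence parallelism
        sp_attn: Callable[..., Any],     # hypothetical: attention with sequence-parallel ops
        disable_in_eval: bool = False,
    ) -> None:
        self.local_attn = local_attn
        self.sp_attn = sp_attn
        self.disable_in_eval = disable_in_eval
        self.training = True  # assumption: mirrors the torch.nn.Module.training flag

    def train(self, mode: bool = True) -> "EvalBypassAttention":
        self.training = mode
        return self

    def eval(self) -> "EvalBypassAttention":
        return self.train(False)

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Skip the sequence-parallel path only when the flag is set AND we are evaluating.
        if self.disable_in_eval and not self.training:
            return self.local_attn(*args, **kwargs)
        return self.sp_attn(*args, **kwargs)
```

With disable_in_eval left at its default of False, this sketch always takes the sequence-parallel path, so existing behavior would be unchanged unless the caller opts in.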

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>