[Neuron][Kernel] Support Longer Sequences in NKI-based Flash PagedAttention and Improve Efficiency #12921

Merged
simon-mo merged 7 commits into vllm-project:main from lingfanyu:nki_pa_improve on Feb 12, 2025

Commits

Commits on Feb 7, 2025

Commits on Feb 9, 2025