[Inference/Kernel] Optimize paged attention: Refactor key cache layout #5643

Merged
SunflowerAries merged 2 commits into hpcaitech:feature/colossal-infer from SunflowerAries:optimize-paged-attn
Apr 25, 2024