forked from Dao-AILab/flash-attention
Pull requests: vllm-project/flash-attention
- [BugFix] Fix FA2 RuntimeError when sinks is provided (#76, by LucasWilkinson, merged Aug 6, 2025)
- cmake: get rid of empty VLLM_FA_GPU_ARCHES variable (#74, by dtrifiro, merged Jul 31, 2025)
- Sparse attention: Generalize arch checks for A100 and above (#73, by ExtReMLapin, merged Jul 28, 2025)
- [Misc] Add num_splits input arg to flash_attn_varlen_func (#72, by WoosukKwon, merged Jul 1, 2025); see the usage sketch after this list
- [BugFix] Fix raising exception when FA3 isn't available (#66, by LucasWilkinson, merged Apr 24, 2025)
- FA3 Decode Perf - Use single mma warp group for decode batches (#63, by LucasWilkinson, merged Apr 21, 2025)
- [WIP] Use single mma warp group for decode batches (#62, draft by LucasWilkinson, closed Apr 18, 2025)
- Upstream Sync - up to: d836a6bf09bf3838c6e71c9cf675b3708fea0d71 (#61, by LucasWilkinson, merged Apr 10, 2025)
- [Easy] replace c10::optional with std::optional (#58, by yeqcharlotte, merged Mar 27, 2025)
- Upstream Sync - up to: 27f501dbe011f4371bff938fe7e09311ab3002fa (#56, by LucasWilkinson, merged Mar 20, 2025)
- Upstream Sync | up to 06e34f62d18d3a721bc515d4b331a46d5d4c8c09 (#52, by LucasWilkinson, merged Feb 26, 2025)
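
For context on PR #72: the sketch below illustrates how the new num_splits argument might be passed to flash_attn_varlen_func. This is a minimal, hypothetical example, not code from the PR itself; it assumes the fork is importable as vllm_flash_attn, keeps the upstream varlen calling convention, and accepts num_splits as a keyword.

```python
# Hypothetical usage sketch for the num_splits argument added in PR #72.
# Assumptions (not verified against the fork's actual signature):
#   * the package is importable as vllm_flash_attn
#   * flash_attn_varlen_func keeps the upstream varlen layout:
#     q/k/v packed as (total_tokens, num_heads, head_dim)
#   * num_splits is accepted as a keyword to override the split heuristic
import torch
from vllm_flash_attn import flash_attn_varlen_func  # assumed import path

device, dtype = "cuda", torch.float16
num_heads, head_dim = 8, 64

# Two sequences of lengths 3 and 5 packed into one tensor (8 tokens total).
q = torch.randn(8, num_heads, head_dim, device=device, dtype=dtype)
k = torch.randn(8, num_heads, head_dim, device=device, dtype=dtype)
v = torch.randn(8, num_heads, head_dim, device=device, dtype=dtype)
cu_seqlens = torch.tensor([0, 3, 8], dtype=torch.int32, device=device)

out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens,
    cu_seqlens_k=cu_seqlens,
    max_seqlen_q=5,
    max_seqlen_k=5,
    causal=True,
    num_splits=2,  # assumed keyword: request 2 KV splits instead of the auto heuristic
)
```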