Issues: Dao-AILab/flash-attention
Open issues (newest first):

#1303: Error in the O_i adjustment at line 21 of Algorithm 1 in the FlashAttention-3 paper (opened Oct 30, 2024 by hasanunlu)
#1301: Does flash_attn_with_kvcache return block_lse or attention_score? (opened Oct 28, 2024 by NonvolatileMemory)
#1300: FlashSelfAttention and SelfAttention in flash_attn.modules.mha give different results (opened Oct 28, 2024 by senxiu-puleya)
#1286: In the unit tests, how is the dropout_fraction diff tolerance selected? (opened Oct 18, 2024 by muoshuosha)
#1282: FlashAttention installation error: "CUDA 11.6 and above" requirement issue (opened Oct 17, 2024 by 21X5122)
#1278: Unable to import my new kernel function after a successful compilation (opened Oct 15, 2024 by jpli02)
#1277: Why does the flash_attn_varlen_func method increase GPU memory usage? (opened Oct 15, 2024 by shaonan1993)
#1276: Is there a way to install flash-attention without a specific CUDA version? (opened Oct 14, 2024 by HuangChiEn)
#1275: Concurrent warp group execution in FA3: tensor core resource limitation? (opened Oct 13, 2024 by ziyuhuang123)
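Several of the issues above (for example #1277 and #1276) concern the variable-length attention API. For reference, below is a minimal sketch of how flash_attn_varlen_func is typically called, assuming the flash-attn 2.x interface described in the repository README; the tensor shapes, dtypes, and argument names follow that documentation and may differ in other versions.

```python
# Minimal sketch of a flash_attn_varlen_func call (assumes flash-attn 2.x).
import torch
from flash_attn import flash_attn_varlen_func

nheads, headdim = 8, 64
seqlens = [5, 12, 7]            # three sequences packed into one batch
total = sum(seqlens)

# Unpadded ("packed") q/k/v of shape (total_tokens, nheads, headdim),
# fp16 or bf16, on a CUDA device.
q = torch.randn(total, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Cumulative sequence lengths, int32, shape (batch + 1,).
cu_seqlens = torch.tensor([0, 5, 17, 24], device="cuda", dtype=torch.int32)
max_seqlen = max(seqlens)

out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens, cu_seqlens_k=cu_seqlens,
    max_seqlen_q=max_seqlen, max_seqlen_k=max_seqlen,
    causal=True,
)
# out has shape (total_tokens, nheads, headdim), matching the packed input.
```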