Issues: flashinfer-ai/flashinfer
C++ benchmarks CMake error caused by enable_fp16 option in generate.py (#734, opened Jan 13, 2025 by rtxxxpro)
[RFC]: Introducing ReproSpec for Strong Reproducibility in LLM Inference (#733, opened Jan 11, 2025 by yzh119)
Inconsistent results between different sequences with sequence lengths less than a single page size (#725, opened Jan 8, 2025 by fergusfinn)
RuntimeError: Qwen2-VL does not support _Backend.FLASHINFER backend now (#720, opened Jan 7, 2025 by duzw9311)
[Question] How to support custom stride of paged_kv for hopper prefill attention (#702, opened Dec 27, 2024 by jianfei-wangg)
Deprecation Notice: Python 3.8 Wheel Support to End in future releases (#682, opened Dec 18, 2024 by yzh119)
[Bug] FlashInfer latest main wheel issue (#669, opened Dec 16, 2024 by zhyncs; labels: bug, priority: high)
[Question] Overflow risks when batch size and sequence length grows extremely large (#596, opened Nov 8, 2024 by rchardx)
[Feature Request] Add an argument to control the number of CTAs used in attention APIs (#591, opened Nov 7, 2024 by yzh119)
ImportError: cannot import name '_grouped_size_compiled_for_decode_kernels' from 'flashinfer.decode' (#549, opened Oct 23, 2024 by Hutlustc)
Runtime error with single_prefill_with_kv_cache while Compilation (#541, opened Oct 20, 2024 by YudiZh)