Issues: flashinfer-ai/flashinfer
Deprecation Notice: Python 3.8 Wheel Support to End in future...
#682 opened Dec 18, 2024 by yzh119
Issues list
How to use low-bit KV Cache in flashinfer? [enhancement]
#125 opened Feb 18, 2024 by zhaoyang-star
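
For context on what "low-bit KV cache" usage can look like, here is a minimal sketch against FlashInfer's single-request decode kernel. It assumes a build and GPU that accept float8_e4m3fn KV tensors; the shapes, dtypes, and quantization step are illustrative, not taken from the issue thread.

```python
import torch
import flashinfer

num_qo_heads, num_kv_heads, head_dim = 32, 8, 128
kv_len = 4096

# Query stays in fp16; the KV cache is cast down to 8-bit floating point.
q = torch.randn(num_qo_heads, head_dim, dtype=torch.half, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")

# Assumption: this build's decode kernel accepts fp8 KV inputs directly.
k_fp8 = k.to(torch.float8_e4m3fn)
v_fp8 = v.to(torch.float8_e4m3fn)

o = flashinfer.single_decode_with_kv_cache(q, k_fp8, v_fp8)
print(o.shape)  # (num_qo_heads, head_dim)
```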
stack smashing detected in begin_forward when compiling directly from the repo
#166 opened Mar 8, 2024 by mkrima
Circular import error when importing built-from-source flashinfer
#248 opened May 15, 2024 by vedantroy
CUDA Error: no kernel image is available for execution on the device (209) /tmp/build-via-sdist-nl8se4dx/flashinfer-0.0.4+cu118torch2.2/include/flashinfer/attention/decode.cuh: line 871 at function cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smem_size)
#249 opened May 16, 2024 by lucasjinreal
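
The "no kernel image is available" error usually means the prebuilt wheel (here cu118/torch2.2) does not ship code for the GPU's compute capability. A quick diagnostic, independent of the issue thread:

```python
import torch

# Report the device's compute capability and the CUDA version torch targets;
# compare against the architectures the installed flashinfer wheel was built for.
major, minor = torch.cuda.get_device_capability()
print(f"compute capability: sm_{major}{minor}")
print(f"torch CUDA version: {torch.version.cuda}")
```

If the architectures do not match, building from source with TORCH_CUDA_ARCH_LIST set to the device's capability (e.g. "8.0") is the usual workaround, assuming the standard PyTorch extension build path.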
[FEAT REQ][CUDA GRAPH] Allow explicit control flag to force enable/disable split KV
#397 opened Jul 26, 2024 by AgrawalAmey
Runtime error with single_prefill_with_kv_cache during compilation
#541 opened Oct 20, 2024 by YudiZh
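
A minimal self-contained call to single_prefill_with_kv_cache can serve as a baseline when reproducing errors like this; the tensor shapes below are illustrative:

```python
import torch
import flashinfer

qo_len, kv_len = 128, 4096
num_qo_heads, num_kv_heads, head_dim = 32, 8, 128

# Single-request prefill attention; NHD layout is the default, and
# grouped-query attention is expressed via num_kv_heads < num_qo_heads.
q = torch.randn(qo_len, num_qo_heads, head_dim, dtype=torch.half, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")

o = flashinfer.single_prefill_with_kv_cache(q, k, v, causal=True)
print(o.shape)  # torch.Size([128, 32, 128])
```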
ImportError: cannot import name '_grouped_size_compiled_for_decode_kernels' from 'flashinfer.decode'
#549 opened Oct 23, 2024 by Hutlustc
[Feature Request] Add an argument to control the number of CTAs used in attention APIs
#591 opened Nov 7, 2024 by yzh119
C++ benchmarks CMake error caused by enable_fp16 option in generate.py
#734 opened Jan 13, 2025 by rtxxxpro