Issues: flashinfer-ai/flashinfer
Deprecation Notice: Python 3.8 Wheel Support to End in future...
#682 opened Dec 18, 2024 by yzh119
Issues list
How to use low-bit KV Cache in flashinfer? [enhancement]
#125 opened Feb 18, 2024 by zhaoyang-star
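
For context on what "low-bit KV cache" usage can look like, here is a minimal sketch against FlashInfer's single-request decode kernel. It assumes a build and GPU that accept float8_e4m3fn KV tensors; the shapes, dtypes, and quantization step are illustrative, not taken from the issue thread.

```python
import torch
import flashinfer

num_qo_heads, num_kv_heads, head_dim = 32, 8, 128
kv_len = 4096

# Query stays in fp16; the KV cache is cast down to 8-bit floating point.
q = torch.randn(num_qo_heads, head_dim, dtype=torch.half, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")

# Assumption: this build's decode kernel accepts fp8 KV inputs directly.
k_fp8 = k.to(torch.float8_e4m3fn)
v_fp8 = v.to(torch.float8_e4m3fn)

o = flashinfer.single_decode_with_kv_cache(q, k_fp8, v_fp8)
print(o.shape)  # (num_qo_heads, head_dim)
```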
stack smashing detected in begin_forward when compiling directly from the repo
#166 opened Mar 8, 2024 by mkrima
Circular import error when importing built-from-source flashinfer
#248 opened May 15, 2024 by vedantroy
CUDA Error: no kernel image is available for execution on the device (209) /tmp/build-via-sdist-nl8se4dx/flashinfer-0.0.4+cu118torch2.2/include/flashinfer/attention/decode.cuh: line 871 at function cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smem_size)
#249 opened May 16, 2024 by lucasjinreal
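
The "no kernel image is available" error usually means the prebuilt wheel (here cu118/torch2.2) does not ship code for the GPU's compute capability. A quick diagnostic, independent of the issue thread:

```python
import torch

# Report the device's compute capability and the CUDA version torch targets;
# compare against the architectures the installed flashinfer wheel was built for.
major, minor = torch.cuda.get_device_capability()
print(f"compute capability: sm_{major}{minor}")
print(f"torch CUDA version: {torch.version.cuda}")
```

If the architectures do not match, building from source with TORCH_CUDA_ARCH_LIST set to the device's capability (e.g. "8.0") is the usual workaround, assuming the standard PyTorch extension build path.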
[FEAT REQ][CUDA GRAPH] Allow explicit control flag to force enable/disable split KV
#397 opened Jul 26, 2024 by AgrawalAmey
Runtime error with single_prefill_with_kv_cache during compilation
#541 opened Oct 20, 2024 by YudiZh
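
A minimal self-contained call to single_prefill_with_kv_cache can serve as a baseline when reproducing errors like this; the tensor shapes below are illustrative:

```python
import torch
import flashinfer

qo_len, kv_len = 128, 4096
num_qo_heads, num_kv_heads, head_dim = 32, 8, 128

# Single-request prefill attention; NHD layout is the default, and
# grouped-query attention is expressed via num_kv_heads < num_qo_heads.
q = torch.randn(qo_len, num_qo_heads, head_dim, dtype=torch.half, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.half, device="cuda")

o = flashinfer.single_prefill_with_kv_cache(q, k, v, causal=True)
print(o.shape)  # torch.Size([128, 32, 128])
```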
ImportError: cannot import name '_grouped_size_compiled_for_decode_kernels' from 'flashinfer.decode'
#549 opened Oct 23, 2024 by Hutlustc
[Feature Request] Add an argument to control the number of CTAs used in attention APIs
#591 opened Nov 7, 2024 by yzh119
C++ benchmarks CMake error caused by enable_fp16 option in generate.py
#734 opened Jan 13, 2025 by rtxxxpro