Comparing changes
base repository: flashinfer-ai/flashinfer
base: v0.0.6
head repository: flashinfer-ai/flashinfer
compare: v0.0.7
- 16 commits
- 52 files changed
- 3 contributors
Commits on Jun 22, 2024
-
ci: separate `update_whl_index` from github action files (#328)
and bump doc version.
Commit 1df7b03
-
perf: change minimal `kv_chunk_size` back to 128 (#329)
Commit f237f5f
Commits on Jun 23, 2024
-
doc: bugfix on documentation about mask usage (#331)
This PR should fix #330.
Commit 947830b
Commits on Jun 24, 2024
-
bugfix: fix the scheduler behavior of large page size (#333)
When `128 / page == 0` (integer division, which happens once the page size exceeds 128), the binary search could run into a division by zero.
Commit 4d08c63
-
Commit ea89492
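The division-by-zero case described for #333 can be illustrated with a small sketch. This is not FlashInfer's scheduler code: `min_pages_per_chunk`, `num_chunks`, and the clamp to 1 are hypothetical, assuming only that a lower bound in pages is derived from a 128-token minimum by integer division and later used as a divisor.

```python
def min_pages_per_chunk(page_size: int, min_kv_chunk_tokens: int = 128) -> int:
    """Hypothetical helper: smallest chunk size, in pages, for a chunk-size search."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Integer division yields 0 once page_size exceeds min_kv_chunk_tokens.
    # Clamping to 1 keeps any later division by this value well defined.
    return max(min_kv_chunk_tokens // page_size, 1)


def num_chunks(kv_len_pages: int, pages_per_chunk: int) -> int:
    """Ceiling division: number of chunks needed to cover kv_len_pages."""
    return -(-kv_len_pages // pages_per_chunk)


if __name__ == "__main__":
    for page_size in (16, 128, 256):
        lower = min_pages_per_chunk(page_size)
        print(page_size, lower, num_chunks(10, lower))
```

Without the clamp, a page size of 256 would give a lower bound of 0 pages and `num_chunks` would raise `ZeroDivisionError`.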
Commits on Jun 27, 2024
-
perf: more options for kv tile size (#336)
For small query sizes, we might use a large kv tile size.
Commit bf2a6c7
Commits on Jun 28, 2024
-
bugfix: fix the `forward_return_lse` function in `BatchPrefillWithRaggedKVCache` class (#337)
Add more tests for coverage.
Commit 10e6b17
-
Commit 3afb6d3
-
feat: customize `logits_soft_cap` value (#339)
This PR supports customized logits soft cap values. Different models might use different logits soft cap values (e.g. Grok-1 uses 30 and Gemma-2 uses 50).
Commit a2498f5
-
chore(main): release 0.0.7 (#327)
🤖 I have created a release *beep* *boop*

---

## [0.0.7](v0.0.6...v0.0.7) (2024-06-28)

### Bugfix

* fix the `forward_return_lse` function in `BatchPrefillWithRaggedKVCache` class ([#337](#337))
* fix the scheduler behavior of large page size ([#333](#333))

### Features

* customize `logits_soft_cap` value ([#339](#339)) ([a2498f5](a2498f5))

### Performance Improvements

* change minimal `kv_chunk_size` back to 128 ([#329](#329)) ([f237f5f](f237f5f))
* more options for kv tile size ([#336](#336)) ([bf2a6c7](bf2a6c7))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Zihao Ye <expye@outlook.com>
Commit 95f507f
-
[CMake][Bugfix] Set default value for FLASHINFER_GEN_MASK_MODES (#340)
This commit resolves a build-time error with the following message:
```
CMake Error at 3rdparty/flashinfer/CMakeLists.txt:313 (add_library):
  No SOURCES given to target: prefill_kernels
```
This occurred after #266, which replaced the `FLASHINFER_GEN_CASUALS` option with `FLASHINFER_GEN_MASK_MODES`. However, the definition `flashinfer_option(FLASHINFER_GEN_CASUALS ... )` was not replaced. As a result, the loop over the empty `MASK_MODES` did not produce any of the kernels that should be compiled. This commit updates the `flashinfer_option(FLASHINFER_GEN_CASUALS ...)` line to define `FLASHINFER_GEN_MASK_MODES` instead, using the same default value as `config.cmake`.
Commit df59f71
-
linker: use `mcmodel=medium` and `--no-relax` to compilation flags for large wheels (#341)
Commit 457eb78
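To make the `logits_soft_cap` feature from #339 concrete, here is a minimal sketch of logits soft capping in its commonly used tanh form. The commit text does not state which formula FlashInfer applies, so the formula and the `soft_cap` helper are assumptions for illustration. Only the cap values 30 (Grok-1) and 50 (Gemma-2) come from the commit description.

```python
import math


def soft_cap(logit: float, cap: float) -> float:
    """Hypothetical helper: squash a logit smoothly into the open interval (-cap, cap)."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    # Assumed form: cap * tanh(logit / cap). Small logits pass through almost
    # unchanged, while large ones saturate near +/- cap.
    return cap * math.tanh(logit / cap)


if __name__ == "__main__":
    for cap in (30.0, 50.0):  # cap values named in the commit description
        print(cap, [round(soft_cap(x, cap), 3) for x in (-200.0, -1.0, 0.0, 1.0, 200.0)])
```

Making the cap a parameter rather than a compile-time constant is what lets a single kernel serve models that use different cap values.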
Commits on Jun 29, 2024
-
Commit e0a233a
Commits on Jun 30, 2024
-
refactor: reduce the binary size of batch decode kernels (#343)
This PR refactors the batch decode related kernels and makes the following breaking changes:
1. Remove the `batch_decode_with_padded_kv_cache` operator; we encourage users to use `BatchDecodeWithPagedKVCacheWrapper` instead.
2. Delete redundant DTypeQ * DTypeKV combinations; only the following cases are now supported:
   1. DTypeQ == DTypeKV
   2. DTypeQ is a float16 and DTypeKV is a float8

The output data type follows the query data type.
Commit 0d333ff
-
Also reduce binary size but limit the maximum number of registers for `x_frag` and `o_frag` to 200.
Commit 80a376f
-
ci: remove redundant `NUM_FRAGS_Z` (#345)
Do not compile `NUM_FRAGS_Z=6` to reduce wheel size. Also revert #341, as those flags had no effect.
Commit fec77d0
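The dtype restriction introduced by #343 can be sketched as a small validity check. This is only an illustration of the rule as stated in the commit message: the helper names and dtype strings are hypothetical, and reading "a float16" as covering both `float16` and `bfloat16` is an assumption.

```python
# Assumed dtype families; the commit message only says "a float16" and "a float8".
FLOAT16_TYPES = {"float16", "bfloat16"}
FLOAT8_TYPES = {"float8_e4m3", "float8_e5m2"}


def is_supported_decode_dtypes(dtype_q: str, dtype_kv: str) -> bool:
    """Hypothetical check for the two (DTypeQ, DTypeKV) cases kept by #343."""
    if dtype_q == dtype_kv:  # case 1: identical query and KV dtypes
        return True
    # case 2: 16-bit float query with an 8-bit float KV cache
    return dtype_q in FLOAT16_TYPES and dtype_kv in FLOAT8_TYPES


def output_dtype(dtype_q: str, dtype_kv: str) -> str:
    """The output data type follows the query data type."""
    if not is_supported_decode_dtypes(dtype_q, dtype_kv):
        raise ValueError(f"unsupported combination: q={dtype_q}, kv={dtype_kv}")
    return dtype_q


if __name__ == "__main__":
    print(output_dtype("float16", "float8_e4m3"))
```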
You can try running this command locally to see the comparison on your machine:
git diff v0.0.6...v0.0.7