merge dev to main by peaceforeverCN · Pull Request #65 · taco-project/FlexKV

peaceforeverCN · 2025-11-25T07:56:50Z

No description provided.

[bugfix] quick fix of match_prefix

Signed-off-by: Zhaohu Xing <x.zhaohu@gmail.com>

[misc] Replace std::map with std::unordered_map in RadixTree

* add global config from env
* use config from env
* simplify port config
* remove max_req_tokens
* simple user config
* update flexkv_config
* fix benchmark
* remove unused example
* modify config doc
* fix iouring flag && allow user config override env
* update all docs
* rename layout type
* small fix
* update tracer
* adjust ssd blocks num if necessary
* fix broken tests

---------

Co-authored-by: linhu-nv <linhu@nvidia.com>

Co-authored-by: jianyingzhu <joeyzhu@nvidia.com>

* [refactor] gds reuse ssd handle
* refactor tpGDS
* minor naming changed
* gtensor gds support
* update doc & test

Support construct TensorSharedHandle directly from CUDA IPC Handle

add scripts for vllm adapter

* fix benchmark
* fix incorrect MatchResult
* use int64_t for offset
* fix bugs && update docs
* update config file
* fix env name
* remove useless exceptions

* prevent automatically initializing MPI
* disable auto-mpi-init
* Init support for TensorRT-LLM
* add scripts
* fix import and interface
* support the trtllm gpu layout and improve register api of trt_adapter
* modify log
* modify scripts
* use remote transfermanager
* some fix by hulin
* using subprocess instead of multiprocessing
* fix deadlock
* fix some bugs about gpu_register_port
* fix tensor export
* fix head_size
* fix num_kv_heads for deepseek
* fix ipc open error
* fix head_size calculation error
* fix interface
* fix get num_matched_tokens from trtllm
* fix head_size calculation error
* fix interface
* fix short len
* remove code
* add patch file
* modify scripts
* TensorRT-LLM will wait until kvmanager is ready
* [bugfix] fix token alignment issue in tensorrt-llm by rounding down to block size
* trivial
* support flexkv + cuda graph using flexkv
* modify patch
* modify scripts
* [bugfix] fix some bug fix bug
* fix radix_tree
* modify scripts
* add debug log
* modify scripts
* fix rebase error
* fix radix tree
* fix scripts
* use new config
* rename
* fix script
* add branch for calculation of aligned_length
* add branch for remote_process
* take another way to determine branch
* fix scripts
* remove useless env and config
* remove useless commit

---------

Co-authored-by: zhuofan1123 <zhuofanl@nvidia.com>
Co-authored-by: linhu-nv <linhu@nvidia.com>
Co-authored-by: Luis-xu <hfutxjn@163.com>
Co-authored-by: annz <annz@nvidia.com>
Co-authored-by: leolingli <leolingli@tencent.com>
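The token-alignment bugfix in the list above is described as "rounding down to block size". As a rough illustration of that arithmetic only, here is a minimal sketch. The helper name `align_down_to_block` and the example values are hypothetical and are not taken from the FlexKV or TensorRT-LLM source.

```python
def align_down_to_block(num_tokens: int, block_size: int) -> int:
    """Round num_tokens down to the nearest multiple of block_size.

    Hypothetical helper for illustration; not FlexKV's actual API.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    # Integer division drops the partial trailing block.
    return (num_tokens // block_size) * block_size


# Illustrative values: 37 tokens with 16-token blocks cover 2 full blocks.
aligned_length = align_down_to_block(num_tokens=37, block_size=16)
print(aligned_length)  # 32
```

Under this assumption, tokens that fall in a partially filled trailing block are left out of the aligned length, so only whole blocks are counted.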

* [bugfix] fix for deepseek head number wrong
* [bugfix] fix bug, if cpu match len is bigger than ssd when put it will cause error
* fix radix_tree (#39)
* fix empty

---------

Co-authored-by: leolingli <leolingli@tencent.com>

* add patch
* init docs
* fin readme
* rename yml
* fix readme
* fix readme
* update docs
* fix docs
* fix docs
* fix docs
* add title
* add readme_en
* Update docs/trtllm_adaption/README_en.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: zhuofan1123 <zhuofanl@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix bug for tp16
* update vllm adapter to support tp16

linhu-nv and others added 30 commits May 14, 2025 15:52

fix logic problems in client

7d63547

quick fix to client-server example

dc00c77

fix some bugs to run compile

45bc9f8

modify blockmeta definition & impl mempool

4414e83

impl index

3cbb19d

add index benchmark

409ef8a

faster hash

7d90ee2

add test for storage_engine + transfer_engine

d4c5a39

optim index

ca90a73

fix a few bugs about performance, should have normal perf now

6489c3b

add mempool benchmark

0866932

optim mempool

9550fa5

optimize index.insert

15bfaff

fix mempool

cbea324

eviction implementation and optimization

117ad1a

fix evictor

69167f7

flatten free ids tensor

fcc4e87

impl get/put pipeline

be2737e

add reset for cache engine

7ae1876

global id allocator

143c261

list to tensor

0bff033

init kvmanager

7c66cc7

run the pipeline

69410b6

print cpu-gpu transfer info

e6ba7d8

refactor index

4d26d16

refactor kvmanager and cache engine

3bd3119

add insert_length for insert

5cf26ff

cleanup buffer

6ce0002

Layer wise

bd7a07d

Added the xxhash

242c557

charliecgxu and others added 23 commits November 10, 2025 19:55

Merge pull request #40 from linhu-nv/match_prefix_fix

8fbb030

[bugfix] quick fix of match_prefix

[misc] Replace std::map with std::unordered_map in RadixTree

5209a8d

Signed-off-by: Zhaohu Xing <x.zhaohu@gmail.com>

Merge pull request #41 from zhaohuxing/dev

a22ce0b

[misc] Replace std::map with std::unordered_map in RadixTree

quick fix of config (#43)

12984a7

Co-authored-by: jianyingzhu <joeyzhu@nvidia.com>

[feature] GDS refactor & gtensor support (#42)

e5fcf1c

* [refactor] gds reuse ssd handle
* refactor tpGDS
* minor naming changed
* gtensor gds support
* update doc & test

Support construct TensorSharedHandle directly from CUDA IPC Handle

1c9eda5

add test file for TensorSharedHandle

34dbbc2

Merge pull request #44 from peaceforeverCN/feature/HandleRegister

4fba46a

Support construct TensorSharedHandle directly from CUDA IPC Handle

add scripts for vllm adapter

1237c4b

[bugfix] fix port (#45)

a30e759

[bugfix] fix ssd allocator (#46)

a423a55

Merge pull request #47 from axxx03/rongwei/fix_vllm_adapter

dad3107

add scripts for vllm adapter

[bug fix] fix some bugs && cleanup code (#49)

06051e9

* fix benchmark
* fix incorrect MatchResult
* use int64_t for offset
* fix bugs && update docs
* update config file
* fix env name
* remove useless exceptions

quick fix

c8b0896

Rebase and merge bugfix to dev (#51)

b3a965a

* [bugfix] fix for deepseek head number wrong
* [bugfix] fix bug, if cpu match len is bigger than ssd when put it will cause error
* fix radix_tree (#39)
* fix empty

---------

Co-authored-by: leolingli <leolingli@tencent.com>

quick fix

1982243

Fix bug found by unit test (#55)

aa0a865

[bugfix] put trtllm env set before kvmanager init (#58)

faca3f0

[feature] support Tp16 for vllm+flexkv (#59)

4fa0047

* fix bug for tp16
* update vllm adapter to support tp16

quick fix for tp16 (#62)

8836175

linhu-nv requested review from linhu-nv and zhuofan1123 and removed request for linhu-nv November 25, 2025 08:00

linhu-nv approved these changes Nov 25, 2025


zhuofan1123 approved these changes Nov 25, 2025


peaceforeverCN merged commit 33dd38b into main Nov 25, 2025
1 of 4 checks passed

peaceforeverCN merged 252 commits into main from dev


11 participants
