merge dev to main #65

Merged
peaceforeverCN merged 252 commits into main from dev
Nov 25, 2025

@peaceforeverCN
Collaborator

No description provided.

charliecgxu and others added 23 commits November 10, 2025 19:55
[bugfix] quick fix of match_prefix
Signed-off-by: Zhaohu Xing <x.zhaohu@gmail.com>
[misc] Replace std::map with std::unordered_map in RadixTree
* add global config from env

* use config from env

* simplify port config

* remove max_req_tokens

* simple user config

* update flexkv_config

* fix benchmark

* remove unused example

* modify config doc

* fix iouring flag && allow user config override env

* update all docs

* rename layout type

* small fix

* update tracer

* adjust ssd blocks num if necessary

* fix broken tests

---------

Co-authored-by: linhu-nv <linhu@nvidia.com>
Co-authored-by: jianyingzhu <joeyzhu@nvidia.com>
* [refactor] gds reuse ssd handle

* refactor tpGDS

* minor naming changed

* gtensor gds support

* update doc & test
Support constructing TensorSharedHandle directly from a CUDA IPC handle
* fix benchmark

* fix incorrect MatchResult

* use int64_t for offset

* fix bugs && update docs

* update config file

* fix env name

* remove useless exceptions
* prevent automatically initializing MPI

* disable auto-mpi-init

* Init support for TensorRT-LLM

* add scripts

* fix import and interface

* support the trtllm gpu layout and improve register api of trt_adapter

* modify log

* modify scripts

* use remote transfermanager

* some fix by hulin

* using subprocess instead of multiprocessing

* fix dead lock

* fix some bugs about gpu_register_port

* fix tensor export

* fix head_size

* fix num_kv_heads for deepseek

* fix ipc open error

* fix head_size calculation error

* fix interface

* fix get num_matched_tokens from trtllm

* fix head_size calculation error

* fix interface

* fix short len

* remove code

* add patch file

* modify scripts

* TensorRT-LLM will wait until kvmanager is ready

* [bugfix] fix token alignment issue in tensorrt-llm by rounding down to block size

* trivial

* support flexkv + cuda graph using flexkv

* modify patch

* modify scripts

* [bugfix] fix some bug

fix bug

* fix radix_tree

* modify scripts

* add debug log

* modify scripts

* fix rebase error

* fix radix tree

* fix scripts

* use new config

* rename

* fix script

* add branch for calculation of aligned_length

* add branch for remote_process

* take another way to determine branch

* fix scripts

* remove useless env and config

* remove useless commit

---------

Co-authored-by: zhuofan1123 <zhuofanl@nvidia.com>
Co-authored-by: linhu-nv <linhu@nvidia.com>
Co-authored-by: Luis-xu <hfutxjn@163.com>
Co-authored-by: annz <annz@nvidia.com>
Co-authored-by: leolingli <leolingli@tencent.com>
* [bugfix] fix wrong head number for deepseek

* [bugfix] fix error raised on put when the cpu match len is bigger than the ssd's

* fix radix_tree (#39)

* fix empty

---------

Co-authored-by: leolingli <leolingli@tencent.com>
* add patch

* init docs

* finish readme

* rename yml

* fix readme

* fix readme

* update docs

* fix docs

* fix docs

* fix docs

* add title

* add readme_en

* Update docs/trtllm_adaption/README_en.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: zhuofan1123 <zhuofanl@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix bug for tp16

* update vllm adapter to support tp16
@linhu-nv linhu-nv requested review from linhu-nv and zhuofan1123 and removed request for linhu-nv November 25, 2025 08:00
@peaceforeverCN peaceforeverCN merged commit 33dd38b into main Nov 25, 2025
1 of 4 checks passed