Merged
[bugfix] quick fix of match_prefix
Signed-off-by: Zhaohu Xing <x.zhaohu@gmail.com>
[misc] Replace std::map with std::unordered_map in RadixTree
* add global config from env
* use config from env
* simplify port config
* remove max_req_tokens
* simple user config
* update flexkv_config
* fix benchmark
* remove unused example
* modify config doc
* fix iouring flag && allow user config override env
* update all docs
* rename layout type
* small fix
* update tracer
* adjust ssd blocks num if necessary
* fix broken tests
---------
Co-authored-by: linhu-nv <linhu@nvidia.com>
Co-authored-by: jianyingzhu <joeyzhu@nvidia.com>
* [refactor] gds reuse ssd handle
* refactor tpGDS
* minor naming changes
* gtensor gds support
* update doc & test
Support constructing TensorSharedHandle directly from a CUDA IPC handle
add scripts for vllm adapter
* fix benchmark
* fix incorrect MatchResult
* use int64_t for offset
* fix bugs && update docs
* update config file
* fix env name
* remove useless exceptions
* prevent automatically initializing MPI
* disable auto-mpi-init
* Init support for TensorRT-LLM
* add scripts
* fix import and interface
* support the trtllm gpu layout and improve register api of trt_adapter
* modify log
* modify scripts
* use remote transfermanager
* some fix by hulin
* using subprocess instead of multiprocessing
* fix deadlock
* fix some bugs about gpu_register_port
* fix tensor export
* fix head_size
* fix num_kv_heads for deepseek
* fix ipc open error
* fix head_size calculation error
* fix interface
* fix get num_matched_tokens from trtllm
* fix head_size calculation error
* fix interface
* fix short len
* remove code
* add patch file
* modify scripts
* TensorRT-LLM will wait until kvmanager is ready
* [bugfix] fix token alignment issue in tensorrt-llm by rounding down to block size
* trivial
* support flexkv + cuda graph using flexkv
* modify patch
* modify scripts
* [bugfix] fix some bug fix bug
* fix radix_tree
* modify scripts
* add debug log
* modify scripts
* fix rebase error
* fix radix tree
* fix scripts
* use new config
* rename
* fix script
* add branch for calculation of aligned_length
* add branch for remote_process
* take another way to determine branch
* fix scripts
* remove useless env and config
* remove useless commit
---------
Co-authored-by: zhuofan1123 <zhuofanl@nvidia.com>
Co-authored-by: linhu-nv <linhu@nvidia.com>
Co-authored-by: Luis-xu <hfutxjn@163.com>
Co-authored-by: annz <annz@nvidia.com>
Co-authored-by: leolingli <leolingli@tencent.com>
* [bugfix] fix wrong head number for deepseek
* [bugfix] fix error on put when the cpu match len is bigger than the ssd match len
* fix radix_tree (#39)
* fix empty
---------
Co-authored-by: leolingli <leolingli@tencent.com>
* add patch
* init docs
* fin readme
* rename yml
* fix readme
* fix readme
* update docs
* fix docs
* fix docs
* fix docs
* add title
* add readme_en
* Update docs/trtllm_adaption/README_en.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: zhuofan1123 <zhuofanl@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix bug for tp16
* update vllm adapter to support tp16
linhu-nv approved these changes on Nov 25, 2025
zhuofan1123 approved these changes on Nov 25, 2025
No description provided.