Insights: ModelTC/LightLLM
Overview
1 Release published by 1 person
- v1.1.0: LightLLM v1.1.0 Release! (published Sep 3, 2025)
34 Pull requests merged by 9 people
- vit fa3 api fix (#1047, merged Sep 8, 2025)
- [fix]fix fp8 bug when load moe model (#1045, merged Sep 8, 2025)
- add stream_options for openai api (#1046, merged Sep 8, 2025; see the sketch after this list)
- fix mtp mem alloc in overlap manner (#1044, merged Sep 8, 2025)
- force to warmup triton autotune configs in start. (#1043, merged Sep 5, 2025)
- fix tl.where warning (#1041, merged Sep 4, 2025)
- v100 triton kernel fix (#1040, merged Sep 3, 2025)
- LightLLM v1.1.0 release! (#1039, merged Sep 3, 2025)
- add qwen235b autotune config (#1038, merged Sep 3, 2025)
- fix autotune and benchmark (#1037, merged Sep 3, 2025)
- group deepgemm update api (#1035, merged Sep 3, 2025)
- fix benchmark (#1033, merged Sep 3, 2025)
- tuning optimization (#1032, merged Sep 2, 2025)
- Add setproctitle (#1024, merged Sep 2, 2025)
- add AutotuneLevel for more detailed autotune (#1031, merged Sep 1, 2025)
- fix autotuning warmup length (#1028, merged Aug 29, 2025)
- Autotuner (#1020, merged Aug 28, 2025)
- fix input_penalty token_id async update bug. (#1022, merged Aug 25, 2025)
- Dp balancer (#991, merged Aug 25, 2025)
- fix check_recommended_shm_size (#1021, merged Aug 22, 2025)
- add greedy_sample (#1019, merged Aug 22, 2025)
- fix mtp static bench (#1009, merged Aug 21, 2025)
- support more PD node select func. such as random or roundrobin. (#1018, merged Aug 21, 2025)
- feat: support more PD node select func (#970, merged Aug 21, 2025)
- Add multimodal token usage (#1016, merged Aug 21, 2025)
- Add multimodal token usage (#1011, merged Aug 21, 2025)
- feat: add stop string matching (#969, merged Aug 20, 2025)
- Fix dynamic_prompt_cache for chunked prefill (#1010, merged Aug 20, 2025)
- deepseek && qwen tp performance tuning (#934, merged Aug 20, 2025)
- Fix the overflow issue caused by the mem index type being int32 in the decode att operator. (#1013, merged Aug 20, 2025)
- [Misc] Add a progress bar when loading the model (#1008, merged Aug 20, 2025)
- Add shm size check (#978, merged Aug 18, 2025)
- [opt]opti-qwen2-vl-vit (#1004, merged Aug 14, 2025)
- Fix error illegal memory access when max_total_token_num is too large (#998, merged Aug 12, 2025)
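As a pointer for the stream_options change in #1046: a minimal sketch of a streaming request against a LightLLM OpenAI-compatible endpoint is shown below. The base URL, API key, and model name are assumptions, and the stream_options/include_usage field follows the standard OpenAI chat-completions convention rather than anything specific confirmed here.

```python
# Minimal sketch, assuming a LightLLM server exposing an OpenAI-compatible API
# at http://localhost:8000/v1 (host, port, and model name are assumptions).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="default",  # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    # stream_options is the standard OpenAI field; include_usage asks the
    # server to append a final chunk containing token usage statistics.
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Print incremental text deltas as they arrive.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    # The final chunk carries usage when include_usage is enabled.
    if chunk.usage is not None:
        print("\nusage:", chunk.usage)
```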
7 Pull requests opened by 6 people
- add fa3_mtp (#1005, opened Aug 11, 2025)
- [support] vit and llm disaggregation (#1014, opened Aug 20, 2025)
- Optimize multimodal resource allocation with concurrency and improved batch RPC (#1017, opened Aug 21, 2025)
- Add Support For GPT-OSS Model (#1023, opened Aug 27, 2025)
- Use environment variable for RMSNORM_WARPS (#1027, opened Aug 29, 2025)
- Mineru adapt (#1034, opened Sep 3, 2025)
- pd with nixl backend (rebase main) (#1042, opened Sep 4, 2025)
4 Issues closed by 4 people
- [BUG]LLVM ERROR: Failed to compute parent layout for slice layout. (#1030, closed Sep 3, 2025)
- V100 has compute capability sm70 while Feature 'cvt.bf16.f32' requires .target sm_80 or higher (#1029, closed Aug 30, 2025)
- Question about tp (#1006, closed Aug 22, 2025)
- [Feature] Openai GPT OSS Support (#1012, closed Aug 20, 2025)
1 Issue opened by 1 person
- where can I find `lightllm_constraint_decode_kernel`? (#1007, opened Aug 15, 2025)
5 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Multimodal improve (#951, commented on Aug 13, 2025, 0 new comments)
- Fp8 deepseek (#975, commented on Sep 4, 2025, 0 new comments)
- Asynchicache (#977, commented on Aug 22, 2025, 0 new comments)
- Disk cache and cpu Cache feature (#997, commented on Aug 13, 2025, 0 new comments)
- Support Qwen models' dp>1 in PD (#999, commented on Aug 28, 2025, 0 new comments)