Issues: deepjavalibrary/djl-serving
- Why does the logging config in model.py not take effect? (bug, #2647, opened Dec 24, 2024 by DongZhaoXiong)
- djl-serving + vLLM SageMaker endpoint deployment is missing token metrics (bug, #2644, opened Dec 21, 2024 by DongZhaoXiong)
- Support for OpenAI function calling in vLLM (enhancement, #2638, opened Dec 13, 2024 by juliensimon)
- Support SageMaker async endpoint deployment (bug, #2567, opened Nov 17, 2024 by yinsong1986)
- Support extra-parameters in vLLM openai_compatible_server APIs (enhancement, #2543, opened Nov 12, 2024 by yinsong1986)
- TensorRT-LLM (TRT-LLM) LMI model format artifacts not found when deploying (bug, #2498, opened Oct 28, 2024 by joshight)
- AWQ with Marlin kernel errors out while loading the model in DJL 0.29 with vLLM (bug, #2486, opened Oct 24, 2024 by guptaanshul201989)
- [Doubt] In-flight batching support in T5 (enhancement, #2417, opened Oct 3, 2024 by vguruju)
- Upgrade to support latest vLLM version (max_lora_rank) (enhancement, #2389, opened Sep 16, 2024 by dreamiter)
- docker 0.29.0-pytorch-inf2 with meta-llama/Meta-Llama-3.1-8B-Instruct fails (bug, #2385, opened Sep 13, 2024 by yaronr)
- NeuronX compiler: specify data type (enhancement, #2378, opened Sep 11, 2024 by CoolFish88)
- Transformers NeuronX continuous batching support for Mistral 7B Instruct v3 (enhancement, #2377, opened Sep 11, 2024 by CoolFish88)
- Model conversion process failed: unable to find bin files (bug, #2365, opened Sep 5, 2024 by joshight)
- Mistral 7B custom inference with LMI not working: java.lang.IllegalStateException: Read chunk timeout (bug, #2362, opened Sep 5, 2024 by jeremite)
- awscurl: missing token metrics when the -t option is specified (bug, #2340, opened Aug 25, 2024 by CoolFish88)
- awscurl: WARN maxLength is not explicitly specified, use modelMaxLength: 512 (bug, #2339, opened Aug 25, 2024 by CoolFish88)
- djl-inference:0.29.0-tensorrtllm0.11.0-cu124 regression: has no attribute 'to_word_list_format' (bug, #2293, opened Aug 7, 2024 by lxning)
- Llama 2 7B chat model output quality is low (bug, #2093, opened Jun 21, 2024 by ghost)
- Error running multi-model endpoints in SageMaker (bug, #1911, opened May 15, 2024 by Najib-Haq)
- Document the /invocations endpoint (bug, #1905, opened May 14, 2024 by tenpura-shrimp)
- Better support for Prometheus metrics and/or allow custom Prometheus metrics (enhancement, #1827, opened Apr 27, 2024 by glennq)
- DJL-TensorRT-LLM bug: TypeError: Got unsupported ScalarType BFloat16 (bug, #1816, opened Apr 25, 2024 by rileyhun)