Is your feature request related to a problem? Please describe.
vLLM and llama-server both accept min_p (and more) as a valid field of the chat completion request, but this library's ChatCompletionRequest only has TopP.
Describe the solution you'd like
Add MinP and the other missing sampling fields listed at https://github.com/ggml-org/llama.cpp/blob/34b7c0439ed0f98575cc4689dfecd98991dee8be/tools/server/server.cpp#L254-L278.
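A minimal sketch of what this could look like, mirroring the llama.cpp server parameters linked above. Field names, tags, and the package name are assumptions for illustration, not the library's confirmed API:

```go
package openai

// Sketch of the proposed additions to ChatCompletionRequest, following
// llama.cpp's server sampling parameters. Names here are assumed.
type ChatCompletionRequest struct {
	Model string  `json:"model"`
	TopP  float32 `json:"top_p,omitempty"` // already supported today

	// MinP keeps only tokens whose probability is at least MinP times
	// the probability of the most likely token; 0 disables the filter.
	MinP float32 `json:"min_p,omitempty"`
	// TopK restricts sampling to the K most likely tokens; 0 disables it.
	TopK int `json:"top_k,omitempty"`
	// RepeatPenalty penalizes recently generated tokens; 1.0 disables it.
	RepeatPenalty float32 `json:"repeat_penalty,omitempty"`
}
```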
Additional context
vllm-project/vllm#2287
Some models, such as MN-12B-Mag-Mell-R1, recommend setting min_p to an unusually high value like 0.2 (IIRC the llama.cpp default is 0.05).
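For illustration, with the hypothetical MinP field from the sketch above, following such a model recommendation would just be:

```go
// Hypothetical usage once a MinP field exists (names as assumed above).
req := ChatCompletionRequest{
	Model: "MN-12B-Mag-Mell-R1",
	MinP:  0.2, // value recommended for this model, serialized as "min_p"
}
_ = req
```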