Update vLLM to 0.10.1 #142
Conversation
nilai-models/pyproject.toml (Outdated)
dependencies = [
    "httpx>=0.27.2",
    "nilai-common",
    "vllm>=0.10.1",
Is this necessary? vLLM should already be installed in the Docker container, since we're using vllm-openai as the base image, right? vLLM is a heavy dependency; if it's not needed, I would remove it.
Not needed; I've removed it.
tests/e2e/test_openai.py (Outdated)
# Convert to dict to access the data safely
first_call_dict = first_call.model_dump()

# Extract function name and arguments from the tool call
if "function" in first_call_dict:
    function_name = first_call_dict["function"]["name"]
    function_args = first_call_dict["function"]["arguments"]
else:
    # Fallback for different structure
    function_name = first_call_dict.get("name", "")
    function_args = first_call_dict.get("arguments", "")

assert function_name == "get_weather", "Function name should be get_weather"
Just out of curiosity: why is this change necessary? Is there a fundamental change in how vLLM behaves now that makes this no longer work?
vLLM ≥0.10 aligned its tool-calling objects with OpenAI's tool_calls[].function = {name, arguments} schema; in 0.7.x the same information was exposed flat as {name, arguments} on the tool call itself. I have removed the fallback condition, since we only use vLLM ≥0.10.
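For reference, a minimal sketch of the two shapes (the dictionaries below are illustrative placeholders, not actual vLLM responses):

```python
# vLLM >= 0.10 follows the OpenAI schema: name/arguments nested under "function".
new_style_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Madrid"}'},
}

# vLLM 0.7.x exposed the same fields flat on the tool call itself.
old_style_call = {
    "id": "call_0",
    "name": "get_weather",
    "arguments": '{"city": "Madrid"}',
}

# With only vLLM >= 0.10 supported, the test can read the nested form directly:
function_name = new_style_call["function"]["name"]
function_args = new_style_call["function"]["arguments"]
assert function_name == "get_weather"
```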
jcabrero left a comment:
Check https://docs.vllm.ai/en/latest/configuration/engine_args.html#-max-num-batched-tokens.
- It is wrongly specified: you write max_num_batched_tokens, but the flag supported by vLLM is max-num-batched-tokens, so the setting has no effect (see the sketch after this list).
- I am not sure we need it at all, since by default it is None and derived from the usage context (i.e., max-model-len).
- I can understand the Llama 1B GPU file, where no max-model-len is specified, having this parameter.
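As a hedged illustration of the naming difference (assuming the GPU file passes these as vLLM engine/CLI arguments; the model id below is a placeholder): the Python-side EngineArgs uses underscores and defaults the value to None, while the CLI flag is hyphenated.

```python
# Minimal sketch, assuming vLLM >= 0.10 is installed; the model id is a placeholder.
from vllm import EngineArgs

args = EngineArgs(model="meta-llama/Llama-3.2-1B-Instruct")
print(args.max_num_batched_tokens)  # None -> resolved later from the usage context

# On the CLI, the equivalent override would use the hyphenated flag, e.g.:
#   vllm serve <model> --max-num-batched-tokens 8192 --max-model-len 8192
```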
jcabrero left a comment:
Great job 🥇
I think once you merge one of the two PRs it will conflict with the changes to docker/api.Dockerfile, but resolving that should just be a matter of dropping those changes in whichever PR is merged second.