Skip to content

Conversation

@southfreebird
Copy link
Contributor

@southfreebird southfreebird commented Sep 30, 2025

Purpose

Adding the frequency_penalty, presence_penalty, and repetition_penalty sampling parameters to the serve tool. It allows enabling them for performance measurement.

Test Plan

Example for frequency_penalty:

vllm bench serve --model $MODEL_NAME --dataset-name sharegpt --num-prompts 200 --dataset-path $DATASET_PATH --seed 0 --frequency-penalty 1.0

Test Result

@mergify mergify bot added the performance Performance-related issues label Sep 30, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for frequency_penalty, presence_penalty, and repetition_penalty sampling parameters to the vllm bench serve tool. The implementation is correct in passing these parameters to the backend. However, there is a lack of client-side validation for the values of these new parameters. I've added a comment to suggest adding validation to improve user experience and prevent benchmark failures due to invalid inputs.

Comment on lines +1104 to +1100
sampling_group.add_argument(
"--frequency-penalty",
type=float,
default=None,
help="Frequency penalty sampling parameter. Only has effect on "
"openai-compatible backends.",
)
sampling_group.add_argument(
"--presence-penalty",
type=float,
default=None,
help="Presence penalty sampling parameter. Only has effect on "
"openai-compatible backends.",
)
sampling_group.add_argument(
"--repetition-penalty",
type=float,
default=None,
help="Repetition penalty sampling parameter. Only has effect on "
"openai-compatible backends.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The newly added penalty parameters (frequency_penalty, presence_penalty, repetition_penalty) are parsed as floats without any validation. The OpenAI API, and vLLM's implementation of it, has specific valid ranges for these parameters:

  • frequency_penalty: between -2.0 and 2.0.
  • presence_penalty: between -2.0 and 2.0.
  • repetition_penalty: must be a positive float.

Passing values outside these ranges will cause requests to fail at the server level, which could be confusing for users running benchmarks. It would be better to add client-side validation for these parameters to provide immediate and clear feedback on invalid inputs. This could be done using a custom type function with argparse.

@simon-mo simon-mo enabled auto-merge (squash) September 30, 2025 22:36
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 30, 2025
auto-merge was automatically disabled October 1, 2025 09:32

Head branch was pushed to by a user without write access

@southfreebird southfreebird force-pushed the feature/add-frequency-penalties-to-serve branch 2 times, most recently from a9082b2 to fe78868 Compare October 1, 2025 14:48
Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
@southfreebird southfreebird force-pushed the feature/add-frequency-penalties-to-serve branch from fe78868 to 7199ec1 Compare October 1, 2025 19:03
@simon-mo simon-mo merged commit b71fcd4 into vllm-project:main Oct 3, 2025
46 checks passed
tomeras91 pushed a commit to tomeras91/vllm that referenced this pull request Oct 6, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
karan pushed a commit to karan/vllm that referenced this pull request Oct 6, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
Signed-off-by: Karan Goel <3261985+karan@users.noreply.github.com>
southfreebird added a commit to southfreebird/vllm that referenced this pull request Oct 7, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
sducouedic pushed a commit to sducouedic/vllm that referenced this pull request Oct 16, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…25974)

Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Performance-related issues ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants