V1 - don't look for a bucket we know doesn't exist #1606
Conversation
/run-gaudi-tests
Pull Request Overview
This PR introduces an early exit in the 2D prompt bucketing logic when the batch size exceeds the maximum allowed, and updates the merge check to handle the new sentinel return values.
- Added a pre-check in _bucketize_2d_prompt to return (None, None, None) for oversized batches (a rough sketch of this logic follows after this list).
- Updated _can_merge_prefill_contents to treat any None in the bucketing result as a non-mergeable case.
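The change can be pictured roughly as follows. This is a minimal, self-contained sketch of the idea, not the actual hpu_model_runner.py code; the bucket lists, the rounding helper, and the free-standing function signatures are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Illustrative stand-ins for the runner's configured prompt buckets and limits.
PROMPT_BS_BUCKETS: List[int] = [1, 2, 4, 8]
PROMPT_SEQ_BUCKETS: List[int] = [128, 256, 512, 1024]
MAX_PREFILL_BATCH_SIZE = 8
BLOCK_SIZE = 128


def _round_up_to_bucket(value: int, buckets: List[int]) -> Optional[int]:
    """Return the smallest bucket >= value, or None if no bucket is large enough."""
    for bucket in buckets:
        if value <= bucket:
            return bucket
    return None


def bucketize_2d_prompt(
    batch_size: int, seq_len: int
) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    # New pre-check: a batch larger than the prefill limit can never match a
    # bucket, so return sentinels instead of searching for one.
    if batch_size > MAX_PREFILL_BATCH_SIZE:
        return (None, None, None)
    padded_bs = _round_up_to_bucket(batch_size, PROMPT_BS_BUCKETS)
    padded_seq = _round_up_to_bucket(seq_len, PROMPT_SEQ_BUCKETS)
    num_blocks = None if padded_seq is None else padded_seq // BLOCK_SIZE
    return (padded_bs, padded_seq, num_blocks)


def can_merge_prefill_contents(merged_batch_size: int, merged_seq_len: int) -> bool:
    # Updated check: any None in the bucketing result means the merged batch
    # would not fit a valid bucket, so the prefills must not be merged.
    bucket = bucketize_2d_prompt(merged_batch_size, merged_seq_len)
    return all(dim is not None for dim in bucket)
```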
Comments suppressed due to low confidence (2)
vllm/v1/worker/hpu_model_runner.py:969
- [nitpick] Consider renaming the variable bs to batch_size for improved readability and to make its purpose immediately clear.
if bs > self.max_prefill_batch_size:
vllm/v1/worker/hpu_model_runner.py:969
- Add a unit test to verify that _bucketize_2d_prompt returns (None, None, None) when the batch size exceeds max_prefill_batch_size, ensuring this new branch is covered (a sketch of such a test is shown below).
if bs > self.max_prefill_batch_size:
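A test along those lines might look like the following. It is written against the simplified sketch shown earlier rather than the real HPUModelRunner, whose construction and method names may differ.

```python
def test_bucketize_2d_prompt_rejects_oversized_batch():
    # A batch size above the prefill limit should short-circuit to sentinels.
    oversized = MAX_PREFILL_BATCH_SIZE + 1
    assert bucketize_2d_prompt(oversized, 128) == (None, None, None)
    # The merge check must treat the sentinel result as non-mergeable.
    assert can_merge_prefill_contents(oversized, 128) is False
```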
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
/run-gaudi-tests
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
ripped from: HabanaAI/vllm-fork#1606; fixes a weird bucketing anomaly where bs=1 prefills would be padded to bs=2 and trigger a recompilation.
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
No description provided.