[V0][Bugfix] Fix Mamba cache crashing #15296
Conversation
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger a full CI run by default. Instead, it would only run fastcheck CI, a small and essential subset of CI tests to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. 🚀
@tlrmchlsmth is the expert on Mamba.
This looks good. It looks like the fix is already in async_llm_engine.py but not llm_engine.py, so please merge in latest main. (And please make sure the code is equivalent in the two files as well.) Thank you!
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
@tlrmchlsmth conflict resolved, diff looks good, ready to merge.
Is this change not going to be merged? Any updates?
@benchislett could you merge in latest main?
When a request is finished but the scheduler has no more requests, finished_req_ids is cleared from the scheduler while model execution is skipped. As a result, the Mamba cache never frees the slots for those requests, leading to a slow buildup of unavailable slots until none are left. This PR changes the behaviour of the LLM engine to clear finished_req_ids only when there are scheduled requests to process.
FIX #13129
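To illustrate the failure mode, here is a minimal Python sketch of the slot leak and the guard described above. All names here (MambaCacheManager, pop_finished_req_ids, scheduled_requests, etc.) are hypothetical stand-ins chosen for the example, not vLLM's actual engine APIs.

```python
# Minimal sketch of the bug and fix described in this PR.
# Names are illustrative; they do not mirror vLLM's actual
# llm_engine.py / async_llm_engine.py code.

class MambaCacheManager:
    """Tracks per-request cache slots; slots must be freed when a request finishes."""

    def __init__(self, num_slots: int) -> None:
        self.free_slots = set(range(num_slots))
        self.request_to_slot: dict[str, int] = {}

    def allocate(self, req_id: str) -> int:
        if not self.free_slots:
            # This is the crash the PR fixes: leaked slots eventually
            # exhaust the cache even though requests have finished.
            raise RuntimeError("Mamba cache exhausted: no free slots")
        slot = self.free_slots.pop()
        self.request_to_slot[req_id] = slot
        return slot

    def release_finished(self, finished_req_ids: set[str]) -> None:
        for req_id in finished_req_ids:
            slot = self.request_to_slot.pop(req_id, None)
            if slot is not None:
                self.free_slots.add(slot)


def engine_step(scheduler, cache: MambaCacheManager, execute_model) -> None:
    scheduler_output = scheduler.schedule()

    # Buggy behaviour: finished_req_ids was consumed (cleared) on every
    # step, even when nothing was scheduled and model execution was
    # skipped -- so the cache never saw those ids and leaked their slots.
    #
    # Fix: consume finished_req_ids only when there are scheduled
    # requests, i.e. when the model (and the cache cleanup that runs
    # alongside it) will actually execute.
    if scheduler_output.scheduled_requests:
        finished = scheduler.pop_finished_req_ids()  # clears the set
        cache.release_finished(finished)
        execute_model(scheduler_output, finished)
```

The key design point, under these assumptions, is that clearing the finished set and freeing the corresponding cache slots must happen in the same step: if the set is cleared on a step where execution is skipped, the slot information is lost for good.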