[WIP][Bugfix] Fix xgrammar nanobind leaked objects at shutdown by haosdent · Pull Request #34690 · vllm-project/vllm

haosdent · 2026-02-17T11:34:25Z

Purpose

Fix nanobind leak warnings (GrammarMatcher, CompiledGrammar) when using xgrammar as the structured output backend. The warnings appear at process exit:

nanobind: leaked 2 instances!
 - leaked instance of type "GrammarMatcher"
 - leaked instance of type "CompiledGrammar"
nanobind: leaked 2 types!
nanobind: leaked 16 functions!

Root cause: Per-request xgrammar nanobind objects were not released during shutdown due to two gaps:

- Scheduler.shutdown() did not clear its self.requests dict, so Request -> StructuredOutputRequest -> XgrammarGrammar (holding matcher and ctx nanobind objects) remained referenced until Python interpreter teardown, when nanobind metadata may already be freed.
- LLMEngine.__del__() did not call engine_core.shutdown(), so in in-process mode (VLLM_ENABLE_V1_MULTIPROCESSING=0) the entire cleanup chain was never triggered. (Compare: AsyncLLM.__del__() already calls self.shutdown().)

Fix (3 targeted changes):

- scheduler.py: Add self.requests.clear() in Scheduler.shutdown() to release per-request xgrammar objects deterministically during shutdown.
- llm_engine.py: Add engine_core.shutdown() in LLMEngine.__del__() to ensure the cleanup chain runs in in-process mode (mirrors existing AsyncLLM pattern).
- structured_output/__init__.py: Shut down ThreadPoolExecutors in clear_backend() to cancel pending grammar compilations, and set self.backend = None to make repeated calls idempotent.
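
For illustration only, here is a minimal sketch of the release pattern described above, using toy stand-ins rather than the real vLLM classes. ToyGrammar, ToyRequest, ToyScheduler and ToyStructuredOutputManager are hypothetical names, and the actual shutdown code may be structured differently. The sketch assumes CPython reference counting, so dropping the last reference frees the object immediately.

```python
import weakref
from concurrent.futures import ThreadPoolExecutor


class ToyGrammar:
    """Stand-in for a per-request object wrapping native (nanobind) state."""


class ToyRequest:
    def __init__(self, grammar):
        self.grammar = grammar


class ToyScheduler:
    def __init__(self):
        self.requests = {}

    def shutdown(self):
        # Drop per-request references now instead of at interpreter teardown.
        self.requests.clear()


class ToyStructuredOutputManager:
    def __init__(self):
        self.executor = ThreadPoolExecutor(max_workers=1)
        self.backend = object()

    def clear_backend(self):
        if self.backend is None:
            return  # already cleared; repeated calls are no-ops
        # Cancel queued work that has not started yet (Python 3.9+).
        self.executor.shutdown(wait=False, cancel_futures=True)
        self.backend = None


def demo():
    scheduler = ToyScheduler()
    manager = ToyStructuredOutputManager()
    grammar = ToyGrammar()
    probe = weakref.ref(grammar)  # observe the object without keeping it alive
    scheduler.requests["req-0"] = ToyRequest(grammar)
    del grammar  # the scheduler's dict now holds the only strong reference
    alive_before = probe() is not None
    scheduler.shutdown()
    manager.clear_backend()
    manager.clear_backend()  # second call returns early
    alive_after = probe() is not None
    return alive_before, alive_after
```

In this toy setup the grammar object stays alive only while the requests dict references it, which is the same reference chain the root-cause analysis describes.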

Test Plan

Run the reproduction script from issue [Bug]: xgrammar cleanup leakage #26363 -- verify no nanobind leak warnings appear at exit.
Test both modes: default multiprocess and VLLM_ENABLE_V1_MULTIPROCESSING=0 (in-process).
Run existing unit tests:
- pytest tests/v1/core/test_scheduler.py
- pytest tests/v1/engine/test_llm_engine.py
- pytest tests/v1/structured_output/
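
The two-mode check could be scripted roughly as below. This is a sketch under assumptions: run_and_check and find_leak_lines are hypothetical helpers, script_path stands for whatever reproduction script is used, and the warnings are matched by the "nanobind: leaked" prefix seen in the output quoted in Purpose. Both stdout and stderr are scanned so the check does not depend on which stream the warnings reach.

```python
import os
import subprocess
import sys

LEAK_MARKER = "nanobind: leaked"


def find_leak_lines(output_text):
    """Return the lines of process output that contain a nanobind leak summary."""
    return [line for line in output_text.splitlines() if LEAK_MARKER in line]


def run_and_check(script_path, in_process):
    """Run the script in one mode; return (exit code, list of leak lines)."""
    env = dict(os.environ)
    if in_process:
        env["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"
    else:
        # Leave the variable unset so the default (multiprocess) mode applies.
        env.pop("VLLM_ENABLE_V1_MULTIPROCESSING", None)
    proc = subprocess.run(
        [sys.executable, script_path],
        env=env,
        capture_output=True,
        text=True,
    )
    return proc.returncode, find_leak_lines(proc.stdout + "\n" + proc.stderr)
```

A run counts as clean when the exit code is 0 and the returned list is empty in both modes.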

Test Result

Reproduction script (default multiprocess mode):

Script completed successfully, generated correct structured JSON output.
No nanobind leak warnings (leaked instances, leaked types, leaked functions -- all absent).

Reproduction script (in-process mode, VLLM_ENABLE_V1_MULTIPROCESSING=0):

Script completed successfully, generated correct structured JSON output.
No nanobind leak warnings. Only a pre-existing PyTorch NCCL destroy_process_group warning remains (unrelated to xgrammar).

Unit tests:

tests/v1/structured_output/: 27 passed
tests/v1/core/test_scheduler.py: 86 passed, 1 skipped, 1 pre-existing failure (test_async_scheduling_pp_allows_rescheduling_with_output_placeholders -- fails on main as well)
tests/v1/engine/test_llm_engine.py: 4 passed, 1 pre-existing failure (GPU memory issue in test environment), 1 pre-existing failure (HuggingFace download issue in test environment)

…ct#26363)

Fix nanobind leak warnings for GrammarMatcher and CompiledGrammar objects when using xgrammar as the structured output backend. The root cause was that per-request xgrammar objects were not released during shutdown: Scheduler.shutdown() did not clear its requests dict, and LLMEngine.__del__() did not trigger the shutdown chain at all in in-process mode.

Signed-off-by: haosdent <haosdent@gmail.com>

gemini-code-assist

Code Review

This pull request addresses a nanobind object leak at shutdown when using xgrammar. The changes introduce proper cleanup logic in several places. In scheduler.py, self.requests is cleared during shutdown to release references to per-request objects. In llm_engine.py, engine_core.shutdown() is now called from LLMEngine.__del__, ensuring the shutdown sequence is triggered in-process, consistent with AsyncLLM. Finally, in structured_output/__init__.py, ThreadPoolExecutors are properly shut down to prevent dangling resources. These changes seem correct and effectively address the reported memory leak. My only concern is the reliance on __del__ for cleanup in LLMEngine, which can be unreliable.

gemini-code-assist · 2026-02-17T11:36:03Z

vllm/v1/engine/llm_engine.py

+        if engine_core := getattr(self, "engine_core", None):
+            engine_core.shutdown()


Using __del__ for cleanup is unreliable. It's not guaranteed to be called, especially during interpreter shutdown or if reference cycles exist. This can lead to resource leaks, which this PR aims to fix. Consider providing an explicit shutdown() method on LLMEngine or implementing it as a context manager for more deterministic cleanup.
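
One possible shape for the context-manager suggestion, as a sketch only: managed_engine is a hypothetical helper, not an existing vLLM API, and it assumes nothing beyond what the diff above shows (an engine_core attribute with a shutdown() method). FakeCore and FakeEngine are toy stand-ins used to exercise it.

```python
from contextlib import contextmanager


@contextmanager
def managed_engine(engine):
    """Yield `engine`, then shut down its engine_core even if the body raises."""
    try:
        yield engine
    finally:
        # Same guarded lookup as the diff: tolerate a missing engine_core.
        engine_core = getattr(engine, "engine_core", None)
        if engine_core is not None:
            engine_core.shutdown()


class FakeCore:
    """Toy stand-in that counts shutdown() calls."""

    def __init__(self):
        self.shutdown_calls = 0

    def shutdown(self):
        self.shutdown_calls += 1


class FakeEngine:
    """Toy stand-in exposing only the engine_core attribute."""

    def __init__(self):
        self.engine_core = FakeCore()


if __name__ == "__main__":
    engine = FakeEngine()
    with managed_engine(engine):
        pass  # requests would be issued here
    print(engine.engine_core.shutdown_calls)
```

Unlike __del__, the finally block runs at a known point, before interpreter teardown begins.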

haosdent requested review from ApostaC, WoosukKwon, aarnphm, alexm-redhat, benchislett, heheda12345, mgoin, njhill, orozery, robertgshaw2-redhat, russellb and ywang96 as code owners February 17, 2026 11:34

mergify bot added the structured-output, v1, and bug labels Feb 17, 2026

github-project-automation bot added this to Structured Output Feb 17, 2026

gemini-code-assist bot reviewed Feb 17, 2026


haosdent changed the title ~~[Bugfix] Fix xgrammar nanobind leaked objects at shutdown~~ [WIP][Bugfix] Fix xgrammar nanobind leaked objects at shutdown Feb 17, 2026

haosdent mentioned this pull request Feb 17, 2026

[Bug]: xgrammar cleanup leakage #26363

Open

1 task

haosdent wants to merge 1 commit into vllm-project:main from haosdent:fix-26363
