[WIP][Bugfix] Fix illegal memory access in causal_conv1d Triton kernels with CUDA graphs by haosdent · Pull Request #34685 · vllm-project/vllm

haosdent · 2026-02-17T10:06:07Z

Purpose

Fix illegal memory access (CUDA error) when running hybrid models (e.g., Qwen3.5-397B-A17B) with CUDA graphs enabled, particularly with data parallel + expert parallel configurations.

Root cause: The causal_conv1d Triton kernels use == pad_slot_id equality checks to detect padded CUDA graph entries. pad_slot_id is declared as tl.constexpr (a compile-time Python int -1), while the values loaded from tensors are cast to tl.int64 at runtime. This cross-type equality comparison can silently fail in Triton due to type promotion semantics. When the check fails, the kernel proceeds to use -1 (interpreted as 0xFFFFFFFFFFFFFFFF in unsigned int64) as a memory offset into conv_state, causing an out-of-bounds access.
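To make the last step of that chain concrete, here is a minimal sketch that uses only Python's standard `struct` module. It shows that a signed 64-bit -1 and the unsigned value 0xFFFFFFFFFFFFFFFF share one bit pattern, and that an unguarded offset of -1 resolves to one row before the start of the buffer. The base address and row stride are made-up illustrative numbers, not values taken from vLLM, and the snippet does not attempt to reproduce Triton's comparison behavior.

```python
import struct

PAD_SLOT_ID = -1  # sentinel value named in the description above

# Reinterpret the 8 bytes of a signed int64 -1 as an unsigned int64.
raw = struct.pack("<q", PAD_SLOT_ID)
as_unsigned = struct.unpack("<Q", raw)[0]
assert as_unsigned == 0xFFFFFFFFFFFFFFFF

# Hypothetical numbers, for illustration only.
base_addr = 0x1000      # assumed start address of conv_state
row_stride_bytes = 256  # assumed number of bytes per cache row


def row_address(slot: int) -> int:
    """Address of a cache row when the slot index is used without a guard."""
    return base_addr + slot * row_stride_bytes


# A valid slot stays inside the buffer; the pad sentinel points before it.
assert row_address(0) == base_addr
assert row_address(PAD_SLOT_ID) == base_addr - row_stride_bytes
```

Whether the offset is read as signed or unsigned, the resulting address lies outside the allocation, which is consistent with the reported illegal memory access.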

Fix: Replace == pad_slot_id with < 0 at 3 locations in causal_conv1d.py. This is robust because valid slot/state indices are always non-negative, and PAD_SLOT_ID = -1 is the only negative sentinel value. This matches the pattern already used by the working fused_recurrent_gated_delta_rule_fwd_kernel.
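The reasoning behind the replacement can be sketched in plain Python. The two helper functions below are hypothetical and exist only for this illustration; they are not part of causal_conv1d.py. In plain Python the equality check works as expected, so this only demonstrates that the two predicates agree whenever -1 is the only negative value, and that `< 0` additionally rejects any other negative index.

```python
PAD_SLOT_ID = -1  # sentinel value named in the description above


def pad_mask_eq(indices, pad_slot_id=PAD_SLOT_ID):
    """Hypothetical helper: mark padded entries by equality with the sentinel."""
    return [i == pad_slot_id for i in indices]


def pad_mask_neg(indices):
    """Hypothetical helper: mark padded entries by sign."""
    return [i < 0 for i in indices]


# With only valid slots and the sentinel, both predicates give the same mask.
indices = [3, 0, PAD_SLOT_ID, 7, PAD_SLOT_ID]
assert pad_mask_eq(indices) == pad_mask_neg(indices)

# If any other negative value ever appeared, only the sign check would skip it.
unexpected = [3, -2, PAD_SLOT_ID]
assert pad_mask_eq(unexpected) == [False, False, True]
assert pad_mask_neg(unexpected) == [False, True, True]
```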

Fixes #34619

Test Plan

Run existing causal_conv1d kernel tests covering update, update_with_batch_gather, and varlen scenarios.
Full reproduction of the original issue requires an 8-GPU setup with -dp 8 --enable-expert-parallel and the Qwen3.5-397B-A17B model.
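Without that hardware, the property the padded-entry check is meant to guarantee can still be stated as a small CPU-only reference model. The sketch below is a simplified, single-channel stand-in written for this illustration; `conv1d_update_reference` is a hypothetical name, and the choice to emit 0.0 for padded entries is an assumption, not the behavior of the vLLM kernels.

```python
PAD_SLOT_ID = -1  # sentinel value named in the description above


def conv1d_update_reference(x, conv_state, weight, state_indices):
    """Simplified single-channel update with one new input per batch entry.

    conv_state[slot] holds the last len(weight) inputs for that slot.
    Padded entries (negative index) yield 0.0 and leave conv_state untouched.
    """
    outputs = []
    for value, slot in zip(x, state_indices):
        if slot < 0:  # padded CUDA graph entry: skip all state access
            outputs.append(0.0)
            continue
        window = conv_state[slot][1:] + [value]  # shift left, append new input
        conv_state[slot] = window
        outputs.append(sum(w * v for w, v in zip(weight, window)))
    return outputs


conv_state = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
out = conv1d_update_reference(
    x=[10.0, 99.0],
    conv_state=conv_state,
    weight=[1.0, 1.0, 1.0],
    state_indices=[1, PAD_SLOT_ID],
)
assert out == [21.0, 0.0]                  # 5 + 6 + 10 for slot 1, padded entry skipped
assert conv_state[0] == [1.0, 2.0, 3.0]    # untouched slot is unchanged
assert conv_state[1] == [5.0, 6.0, 10.0]   # slot 1 window was shifted
```

A check of this shape only confirms that padded entries never touch the state; it cannot show whether the Triton comparison itself misbehaves on the GPU.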

Test Result

All 164 tests pass:

$ python -m pytest tests/kernels/mamba/test_causal_conv1d.py -v
======================= 164 passed, 3 warnings in 23.53s =======================

However, I don't have an 8-GPU machine to run Qwen3.5-397B-A17B, so the original failure has not been reproduced with this change.

…th CUDA graphs

Replace unreliable `== pad_slot_id` comparisons with robust `< 0` checks in causal_conv1d Triton kernels to prevent out-of-bounds memory access when CUDA graph padding introduces PAD_SLOT_ID (-1) entries.

Fixes vllm-project#34619

Signed-off-by: haosdent <haosdent@gmail.com>

gemini-code-assist

Code Review

This pull request provides a fix for a critical illegal memory access bug in the causal_conv1d Triton kernels. The root cause is a subtle type comparison issue in Triton, which is well-documented in the description. The change from == pad_slot_id to < 0 is a robust and correct solution to this problem, ensuring that padded slots are handled correctly without risking out-of-bounds memory access. The fix is applied consistently and appears to be a solid improvement.

vadiklyutiy · 2026-02-17T10:55:14Z

When the check fails, the kernel proceeds to use -1 (interpreted as 0xFFFFFFFFFFFFFFFF in unsigned int64) as a memory offset into conv_state, causing an out-of-bounds access.

Curious: where does the unsigned int come from? As far as I can see, all the types are int64, which is signed...

vadiklyutiy · 2026-02-17T11:40:57Z

With this PR applied, I still get the same illegal memory access :(

haosdent requested a review from tdoublep as a code owner February 17, 2026 10:06

mergify bot added the nvidia and bug (Something isn't working) labels Feb 17, 2026

github-project-automation bot added this to NVIDIA Feb 17, 2026

haosdent changed the title ~~[Bugfix] Fix illegal memory access in causal_conv1d Triton kernels with CUDA graphs~~ [WIP][Bugfix] Fix illegal memory access in causal_conv1d Triton kernels with CUDA graphs Feb 17, 2026

gemini-code-assist bot reviewed Feb 17, 2026


vadiklyutiy mentioned this pull request Feb 17, 2026

[Bug]: Qwen3.5. illegal memory access #34619

Open

1 task

haosdent wants to merge 1 commit into vllm-project:main from haosdent:fix-34619
