
[BUG] KVBM: PdConnector: enable_cross_layers_blocks causes NIXL hash mismatch #10027

@flpanbin

Describe the Bug

When PdConnector (a MultiConnector wrapping DynamoConnector + NixlConnector) is used on the prefill side and a standalone NixlConnector is used on the decode side, with "enable_cross_layers_blocks": true set on the NixlConnector in both configurations, vLLM throws a NIXL compatibility hash mismatch error during the KV transfer handshake.

The prefill side ends up not using cross-layer blocks while the decode side does. This asymmetry causes the compatibility hash check to fail, even though both instances use the same model, vLLM version, and TP size.

Steps to Reproduce

  1. Deploy prefill with PdConnector:
    {
      "kv_connector": "PdConnector",
      "kv_role": "kv_both",
      "kv_connector_module_path": "kvbm.vllm_integration.connector",
      "kv_connector_extra_config": {
        "connectors": [
          {
            "kv_connector": "DynamoConnector",
            "kv_role": "kv_both",
            "kv_connector_module_path": "kvbm.vllm_integration.connector"
          },
          {
            "kv_connector": "NixlConnector",
            "kv_role": "kv_both",
            "kv_connector_extra_config": {
              "enable_cross_layers_blocks": true
            }
          }
        ]
      }
    }
  2. Deploy decode with standalone NixlConnector:
    {
      "kv_connector": "NixlConnector",
      "kv_role": "kv_both",
      "kv_connector_extra_config": {
        "enable_cross_layers_blocks": true
      }
    }
  3. Send an inference request that triggers PD disaggregation.
  4. Decode worker crashes with:
    RuntimeError: NIXL compatibility hash mismatch. Local: ..., Remote: ... Prefill and decode instances have incompatible configurations.

Expected Behavior

  • Option A: DynamoConnector should support prefer_cross_layer_blocks = true (or have a configurable option) so that PdConnector can correctly propagate cross-layer block preference when all sub-connectors are configured to support it.

  • Option B: If DynamoConnector inherently cannot support cross-layer blocks, the limitation should be explicitly documented so users know not to set enable_cross_layers_blocks on either side when using PdConnector.
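
A minimal sketch of what Option A could look like, using hypothetical stand-in classes rather than the real KVBM or vLLM code. The class names (ConfigurableDynamoLike, NixlLike), the helper multi_prefers_cross_layer, and the reuse of the "enable_cross_layers_blocks" key for the Dynamo-side option are assumptions made for illustration only. Whether DynamoConnector can actually operate on a cross-layer layout is outside the scope of this sketch.

```python
class ConfigurableDynamoLike:
    """Hypothetical stand-in for a DynamoConnector with an opt-in flag."""

    def __init__(self, extra_config=None):
        cfg = extra_config or {}
        # Assumed option name; the real connector has no such option today.
        self._prefer = bool(cfg.get("enable_cross_layers_blocks", False))

    @property
    def prefer_cross_layer_blocks(self) -> bool:
        return self._prefer


class NixlLike:
    """Hypothetical stand-in for NixlConnector's config-driven preference."""

    def __init__(self, extra_config=None):
        cfg = extra_config or {}
        self._prefer = bool(cfg.get("enable_cross_layers_blocks", False))

    @property
    def prefer_cross_layer_blocks(self) -> bool:
        return self._prefer


def multi_prefers_cross_layer(connectors) -> bool:
    # Same aggregation rule the issue attributes to MultiConnector.
    return all(c.prefer_cross_layer_blocks for c in connectors)


flag = {"enable_cross_layers_blocks": True}
opted_in = multi_prefers_cross_layer([ConfigurableDynamoLike(flag), NixlLike(flag)])
default = multi_prefers_cross_layer([ConfigurableDynamoLike(), NixlLike(flag)])
print(opted_in, default)
```

With every sub-connector opted in, the aggregate preference becomes true and would match a decode side that enables cross-layer blocks; leaving the Dynamo-side option at its default keeps today's behavior.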

Actual Behavior

DynamoConnector inherits the default KVConnectorBase_V1.prefer_cross_layer_blocks which returns False.
MultiConnector.prefer_cross_layer_blocks returns all(c.prefer_cross_layer_blocks for c in self._connectors), so PdConnector evaluates to False regardless of the inner NixlConnector's configuration.
This forces prefill to use a normal KV cache layout while decode uses a cross-layer layout, leading to incompatible NIXL compatibility hashes and a runtime crash.
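
The aggregation described above can be reproduced with a small standalone sketch. The classes below (BaseLike, DynamoLike, NixlLike, MultiLike) are hypothetical stand-ins that only mirror the behavior stated in this report; they are not the actual vLLM or KVBM implementations.

```python
class BaseLike:
    """Stand-in for KVConnectorBase_V1: the default preference is False."""

    @property
    def prefer_cross_layer_blocks(self) -> bool:
        return False


class DynamoLike(BaseLike):
    """Stand-in for DynamoConnector: inherits the default (False)."""


class NixlLike(BaseLike):
    """Stand-in for NixlConnector: preference follows its extra config."""

    def __init__(self, extra_config):
        self._enabled = bool(extra_config.get("enable_cross_layers_blocks", False))

    @property
    def prefer_cross_layer_blocks(self) -> bool:
        return self._enabled


class MultiLike(BaseLike):
    """Stand-in for MultiConnector / PdConnector: requires all sub-connectors."""

    def __init__(self, connectors):
        self._connectors = list(connectors)

    @property
    def prefer_cross_layer_blocks(self) -> bool:
        return all(c.prefer_cross_layer_blocks for c in self._connectors)


nixl_cfg = {"enable_cross_layers_blocks": True}
prefill = MultiLike([DynamoLike(), NixlLike(nixl_cfg)])  # PdConnector side
decode = NixlLike(nixl_cfg)                              # standalone side

print("prefill prefers cross-layer:", prefill.prefer_cross_layer_blocks)
print("decode prefers cross-layer:", decode.prefer_cross_layer_blocks)
```

In this sketch the prefill side evaluates to False and the decode side to True, which is the layout asymmetry the report identifies as the cause of the hash mismatch.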

Environment

vLLM version: v0.19.0
KVBM: 1.1.1

Additional Context

No response

Labels: backend::vllm, bug, dynamo-runtime, kvbm, language::python