[Bugfix] Fix faulty triton importing logic when using Ray for DP #19734

Merged
18 changes: 17 additions & 1 deletion vllm/triton_utils/importing.py
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: Apache-2.0
 # SPDX-FileCopyrightText: Copyright contributors to the vLLM project

+import os
 import types
 from importlib.util import find_spec

@@ -23,7 +24,22 @@
             x.driver for x in backends.values()
             if x.driver and x.driver.is_active()
         ]
-        if len(active_drivers) != 1:
+
+        # Check if we're in a distributed environment where CUDA_VISIBLE_DEVICES
+        # might be temporarily empty (e.g., Ray sets it to "" during actor init)
+        cuda_visible_devices = os.environ.get("CUDA_VISIBLE_DEVICES")
+        is_distributed_env = (cuda_visible_devices is not None
+                              and len(cuda_visible_devices.strip()) == 0)
Comment on lines +31 to +32
Member

Would something like is_ray_env be a better name here, since presumably this will be False in non-Ray distributed cases?

Collaborator

btw, an alternative is to check Ray directly:

def is_in_ray_actor():
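The body of the helper is not shown in this view. As a rough illustration only, and not necessarily what the reviewer had in mind, a check of this kind could look like the sketch below. The name is_in_ray_actor_sketch is hypothetical; it assumes a Ray version that provides ray.is_initialized() and the runtime context's get_actor_id(), and it treats a missing Ray installation as "not in an actor".

```python
def is_in_ray_actor_sketch() -> bool:
    """Hypothetical helper: True only inside an initialized Ray actor."""
    try:
        import ray  # optional dependency; absent in non-Ray deployments
    except ImportError:
        return False
    # get_actor_id() returns None in a driver or in a plain (non-actor) task.
    return (ray.is_initialized()
            and ray.get_runtime_context().get_actor_id() is not None)
```

Compared with inspecting CUDA_VISIBLE_DEVICES, a check like this asks Ray directly instead of inferring its presence from an empty environment variable.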


+        # Apply lenient driver check for distributed environments
+        if is_distributed_env and len(active_drivers) == 0:
+            # Allow 0 drivers in distributed environments - they may become
+            # active later when CUDA context is properly initialized
+            logger.debug(
+                "Triton found 0 active drivers in distributed environment. "
+                "This is expected during initialization.")
+        elif not is_distributed_env and len(active_drivers) != 1:
+            # Strict check for non-distributed environments
             logger.info(
                 "Triton is installed but %d active driver(s) found "
                 "(expected 1). Disabling Triton to prevent runtime errors.",
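
To make the branch structure of this hunk easier to reason about, here is a minimal standalone sketch of the same decision. The function triton_driver_check and its string return values are hypothetical and exist only for illustration; the real code logs and (per the log message) disables Triton in place rather than returning a value.

```python
from typing import Optional


def triton_driver_check(num_active_drivers: int,
                        cuda_visible_devices: Optional[str]) -> str:
    """Hypothetical helper mirroring the two branches shown in the diff.

    Returns "defer", "disable" or "ok".
    """
    # A set-but-empty CUDA_VISIBLE_DEVICES is the "distributed" signal.
    is_distributed_env = (cuda_visible_devices is not None
                          and len(cuda_visible_devices.strip()) == 0)
    if is_distributed_env and num_active_drivers == 0:
        # Lenient path: drivers may become active after initialization.
        return "defer"
    if not is_distributed_env and num_active_drivers != 1:
        # Strict path: anything other than exactly one driver disables Triton.
        return "disable"
    return "ok"


print(triton_driver_check(0, ""))    # empty variable, no drivers yet
print(triton_driver_check(0, None))  # variable unset, no drivers
print(triton_driver_check(1, "0"))   # one visible device, one driver
```

In the lines shown, an empty CUDA_VISIBLE_DEVICES combined with a non-zero driver count matches neither branch, so in this sketch that case falls through to "ok".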