Error: resource already mapped in custom_all_reduce.cuh #2641

@yippp

This issue is related to #2619 (comment).

I tried ray 2.9.1 with the dev code at commit #2636:

python -m vllm.entrypoints.openai.api_server --model ./Mistral-7B-Instruct-v0.2-AWQ --quantization awq --dtype auto --host 0.0.0.0 --port 8081 --tensor-parallel-size 2

but I hit another error:

Failed: Cuda error /home/my/vllm/csrc/custom_all_reduce.cuh:417 'resource already mapped'
Segmentation fault (core dumped)

I am running Python 3.11, CUDA 12.1, and NVIDIA driver 530 on two RTX 3090 GPUs connected via NVLink.
When I roll back to commit #2622, the program works, so the error seems to be caused by #2192.
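
For anyone trying to reproduce this, the environment details above can be collected with a small script. This is only a sketch: `collect_env_report` is a hypothetical helper name, not part of vLLM, and it assumes PyTorch is installed (it falls back to reporting only the Python version if not). The peer-access check is included because multi-GPU kernels of this kind typically depend on direct GPU-to-GPU access, which may or may not be related to this error.

```python
import platform


def collect_env_report():
    """Gather version and GPU topology details for a multi-GPU bug report.

    Hypothetical helper for illustration; not part of vLLM.
    """
    report = {"python": platform.python_version()}
    try:
        import torch  # optional: the report degrades gracefully without it
    except ImportError:
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    report["cuda_runtime"] = torch.version.cuda  # None on CPU-only builds
    gpu_count = torch.cuda.device_count() if torch.cuda.is_available() else 0
    report["gpus"] = [torch.cuda.get_device_name(i) for i in range(gpu_count)]
    # Peer-to-peer access for each ordered pair of distinct devices.
    report["peer_access"] = {
        f"{a}->{b}": torch.cuda.can_device_access_peer(a, b)
        for a in range(gpu_count)
        for b in range(gpu_count)
        if a != b
    }
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting the output into the report alongside the commit being tested makes it easier to compare a working setup with a failing one.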

Labels: bug
