Your current environment
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Debian GNU/Linux 11 (bullseye) (x86_64)
GCC version: (Debian 10.2.1-6) 10.2.1 20210110
Clang version: Could not collect
CMake version: version 3.18.4
Libc version: glibc-2.31
Python version: 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] (64-bit runtime)
Python platform: Linux-5.4.143.bsk.7-amd64-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: NVIDIA A800-SXM4-80GB
GPU 1: NVIDIA A800-SXM4-80GB
GPU 2: NVIDIA A800-SXM4-80GB
GPU 3: NVIDIA A800-SXM4-80GB
GPU 4: NVIDIA A800-SXM4-80GB
GPU 5: NVIDIA A800-SXM4-80GB
GPU 6: NVIDIA A800-SXM4-80GB
GPU 7: NVIDIA A800-SXM4-80GB
Nvidia driver version: 535.161.08
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 57 bits virtual
CPU(s): 120
On-line CPU(s) list: 0-119
Thread(s) per core: 2
Core(s) per socket: 30
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8336C CPU @ 2.30GHz
Stepping: 6
CPU MHz: 2294.608
BogoMIPS: 4589.21
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 2.8 MiB
L1i cache: 1.9 MiB
L2 cache: 75 MiB
L3 cache: 108 MiB
NUMA node0 CPU(s): 0-59
NUMA node1 CPU(s): 60-119
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid md_clear arch_capabilities
Versions of relevant libraries:
[pip3] byted-torch==2.1.0.post2
[pip3] byted-torch-monitor==0.0.1
[pip3] numpy==1.26.2
[pip3] nvidia-cublas-cu12==12.1.3.1
[pip3] nvidia-cuda-cupti-cu12==12.1.105
[pip3] nvidia-cuda-nvrtc-cu12==12.1.105
[pip3] nvidia-cuda-runtime-cu12==12.1.105
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu12==11.0.2.54
[pip3] nvidia-curand-cu12==10.3.2.106
[pip3] nvidia-cusolver-cu12==11.4.5.107
[pip3] nvidia-cusparse-cu12==12.1.0.106
[pip3] nvidia-ml-py==12.560.30
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] nvidia-nvjitlink-cu12==12.6.68
[pip3] nvidia-nvtx-cu12==12.1.105
[pip3] pyzmq==26.2.0
[pip3] torch==2.4.0
[pip3] torchaudio==2.4.0
[pip3] torchvision==0.19.0
[pip3] transformers==4.45.1
[pip3] triton==3.0.0
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.6.0@32e7db25365415841ebc7c4215851743fbb1bad1
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 NIC0 NIC1 NIC2 NIC3 NIC4 NIC5 NIC6 NIC7 NIC8 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NV8 NV8 NV8 NV8 NV8 NV8 NV8 SYS PIX PIX NODE NODE SYS SYS SYS SYS 0-59 0 N/A
GPU1 NV8 X NV8 NV8 NV8 NV8 NV8 NV8 SYS PIX PIX NODE NODE SYS SYS SYS SYS 0-59 0 N/A
GPU2 NV8 NV8 X NV8 NV8 NV8 NV8 NV8 SYS NODE NODE PIX PIX SYS SYS SYS SYS 0-59 0 N/A
GPU3 NV8 NV8 NV8 X NV8 NV8 NV8 NV8 SYS NODE NODE PIX PIX SYS SYS SYS SYS 0-59 0 N/A
GPU4 NV8 NV8 NV8 NV8 X NV8 NV8 NV8 SYS SYS SYS SYS SYS PIX PIX NODE NODE 60-119 1 N/A
GPU5 NV8 NV8 NV8 NV8 NV8 X NV8 NV8 SYS SYS SYS SYS SYS PIX PIX NODE NODE 60-119 1 N/A
GPU6 NV8 NV8 NV8 NV8 NV8 NV8 X NV8 SYS SYS SYS SYS SYS NODE NODE PIX PIX 60-119 1 N/A
GPU7 NV8 NV8 NV8 NV8 NV8 NV8 NV8 X SYS SYS SYS SYS SYS NODE NODE PIX PIX 60-119 1 N/A
NIC0 SYS SYS SYS SYS SYS SYS SYS SYS X SYS SYS SYS SYS SYS SYS SYS SYS
NIC1 PIX PIX NODE NODE SYS SYS SYS SYS SYS X PIX NODE NODE SYS SYS SYS SYS
NIC2 PIX PIX NODE NODE SYS SYS SYS SYS SYS PIX X NODE NODE SYS SYS SYS SYS
NIC3 NODE NODE PIX PIX SYS SYS SYS SYS SYS NODE NODE X PIX SYS SYS SYS SYS
NIC4 NODE NODE PIX PIX SYS SYS SYS SYS SYS NODE NODE PIX X SYS SYS SYS SYS
NIC5 SYS SYS SYS SYS PIX PIX NODE NODE SYS SYS SYS SYS SYS X PIX NODE NODE
NIC6 SYS SYS SYS SYS PIX PIX NODE NODE SYS SYS SYS SYS SYS PIX X NODE NODE
NIC7 SYS SYS SYS SYS NODE NODE PIX PIX SYS SYS SYS SYS SYS NODE NODE X PIX
NIC8 SYS SYS SYS SYS NODE NODE PIX PIX SYS SYS SYS SYS SYS NODE NODE PIX X
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
NIC Legend:
NIC0: mlx5_0
NIC1: mlx5_1
NIC2: mlx5_2
NIC3: mlx5_3
NIC4: mlx5_4
NIC5: mlx5_5
NIC6: mlx5_6
NIC7: mlx5_7
NIC8: mlx5_8
Model Input Dumps
No response
🐛 Describe the bug
INFO 09-29 11:48:11 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
INFO 09-29 11:48:12 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method, Traceback (most recent call last):
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] File "/home/tiger/.local/lib/python3.9/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] output = executor(*args, **kwargs)
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] File "/home/tiger/.local/lib/python3.9/site-packages/vllm/worker/worker.py", line 166, in init_device
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] torch.cuda.set_device(self.device)
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] File "/home/tiger/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 420, in set_device
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] torch._C._cuda_setDevice(device)
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] File "/home/tiger/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 300, in _lazy_init
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] raise RuntimeError(
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
ERROR 09-29 11:48:12 multiproc_worker_utils.py:226]
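For reference, a minimal sketch of the workaround the RuntimeError itself points at: start the vLLM worker processes with the 'spawn' method instead of 'fork'. This assumes the VLLM_WORKER_MULTIPROC_METHOD environment variable is honored by this vLLM build (0.6.0); the model name and tensor_parallel_size below are placeholders, not taken from this report.

```python
# Sketch (assumption, not from the original report): force 'spawn' for vLLM's
# multiproc workers so CUDA is not re-initialized in a forked child process.
# Must be set before vLLM creates the workers (or exported in the shell when
# launching the API server / `vllm serve`).
import os
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

from vllm import LLM

# Placeholder model and tensor_parallel_size; any TP > 1 exercises the
# multiproc worker path that raised the error above.
llm = LLM(model="<your-model>", tensor_parallel_size=2)
outputs = llm.generate("Hello")
print(outputs[0].outputs[0].text)
```

If the same error still appears, another thing to check is whether anything in the parent process touches CUDA (e.g., calls torch.cuda APIs) before the LLM is constructed, since a forked child cannot re-initialize an already-initialized CUDA context.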
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.