[Bug]: vllm serve hang at Using model weights format ['*.safetensors'] when using tp #6636

Closed
@coye01

Description

Your current environment

My script

export CUDA_VISIBLE_DEVICES=0,1,2,3
vllm serve meta-llama/Meta-Llama-3-70B-Instruct --tensor_parallel_size 4

🐛 Describe the bug

Here is the log. `vllm serve` appears to hang after the last line, without raising any error.

INFO 07-22 07:59:44 api_server.py:212] vLLM API server version 0.5.2
INFO 07-22 07:59:44 api_server.py:213] args: Namespace(model_tag='meta-llama/Meta-Llama-3-70B-Instruct', host=None, port=8000, uvicorn_log_level='info', allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, lora_modules=None, prompt_adapters=None, chat_template=None, response_role='assistant', ssl_keyfile=None, ssl_certfile=None, ssl_ca_certs=None, ssl_cert_reqs=0, root_path=None, middleware=[], model='meta-llama/Meta-Llama-3-70B-Instruct', tokenizer=None, skip_tokenizer_init=False, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=False, download_dir=None, load_format='auto', dtype='auto', kv_cache_dtype='auto', quantization_param_path=None, max_model_len=None, guided_decoding_backend='outlines', distributed_executor_backend=None, worker_use_ray=False, pipeline_parallel_size=1, tensor_parallel_size=4, max_parallel_loading_workers=None, ray_workers_use_nsight=False, block_size=16, enable_prefix_caching=False, disable_sliding_window=False, use_v2_block_manager=False, num_lookahead_slots=0, seed=0, swap_space=4, gpu_memory_utilization=0.9, num_gpu_blocks_override=None, max_num_batched_tokens=None, max_num_seqs=256, max_logprobs=20, disable_log_stats=False, quantization=None, rope_scaling=None, rope_theta=None, enforce_eager=False, max_context_len_to_capture=None, max_seq_len_to_capture=8192, disable_custom_all_reduce=False, tokenizer_pool_size=0, tokenizer_pool_type='ray', tokenizer_pool_extra_config=None, enable_lora=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', long_lora_scaling_factors=None, max_cpu_loras=None, fully_sharded_loras=False, enable_prompt_adapter=False, max_prompt_adapters=1, max_prompt_adapter_token=0, device='auto', scheduler_delay_factor=0.0, enable_chunked_prefill=False, speculative_model=None, num_speculative_tokens=None, speculative_draft_tensor_parallel_size=None, speculative_max_model_len=None, 
speculative_disable_by_batch_size=None, ngram_prompt_lookup_max=None, ngram_prompt_lookup_min=None, spec_decoding_acceptance_method='rejection_sampler', typical_acceptance_sampler_posterior_threshold=None, typical_acceptance_sampler_posterior_alpha=None, model_loader_extra_config=None, preemption_mode=None, served_model_name=None, qlora_adapter_name_or_path=None, otlp_traces_endpoint=None, engine_use_ray=False, disable_log_requests=False, max_log_len=None, dispatch_function=<function serve at 0x7f7ec4f3f5b0>)
INFO 07-22 07:59:44 config.py:695] Defaulting to use mp for distributed inference
INFO 07-22 07:59:44 llm_engine.py:174] Initializing an LLM engine (v0.5.2) with config: model='meta-llama/Meta-Llama-3-70B-Instruct', speculative_config=None, tokenizer='meta-llama/Meta-Llama-3-70B-Instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=8192, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=4, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None), seed=0, served_model_name=meta-llama/Meta-Llama-3-70B-Instruct, use_v2_block_manager=False, enable_prefix_caching=False)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO 07-22 07:59:45 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
(VllmWorkerProcess pid=154783) WARNING 07-22 07:59:45 logger.py:146] VLLM_TRACE_FUNCTION is enabled. It will record every function executed by Python. This will slow down the code. It is suggested to be used for debugging hang or crashes only.
(VllmWorkerProcess pid=154783) INFO 07-22 07:59:45 logger.py:150] Trace frame log is saved to /tmp/vllm/vllm-instance-48e7faf324c142f5809657cad97bec2e/VLLM_TRACE_FUNCTION_for_process_154783_thread_140188822037120_at_2024-07-22_07:59:45.306075.log
(VllmWorkerProcess pid=154784) WARNING 07-22 07:59:45 logger.py:146] VLLM_TRACE_FUNCTION is enabled. It will record every function executed by Python. This will slow down the code. It is suggested to be used for debugging hang or crashes only.
(VllmWorkerProcess pid=154784) INFO 07-22 07:59:45 logger.py:150] Trace frame log is saved to /tmp/vllm/vllm-instance-48e7faf324c142f5809657cad97bec2e/VLLM_TRACE_FUNCTION_for_process_154784_thread_140188822037120_at_2024-07-22_07:59:45.312056.log
(VllmWorkerProcess pid=154785) WARNING 07-22 07:59:45 logger.py:146] VLLM_TRACE_FUNCTION is enabled. It will record every function executed by Python. This will slow down the code. It is suggested to be used for debugging hang or crashes only.
(VllmWorkerProcess pid=154785) INFO 07-22 07:59:45 logger.py:150] Trace frame log is saved to /tmp/vllm/vllm-instance-48e7faf324c142f5809657cad97bec2e/VLLM_TRACE_FUNCTION_for_process_154785_thread_140188822037120_at_2024-07-22_07:59:45.319244.log
WARNING 07-22 07:59:45 logger.py:146] VLLM_TRACE_FUNCTION is enabled. It will record every function executed by Python. This will slow down the code. It is suggested to be used for debugging hang or crashes only.
INFO 07-22 07:59:45 logger.py:150] Trace frame log is saved to /tmp/vllm/vllm-instance-48e7faf324c142f5809657cad97bec2e/VLLM_TRACE_FUNCTION_for_process_154502_thread_140188822037120_at_2024-07-22_07:59:45.319590.log
(VllmWorkerProcess pid=154784) INFO 07-22 07:59:47 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
(VllmWorkerProcess pid=154783) INFO 07-22 07:59:47 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
(VllmWorkerProcess pid=154785) INFO 07-22 07:59:47 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
DEBUG 07-22 07:59:47 parallel_state.py:803] world_size=4 rank=0 local_rank=0 distributed_init_method=tcp://127.0.0.1:41595 backend=nccl
(VllmWorkerProcess pid=154784) DEBUG 07-22 07:59:47 parallel_state.py:803] world_size=4 rank=2 local_rank=2 distributed_init_method=tcp://127.0.0.1:41595 backend=nccl
(VllmWorkerProcess pid=154785) DEBUG 07-22 07:59:47 parallel_state.py:803] world_size=4 rank=3 local_rank=3 distributed_init_method=tcp://127.0.0.1:41595 backend=nccl
(VllmWorkerProcess pid=154783) DEBUG 07-22 07:59:47 parallel_state.py:803] world_size=4 rank=1 local_rank=1 distributed_init_method=tcp://127.0.0.1:41595 backend=nccl
INFO 07-22 07:59:48 utils.py:737] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=154783) INFO 07-22 07:59:48 utils.py:737] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=154785) INFO 07-22 07:59:48 utils.py:737] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=154783) INFO 07-22 07:59:48 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=154785) INFO 07-22 07:59:48 pynccl.py:63] vLLM is using nccl==2.20.5
INFO 07-22 07:59:48 pynccl.py:63] vLLM is using nccl==2.20.5
8a100-5:154502:154502 [0] NCCL INFO Bootstrap : Using ibP257s170556:192.168.1.11<0>
8a100-5:154502:154502 [0] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation
(VllmWorkerProcess pid=154784) INFO 07-22 07:59:48 utils.py:737] Found nccl from library libnccl.so.2
8a100-5:154785:154785 [3] NCCL INFO cudaDriverVersion 12040
8a100-5:154783:154783 [1] NCCL INFO cudaDriverVersion 12040
(VllmWorkerProcess pid=154784) INFO 07-22 07:59:48 pynccl.py:63] vLLM is using nccl==2.20.5
8a100-5:154785:154785 [3] NCCL INFO Bootstrap : Using ibP257s170556:192.168.1.11<0>
8a100-5:154502:154502 [0] NCCL INFO cudaDriverVersion 12040
NCCL version 2.20.5+cuda12.4
8a100-5:154783:154783 [1] NCCL INFO Bootstrap : Using ibP257s170556:192.168.1.11<0>
8a100-5:154785:154785 [3] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation
8a100-5:154783:154783 [1] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation
8a100-5:154784:154784 [2] NCCL INFO cudaDriverVersion 12040
8a100-5:154784:154784 [2] NCCL INFO Bootstrap : Using ibP257s170556:192.168.1.11<0>
8a100-5:154784:154784 [2] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation
8a100-5:154785:154785 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_4:1/IB [5]mlx5_5:1/IB [6]mlx5_6:1/IB [7]mlx5_7:1/IB [8]mlx5_8:1/RoCE [RO]; OOB ibP257s170556:192.168.1.11<0>
8a100-5:154785:154785 [3] NCCL INFO Using non-device net plugin version 0
8a100-5:154785:154785 [3] NCCL INFO Using network IB
8a100-5:154783:154783 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_4:1/IB [5]mlx5_5:1/IB [6]mlx5_6:1/IB [7]mlx5_7:1/IB [8]mlx5_8:1/RoCE [RO]; OOB ibP257s170556:192.168.1.11<0>
8a100-5:154783:154783 [1] NCCL INFO Using non-device net plugin version 0
8a100-5:154783:154783 [1] NCCL INFO Using network IB
8a100-5:154502:154502 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_4:1/IB [5]mlx5_5:1/IB [6]mlx5_6:1/IB [7]mlx5_7:1/IB [8]mlx5_8:1/RoCE [RO]; OOB ibP257s170556:192.168.1.11<0>
8a100-5:154502:154502 [0] NCCL INFO Using non-device net plugin version 0
8a100-5:154502:154502 [0] NCCL INFO Using network IB
8a100-5:154784:154784 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_4:1/IB [5]mlx5_5:1/IB [6]mlx5_6:1/IB [7]mlx5_7:1/IB [8]mlx5_8:1/RoCE [RO]; OOB ibP257s170556:192.168.1.11<0>
8a100-5:154784:154784 [2] NCCL INFO Using non-device net plugin version 0
8a100-5:154784:154784 [2] NCCL INFO Using network IB
8a100-5:154784:154784 [2] NCCL INFO comm 0xc121020 rank 2 nranks 4 cudaDev 2 nvmlDev 2 busId 300000 commId 0x9ae34328ec058ff9 - Init START
8a100-5:154783:154783 [1] NCCL INFO comm 0xc121680 rank 1 nranks 4 cudaDev 1 nvmlDev 1 busId 200000 commId 0x9ae34328ec058ff9 - Init START
8a100-5:154502:154502 [0] NCCL INFO comm 0xc124fa0 rank 0 nranks 4 cudaDev 0 nvmlDev 0 busId 100000 commId 0x9ae34328ec058ff9 - Init START
8a100-5:154785:154785 [3] NCCL INFO comm 0xc120c40 rank 3 nranks 4 cudaDev 3 nvmlDev 3 busId 400000 commId 0x9ae34328ec058ff9 - Init START
8a100-5:154783:154783 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
8a100-5:154785:154785 [3] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
8a100-5:154784:154784 [2] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
8a100-5:154783:154783 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ff000000
8a100-5:154783:154783 [1] NCCL INFO NVLS multicast support is not available on dev 1
8a100-5:154502:154502 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
8a100-5:154785:154785 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
8a100-5:154785:154785 [3] NCCL INFO NVLS multicast support is not available on dev 3
8a100-5:154784:154784 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
8a100-5:154502:154502 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ff000000
8a100-5:154784:154784 [2] NCCL INFO NVLS multicast support is not available on dev 2
8a100-5:154502:154502 [0] NCCL INFO NVLS multicast support is not available on dev 0
8a100-5:154785:154785 [3] NCCL INFO comm 0xc120c40 rank 3 nRanks 4 nNodes 1 localRanks 4 localRank 3 MNNVL 0
8a100-5:154783:154783 [1] NCCL INFO comm 0xc121680 rank 1 nRanks 4 nNodes 1 localRanks 4 localRank 1 MNNVL 0
8a100-5:154784:154784 [2] NCCL INFO comm 0xc121020 rank 2 nRanks 4 nNodes 1 localRanks 4 localRank 2 MNNVL 0
8a100-5:154502:154502 [0] NCCL INFO comm 0xc124fa0 rank 0 nRanks 4 nNodes 1 localRanks 4 localRank 0 MNNVL 0
8a100-5:154783:154783 [1] NCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 [2] 2/-1/-1->1->0 [3] 2/-1/-1->1->0 [4] 2/-1/-1->1->0 [5] 2/-1/-1->1->0 [6] 2/-1/-1->1->0 [7] 2/-1/-1->1->0 [8] 2/-1/-1->1->0 [9] 2/-1/-1->1->0 [10] 2/-1/-1->1->0 [11] 2/-1/-1->1->0 [12] 2/-1/-1->1->0 [13] 2/-1/-1->1->0 [14] 2/-1/-1->1->0 [15] 2/-1/-1->1->0 [16] 2/-1/-1->1->0 [17] 2/-1/-1->1->0 [18] 2/-1/-1->1->0 [19] 2/-1/-1->1->0 [20] 2/-1/-1->1->0 [21] 2/-1/-1->1->0 [22] 2/-1/-1->1->0 [23] 2/-1/-1->1->0
8a100-5:154502:154502 [0] NCCL INFO Channel 00/24 :    0   1   2   3
8a100-5:154783:154783 [1] NCCL INFO P2P Chunksize set to 524288
8a100-5:154502:154502 [0] NCCL INFO Channel 01/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 02/24 :    0   1   2   3
8a100-5:154785:154785 [3] NCCL INFO Trees [0] -1/-1/-1->3->2 [1] -1/-1/-1->3->2 [2] -1/-1/-1->3->2 [3] -1/-1/-1->3->2 [4] -1/-1/-1->3->2 [5] -1/-1/-1->3->2 [6] -1/-1/-1->3->2 [7] -1/-1/-1->3->2 [8] -1/-1/-1->3->2 [9] -1/-1/-1->3->2 [10] -1/-1/-1->3->2 [11] -1/-1/-1->3->2 [12] -1/-1/-1->3->2 [13] -1/-1/-1->3->2 [14] -1/-1/-1->3->2 [15] -1/-1/-1->3->2 [16] -1/-1/-1->3->2 [17] -1/-1/-1->3->2 [18] -1/-1/-1->3->2 [19] -1/-1/-1->3->2 [20] -1/-1/-1->3->2 [21] -1/-1/-1->3->2 [22] -1/-1/-1->3->2 [23] -1/-1/-1->3->2
8a100-5:154502:154502 [0] NCCL INFO Channel 03/24 :    0   1   2   3
8a100-5:154784:154784 [2] NCCL INFO Trees [0] 3/-1/-1->2->1 [1] 3/-1/-1->2->1 [2] 3/-1/-1->2->1 [3] 3/-1/-1->2->1 [4] 3/-1/-1->2->1 [5] 3/-1/-1->2->1 [6] 3/-1/-1->2->1 [7] 3/-1/-1->2->1 [8] 3/-1/-1->2->1 [9] 3/-1/-1->2->1 [10] 3/-1/-1->2->1 [11] 3/-1/-1->2->1 [12] 3/-1/-1->2->1 [13] 3/-1/-1->2->1 [14] 3/-1/-1->2->1 [15] 3/-1/-1->2->1 [16] 3/-1/-1->2->1 [17] 3/-1/-1->2->1 [18] 3/-1/-1->2->1 [19] 3/-1/-1->2->1 [20] 3/-1/-1->2->1 [21] 3/-1/-1->2->1 [22] 3/-1/-1->2->1 [23] 3/-1/-1->2->1
8a100-5:154785:154785 [3] NCCL INFO P2P Chunksize set to 524288
8a100-5:154502:154502 [0] NCCL INFO Channel 04/24 :    0   1   2   3
8a100-5:154784:154784 [2] NCCL INFO P2P Chunksize set to 524288
8a100-5:154502:154502 [0] NCCL INFO Channel 05/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 06/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 07/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 08/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 09/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 10/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 11/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 12/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 13/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 14/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 15/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 16/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 17/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 18/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 19/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 20/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 21/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 22/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Channel 23/24 :    0   1   2   3
8a100-5:154502:154502 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] 1/-1/-1->0->-1 [7] 1/-1/-1->0->-1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] 1/-1/-1->0->-1 [15] 1/-1/-1->0->-1 [16] 1/-1/-1->0->-1 [17] 1/-1/-1->0->-1 [18] 1/-1/-1->0->-1 [19] 1/-1/-1->0->-1 [20] 1/-1/-1->0->-1 [21] 1/-1/-1->0->-1 [22] 1/-1/-1->0->-1 [23] 1/-1/-1->0->-1
8a100-5:154502:154502 [0] NCCL INFO P2P Chunksize set to 524288
8a100-5:154784:154784 [2] NCCL INFO Channel 00/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 00/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 01/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 00/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 01/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 02/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 01/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 02/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 03/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 02/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 03/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 04/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 03/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 04/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 05/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 04/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 05/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 06/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 05/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 06/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 07/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 06/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 07/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 08/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 07/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 08/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 09/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 08/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 09/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 10/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 09/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 10/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 11/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 10/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 11/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 12/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 11/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 13/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 12/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 14/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 12/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 13/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 15/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 14/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 13/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 14/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 15/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 16/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 15/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 16/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 16/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 17/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 16/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 17/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 17/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 18/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 17/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 18/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 18/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 18/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 19/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 19/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 19/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 19/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 20/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 20/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 20/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 21/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 21/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 20/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 21/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 22/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 21/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 22/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 23/0 : 1[1] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 22/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 23/0 : 2[2] -> 3[3] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 23/0 : 3[3] -> 0[0] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 22/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Channel 23/0 : 0[0] -> 1[1] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Connected all rings
8a100-5:154783:154783 [1] NCCL INFO Connected all rings
8a100-5:154502:154502 [0] NCCL INFO Connected all rings
8a100-5:154785:154785 [3] NCCL INFO Connected all rings
8a100-5:154785:154785 [3] NCCL INFO Channel 00/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 01/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 02/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 03/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 04/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 05/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 06/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 07/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 08/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 09/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 10/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 11/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 12/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 13/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 14/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 15/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 16/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 17/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 18/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 19/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 20/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 00/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 21/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 01/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 22/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 02/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154785:154785 [3] NCCL INFO Channel 23/0 : 3[3] -> 2[2] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 03/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 04/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 05/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 06/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 07/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 08/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 09/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 10/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 11/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 12/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 13/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 14/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 15/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 16/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 14/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 17/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 18/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 16/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 19/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 17/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 20/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 18/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 21/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 19/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 22/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 20/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154784:154784 [2] NCCL INFO Channel 23/0 : 2[2] -> 1[1] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 21/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 22/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154783:154783 [1] NCCL INFO Channel 23/0 : 1[1] -> 0[0] via P2P/IPC/read
8a100-5:154502:154502 [0] NCCL INFO Connected all trees
8a100-5:154502:154502 [0] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 512 | 512
8a100-5:154502:154502 [0] NCCL INFO 24 coll channels, 0 collnet channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
8a100-5:154783:154783 [1] NCCL INFO Connected all trees
8a100-5:154783:154783 [1] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 512 | 512
8a100-5:154783:154783 [1] NCCL INFO 24 coll channels, 0 collnet channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
8a100-5:154784:154784 [2] NCCL INFO Connected all trees
8a100-5:154784:154784 [2] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 512 | 512
8a100-5:154784:154784 [2] NCCL INFO 24 coll channels, 0 collnet channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
8a100-5:154785:154785 [3] NCCL INFO Connected all trees
8a100-5:154785:154785 [3] NCCL INFO threadThresholds 8/8/64 | 32/8/64 | 512 | 512
8a100-5:154785:154785 [3] NCCL INFO 24 coll channels, 0 collnet channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
8a100-5:154784:154784 [2] NCCL INFO comm 0xc121020 rank 2 nranks 4 cudaDev 2 nvmlDev 2 busId 300000 commId 0x9ae34328ec058ff9 - Init COMPLETE
8a100-5:154502:154502 [0] NCCL INFO comm 0xc124fa0 rank 0 nranks 4 cudaDev 0 nvmlDev 0 busId 100000 commId 0x9ae34328ec058ff9 - Init COMPLETE
8a100-5:154783:154783 [1] NCCL INFO comm 0xc121680 rank 1 nranks 4 cudaDev 1 nvmlDev 1 busId 200000 commId 0x9ae34328ec058ff9 - Init COMPLETE
8a100-5:154785:154785 [3] NCCL INFO comm 0xc120c40 rank 3 nranks 4 cudaDev 3 nvmlDev 3 busId 400000 commId 0x9ae34328ec058ff9 - Init COMPLETE
(VllmWorkerProcess pid=154783) INFO 07-22 07:59:49 custom_all_reduce_utils.py:232] reading GPU P2P access cache from /home/a100user/.config/vllm/gpu_p2p_access_cache_for_0,1,2,3.json
INFO 07-22 07:59:49 custom_all_reduce_utils.py:232] reading GPU P2P access cache from /home/a100user/.config/vllm/gpu_p2p_access_cache_for_0,1,2,3.json
(VllmWorkerProcess pid=154785) INFO 07-22 07:59:49 custom_all_reduce_utils.py:232] reading GPU P2P access cache from /home/a100user/.config/vllm/gpu_p2p_access_cache_for_0,1,2,3.json
(VllmWorkerProcess pid=154784) INFO 07-22 07:59:49 custom_all_reduce_utils.py:232] reading GPU P2P access cache from /home/a100user/.config/vllm/gpu_p2p_access_cache_for_0,1,2,3.json
(VllmWorkerProcess pid=154783) INFO 07-22 07:59:51 weight_utils.py:218] Using model weights format ['*.safetensors']
INFO 07-22 07:59:51 weight_utils.py:218] Using model weights format ['*.safetensors']
(VllmWorkerProcess pid=154785) INFO 07-22 07:59:51 weight_utils.py:218] Using model weights format ['*.safetensors']
(VllmWorkerProcess pid=154784) INFO 07-22 07:59:51 weight_utils.py:218] Using model weights format ['*.safetensors']
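
The log above reports that VLLM_TRACE_FUNCTION is enabled and prints the path of a trace file for each process. One way to see where each process stopped is to print the last few lines of every trace file. The following is a minimal sketch, not part of vLLM: the helper name `last_trace_lines` is hypothetical, and it assumes nothing about the trace file format beyond it being line-oriented text.

```python
from collections import deque
from pathlib import Path


def last_trace_lines(trace_dir, n=5):
    """Return {file name: last n lines} for each trace log in trace_dir.

    Hypothetical helper for illustration; it only tails text files whose
    names match the pattern printed in the log above.
    """
    result = {}
    pattern = "VLLM_TRACE_FUNCTION_for_process_*.log"
    for path in sorted(Path(trace_dir).glob(pattern)):
        with path.open(errors="replace") as fh:
            # deque with maxlen keeps only the final n lines in memory,
            # which matters because trace files can grow very large.
            tail = deque(fh, maxlen=n)
        result[path.name] = [line.rstrip("\n") for line in tail]
    return result
```

For the run above, it could be called as `last_trace_lines("/tmp/vllm/vllm-instance-48e7faf324c142f5809657cad97bec2e")`. The instance directory is copied from the log and changes on every run.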
