[Bug]: Bus error (core dumped)

### Your current environment

<details>

<summary>The output of `python collect_env.py`</summary>

```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.26.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.2.0-39-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 12.3.107
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Quadro RTX 8000
Nvidia driver version: 545.23.08
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.7
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      46 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             8
On-line CPU(s) list:                0-7
Vendor ID:                          GenuineIntel
Model name:                         Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
CPU family:                         6
Model:                              85
Thread(s) per core:                 1
Core(s) per socket:                 1
Socket(s):                          8
Stepping:                           7
BogoMIPS:                           4589.32
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single intel_ppin ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves pku ospke md_clear flush_l1d arch_capabilities
Hypervisor vendor:                  Xen
Virtualisation type:                full
L1d cache:                          256 KiB (8 instances)
L1i cache:                          256 KiB (8 instances)
L2 cache:                           8 MiB (8 instances)
L3 cache:                           176 MiB (8 instances)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-7
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        KVM: Mitigation: VMX unsupported
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Retbleed:             Mitigation; Enhanced IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] nvidia-cublas-cu11==11.10.3.66
[pip3] nvidia-cublas-cu12==12.1.3.1
[pip3] nvidia-cuda-cupti-cu11==11.7.101
[pip3] nvidia-cuda-cupti-cu12==12.1.105
[pip3] nvidia-cuda-nvrtc-cu11==11.7.99
[pip3] nvidia-cuda-nvrtc-cu12==12.1.105
[pip3] nvidia-cuda-runtime-cu11==11.7.99
[pip3] nvidia-cuda-runtime-cu12==12.1.105
[pip3] nvidia-cudnn-cu11==8.5.0.96
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu11==10.9.0.58
[pip3] nvidia-cufft-cu12==11.0.2.54
[pip3] nvidia-curand-cu11==10.2.10.91
[pip3] nvidia-curand-cu12==10.3.2.106
[pip3] nvidia-cusolver-cu11==11.4.0.1
[pip3] nvidia-cusolver-cu12==11.4.5.107
[pip3] nvidia-cusparse-cu11==11.7.4.91
[pip3] nvidia-cusparse-cu12==12.1.0.106
[pip3] nvidia-ml-py==12.560.30
[pip3] nvidia-ml-py3==7.352.0
[pip3] nvidia-nccl-cu11==2.14.3
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] nvidia-nvjitlink-cu12==12.3.52
[pip3] nvidia-nvtx-cu11==11.7.91
[pip3] nvidia-nvtx-cu12==12.1.105
[pip3] pyzmq==26.2.0
[pip3] torch==2.4.0
[pip3] torch-fidelity==0.3.0
[pip3] torchinfo==1.8.0
[pip3] torchmetrics==1.4.0.post0
[pip3] torchvision==0.19.0
[pip3] transformers==4.45.1
[pip3] triton==3.0.0
[conda] cuda-version              12.6                          3    nvidia
[conda] nccl                      2.21.5.1             ha515578_0
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.5.4@4db5176d9758b720b05460c50ace3c01026eb158
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
GPU0    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      0-7     0               N/A

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks
```

</details>


### Model Input Dumps

_No response_

### 🐛 Describe the bug

Greetings, I was using vllm 0.5.2 just fine, but I want to do some constrained decoding and therefore upgraded to 0.6.2. Then I always encouter this Bus error when I attempt to inference.

I've tried to delete and re-create conda environment,
downgrade to 0.5.4
uninstall/reinstall nccl from source/apt/conda multiple times (according to #3916)
but non of this helped. However, my other machine works perfectly fine. My code is very standard

```
sampling_params = SamplingParams(temperature=0, top_p=1, max_tokens=20)
data =  [out]
outputs = llm.generate(data, sampling_params, guided_options_request=dict(guided_choice=["agree", "disagree"
    ]))
output_batch = [output.outputs[0].text for output in outputs]
```

Could someone help me out?

Much appreciated

bt from gdp core file
```
Core was generated by `python3 vllm_mpd.py'.
Program terminated with signal SIGBUS, Bus error.
#0  0x00007f7a3cc9ab8f in ?? ()
[Current thread is 1 (LWP 2829379)]
(gdb) bt
#0  0x00007f7a3cc9ab8f in ?? ()
#1  0x00007f7a58845064 in ?? ()
#2  0x0000556600000001 in ?? ()
#3  0x0000000000000009 in ?? ()
#4  0x00005566001e7999 in ?? ()
#5  0x00005566001e7999 in ?? ()
#6  0x7fffffffffffffff in ?? ()
#7  0x00007fffafe13320 in ?? ()
#8  0x00005565fffcdaf4 in ?? ()
#9  0x00005566020f7920 in ?? ()
#10 0x00007f7b1dd94ef0 in ?? ()
#11 0x00007f7a360405e8 in ?? ()
#12 0x00007f7b1e61ace0 in ?? ()
#13 0x00007f7a36040440 in ?? ()
#14 0x70941bcbe0b11000 in ?? ()
#15 0x00007f7a36040440 in ?? ()
#16 0x0000000000000001 in ?? ()
#17 0x00007fffafe1334c in ?? ()
#18 0x0000556608f286a8 in ?? ()
#19 0x0000556609238148 in ?? ()
#20 0x0000000000000000 in ?? ()
```

### Before submitting a new issue...

- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Bug]: Bus error (core dumped) #8974

Your current environment

Model Input Dumps

🐛 Describe the bug

Before submitting a new issue...

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: Bus error (core dumped) #8974

Description

Your current environment

Model Input Dumps

🐛 Describe the bug

Before submitting a new issue...

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions