Description
Name of failing test
models/quantization/test_gguf.py::test_models[1-5-32-half-model0]
Basic information
- Flaky test
- Can reproduce locally
- Caused by external libraries (e.g. bug in `transformers`)
🧪 Describe the failing test
This specific Llama 1B GGUF model test has been failing consistently across multiple PRs. Example build: https://buildkite.com/vllm/ci/builds/21800/steps/waterfall?jid=01975af4-f581-4d43-a1e5-7175d960b2b7#01975af4-f581-4d43-a1e5-7175d960b2b7/212-6971
[2025-06-10T18:40:56Z] FAILED models/quantization/test_gguf.py::test_models[1-5-32-half-model0] - AssertionError: Test0:
[2025-06-10T18:40:56Z] Matched tokens: [4897, 596, 4495, 13, 650, 4178, 44, 13656, 369]
[2025-06-10T18:40:56Z] original: "That's correct. VLLM stands for Vision and Language Model, which is a type of large language model designed for both inference and serving. It's a" {31541: Logprob(logprob=-1.6094070672988892, rank=1, decoded_token='ĠVision'), 28968: Logprob(logprob=-2.0000319480895996, rank=2, decoded_token='ĠVari'), 8519: Logprob(logprob=-2.5000319480895996, rank=3, decoded_token='ĠVideo'), 21382: Logprob(logprob=-2.6562819480895996, rank=4, decoded_token='ĠVirtual'), 20796: Logprob(logprob=-2.7187819480895996, rank=5, decoded_token='ĠVisual')}
[2025-06-10T18:40:56Z] gguf: "That's correct. VLLM stands for Virtual Language Learning Model, which is a type of large language model designed for high-throughput and memory-efficient inference and" {21382: Logprob(logprob=-1.9463169574737549, rank=1, decoded_token='ĠVirtual'), 330: Logprob(logprob=-2.274441957473755, rank=2, decoded_token='Ġ"'), 15668: Logprob(logprob=-2.383816957473755, rank=3, decoded_token='ĠVery'), 4196: Logprob(logprob=-2.446316957473755, rank=4, decoded_token='ĠVal'), 28968: Logprob(logprob=-2.540066957473755, rank=5, decoded_token='ĠVari')}
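To make the mismatch easier to read, here is a minimal sketch of what the assertion appears to be checking. It assumes the comparison is a top-5 cross-check at the first diverging token (each model's chosen token must appear in the other model's top-5 candidates). The helper name `first_mismatch_is_close` is hypothetical and is not taken from the test suite. The token ids are copied from the log above.

```python
# Token ids and top-5 candidate sets copied from the failure log above.
original_token = 31541  # 'ĠVision', rank 1 for the original model
gguf_token = 21382      # 'ĠVirtual', rank 1 for the GGUF model

original_top5 = {31541, 28968, 8519, 21382, 20796}
gguf_top5 = {21382, 330, 15668, 4196, 28968}


def first_mismatch_is_close(tok_a, top_a, tok_b, top_b):
    """Hypothetical helper: assumes each side's token must be in the other's top-k."""
    return tok_a in top_b and tok_b in top_a


# 'ĠVirtual' is among the original model's top-5 candidates...
gguf_in_original = gguf_token in original_top5
# ...but 'ĠVision' is not among the GGUF model's top-5 candidates.
original_in_gguf = original_token in gguf_top5

is_close = first_mismatch_is_close(
    original_token, original_top5, gguf_token, gguf_top5
)
print(gguf_in_original, original_in_gguf, is_close)
```

Under that assumption, the failure is one-sided: the GGUF model's choice is plausible for the original model, but the original model's choice falls outside the GGUF model's top 5.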
📝 History of failing test
The earliest failure I found was on Mon 26 May at 8:27 AM:
[CI/Build] Split pooling and generation extended language models tests in CI (#18705)
https://buildkite.com/organizations/vllm/analytics/suites/ci-1/tests/94a54396-ec5f-8d47-8b48-6c88a2d4e5cb?period=28days&tags=scm.branch%3Amain&execution_id=01970c90-0b2c-7f2b-b3ad-d7bcc06f340b
CC List.
No response