[CI Failure]: Quantized Models Test - models/quantization/test_gguf.py::test_models[1-5-32-half-model0] #19458

Open
@mgoin

Name of failing test

models/quantization/test_gguf.py::test_models[1-5-32-half-model0]

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

This specific Llama 1B GGUF model test has been failing consistently across multiple PRs. Example build: https://buildkite.com/vllm/ci/builds/21800/steps/waterfall?jid=01975af4-f581-4d43-a1e5-7175d960b2b7#01975af4-f581-4d43-a1e5-7175d960b2b7/212-6971


[2025-06-10T18:40:56Z] FAILED models/quantization/test_gguf.py::test_models[1-5-32-half-model0] - AssertionError: Test0:
[2025-06-10T18:40:56Z] Matched tokens:	[4897, 596, 4495, 13, 650, 4178, 44, 13656, 369]
[2025-06-10T18:40:56Z] original:	"That's correct. VLLM stands for Vision and Language Model, which is a type of large language model designed for both inference and serving. It's a"	{31541: Logprob(logprob=-1.6094070672988892, rank=1, decoded_token='ĠVision'), 28968: Logprob(logprob=-2.0000319480895996, rank=2, decoded_token='ĠVari'), 8519: Logprob(logprob=-2.5000319480895996, rank=3, decoded_token='ĠVideo'), 21382: Logprob(logprob=-2.6562819480895996, rank=4, decoded_token='ĠVirtual'), 20796: Logprob(logprob=-2.7187819480895996, rank=5, decoded_token='ĠVisual')}
[2025-06-10T18:40:56Z] gguf:	"That's correct. VLLM stands for Virtual Language Learning Model, which is a type of large language model designed for high-throughput and memory-efficient inference and"	{21382: Logprob(logprob=-1.9463169574737549, rank=1, decoded_token='ĠVirtual'), 330: Logprob(logprob=-2.274441957473755, rank=2, decoded_token='Ġ"'), 15668: Logprob(logprob=-2.383816957473755, rank=3, decoded_token='ĠVery'), 4196: Logprob(logprob=-2.446316957473755, rank=4, decoded_token='ĠVal'), 28968: Logprob(logprob=-2.540066957473755, rank=5, decoded_token='ĠVari')}
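
To make the log above easier to read, here is a minimal sketch of the kind of top-k cross-check that would fail on this data. It assumes the test requires each model's chosen token at the first diverging position to appear among the other model's top-5 logprobs. The helper names `chosen_token` and `mutually_close` are hypothetical and are not taken from the vLLM test suite. The token ids and logprobs are copied from the log.

```python
# Top-5 candidates at the first diverging position, copied from the CI log.
original_top5 = {
    31541: -1.6094070672988892,  # 'ĠVision'
    28968: -2.0000319480895996,  # 'ĠVari'
    8519: -2.5000319480895996,   # 'ĠVideo'
    21382: -2.6562819480895996,  # 'ĠVirtual'
    20796: -2.7187819480895996,  # 'ĠVisual'
}
gguf_top5 = {
    21382: -1.9463169574737549,  # 'ĠVirtual'
    330: -2.274441957473755,     # 'Ġ"'
    15668: -2.383816957473755,   # 'ĠVery'
    4196: -2.446316957473755,    # 'ĠVal'
    28968: -2.540066957473755,   # 'ĠVari'
}


def chosen_token(top_k: dict[int, float]) -> int:
    """Return the rank-1 token id, i.e. the one with the highest logprob."""
    return max(top_k, key=top_k.get)


def mutually_close(top_a: dict[int, float], top_b: dict[int, float]) -> bool:
    """Hypothetical check: each side's chosen token must be in the other's top-k."""
    return chosen_token(top_a) in top_b and chosen_token(top_b) in top_a


original_choice = chosen_token(original_top5)  # 31541 ('ĠVision')
gguf_choice = chosen_token(gguf_top5)          # 21382 ('ĠVirtual')

print("gguf choice in original top-5:", gguf_choice in original_top5)
print("original choice in gguf top-5:", original_choice in gguf_top5)
print("mutually close:", mutually_close(original_top5, gguf_top5))
```

Under that assumption, the mismatch is one-sided: 'ĠVirtual' is ranked 4th by the original model, but 'ĠVision' is absent from the GGUF model's top 5, which is enough for the check to fail.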

📝 History of failing test

The earliest failure I found was on Monday, May 26 at 8:27 AM:
[CI/Build] Split pooling and generation extended language models tests in CI (#18705)
https://buildkite.com/organizations/vllm/analytics/suites/ci-1/tests/94a54396-ec5f-8d47-8b48-6c88a2d4e5cb?period=28days&tags=scm.branch%3Amain&execution_id=01970c90-0b2c-7f2b-b3ad-d7bcc06f340b

Metadata

Labels

ci-failure: Issue about an unexpected test failure in CI
