Conversation

noooop
Contributor

@noooop noooop commented Aug 14, 2025

Purpose

Fix another flaky test by increasing tolerance. Related to #22862

FIX #22923

cc @maxdebayser @DarkLight1337

  • mteb_test now uses enforce_eager
  • Increase MTEB_EMBED_TOL from 1e-4 to 0.02
  • Add print("Model:", model_info.name) so the model name can be extracted from the log with a regular expression:
```python
import re

# Parse model names and MTEB main scores out of the test log.
with open("name.log") as f:
    log = f.read()

pattern = re.compile(
    r'^.*Model: (.+)\n.*VLLM: (.+) (.+)\n.+SentenceTransformers: (.+) (.+)\n.+Difference: (.+)\n',
    re.M)

for name, dtype, vllm_main_score, st_dtype, st_main_score, diff in pattern.findall(log):
    print(name, dtype, vllm_main_score, st_dtype, st_main_score, diff)
```
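For context, the relaxed tolerance amounts to a simple absolute-difference check against the SentenceTransformers reference score. A minimal sketch (the helper name and structure are illustrative, not the actual vLLM test code):

```python
# Sketch of how the relaxed tolerance is applied; `scores_match` is an
# illustrative name, not a helper from the vLLM test suite.
MTEB_EMBED_TOL = 0.02  # raised from 1e-4 by this PR

def scores_match(vllm_main_score: float, st_main_score: float,
                 tol: float = MTEB_EMBED_TOL) -> bool:
    # The vLLM score must stay within `tol` of the
    # SentenceTransformers (float32) reference score.
    return abs(vllm_main_score - st_main_score) < tol

# Values taken from the jinaai/jina-embeddings-v3 row of the baseline
# posted later in this thread.
print(scores_match(0.824464932, 0.824413164))  # True
```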

Test Plan

mteb_test_embed_models

Test Result

pass

(Optional) Documentation Update


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a change to default pooling models to use eager execution, addressing potential numerical precision issues with torch.compile. The implementation correctly makes the enforce_eager configuration optional and sets its default value based on the model's runner type. My review identified a critical issue where a new assertion could cause a crash if model_config is not set. I have provided a suggestion to fix this issue. Otherwise, the changes are sound.

@noooop noooop force-pushed the pooling_enforce_eager branch from 847ad4a to 88787ba Compare August 14, 2025 05:51
@noooop noooop marked this pull request as draft August 14, 2025 06:03
@noooop noooop force-pushed the pooling_enforce_eager branch from cf40990 to 39f9f90 Compare August 14, 2025 06:04
@noooop noooop changed the title [Model] Pooling models default to using enforce_eager [WIP] Try to fix numerical issues in embedding models Aug 14, 2025
@noooop noooop closed this Aug 14, 2025
@noooop noooop reopened this Aug 15, 2025
@noooop noooop force-pushed the pooling_enforce_eager branch from 39f9f90 to d3c9154 Compare August 15, 2025 02:58
@noooop noooop marked this pull request as ready for review August 15, 2025 03:00
@noooop noooop changed the title [WIP] Try to fix numerical issues in embedding models [CI] Pooling models mteb test uses enforce_eager Aug 15, 2025
@noooop
Contributor Author

noooop commented Aug 15, 2025

cc @DarkLight1337

@noooop noooop mentioned this pull request Aug 15, 2025
Member

@DarkLight1337 DarkLight1337 left a comment


Yeah let's increase the tolerance for now, seems that there are more and more tests that fail the CI

@noooop
Contributor Author

noooop commented Aug 15, 2025

> Yeah let's increase the tolerance for now, seems that there are more and more tests that fail the CI

First, this excludes the effects of torch.compile, which may be the biggest difference between v0 and v1.

I plan to collect numerical-precision statistics over the long term to see which factors affect precision.
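One way such long-term statistics could be gathered (a hypothetical sketch; none of these names come from the vLLM code base): record the per-run vLLM-vs-SentenceTransformers score difference for each model and summarize the history.

```python
from statistics import mean

# Hypothetical sketch for tracking precision drift across CI runs.
def summarize_diffs(history: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    # Map model name -> (mean diff, max diff) over all recorded runs.
    return {name: (mean(d), max(d)) for name, d in history.items()}

history = {
    "BAAI/bge-m3": [0.00002, 0.00003],
    "thenlper/gte-large": [0.00002],
}
print(summarize_diffs(history))
```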

@noooop
Contributor Author

noooop commented Aug 15, 2025

@DarkLight1337

Let's observe for a long time.

20250815 baseline:

| name | dtype | vllm_main_score | st_dtype | st_main_score | diff |
|------|-------|-----------------|----------|---------------|------|
| jinaai/jina-embeddings-v3 | torch.bfloat16 | 0.824464932 | torch.float32 | 0.824413164 | 0.00005 |
| BAAI/bge-m3 | torch.float16 | 0.787319813 | torch.float32 | 0.787343078 | 0.00002 |
| intfloat/multilingual-e5-base | torch.float16 | 0.779364278 | torch.float32 | 0.779325955 | 0.00004 |
| BAAI/bge-base-en | torch.float16 | 0.779340357 | torch.float32 | 0.779336792 | 0.00000 |
| Alibaba-NLP/gte-multilingual-base | torch.float16 | 0.775063329 | torch.float32 | 0.775074696 | 0.00001 |
| Qwen/Qwen3-Embedding-0.6B | torch.float32 | 0.771150305 | torch.float32 | 0.771163695 | 0.00001 |
| thenlper/gte-large | torch.float16 | 0.768054694 | torch.float32 | 0.76807651 | 0.00002 |
| BAAI/bge-code-v1 | torch.float32 | 0.757248666 | torch.float32 | 0.75724465 | 0.00000 |
| Alibaba-NLP/gte-modernbert-base | torch.float16 | 0.748158342 | torch.float32 | 0.748193353 | 0.00004 |
| Alibaba-NLP/gte-base-en-v1.5 | torch.float16 | 0.74427673 | torch.float32 | 0.744314039 | 0.00004 |
| intfloat/e5-small | torch.float16 | 0.74229612 | torch.float32 | 0.742285423 | 0.00001 |
| Alibaba-NLP/gte-large-en-v1.5 | torch.float16 | 0.73928701 | torch.float32 | 0.739301913 | 0.00001 |
| nomic-ai/nomic-embed-text-v1 | torch.float16 | 0.737572204 | torch.float32 | 0.737568559 | 0.00000 |
| nomic-ai/nomic-embed-text-v2-moe | torch.float16 | 0.715533087 | torch.float32 | 0.715488912 | 0.00004 |
| Snowflake/snowflake-arctic-embed-xs | torch.float16 | 0.714940761 | torch.float32 | 0.714927797 | 0.00001 |
| Snowflake/snowflake-arctic-embed-l-v2.0 | torch.float16 | 0.712266816 | torch.float32 | 0.712258299 | 0.00001 |
| Snowflake/snowflake-arctic-embed-m-v2.0 | torch.float16 | 0.706645738 | torch.float32 | 0.706622444 | 0.00002 |
| Snowflake/snowflake-arctic-embed-m-long | torch.float16 | 0.681194513 | torch.float32 | 0.681146831 | 0.00005 |
| Snowflake/snowflake-arctic-embed-m-v1.5 | torch.float16 | 0.649126693 | torch.float32 | 0.649088363 | 0.00004 |
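The diff column above can be checked mechanically against both tolerances; a small sketch with the values copied from the table:

```python
# Diff column from the 2025-08-15 baseline table above, one entry per model.
diffs = [0.00005, 0.00002, 0.00004, 0.00000, 0.00001, 0.00001, 0.00002,
         0.00000, 0.00004, 0.00004, 0.00001, 0.00001, 0.00000, 0.00004,
         0.00001, 0.00001, 0.00002, 0.00005, 0.00004]

assert len(diffs) == 19   # one entry per model in the table
assert max(diffs) < 1e-4  # this baseline run passes even the old tolerance
assert max(diffs) < 0.02  # and is far below the new MTEB_EMBED_TOL
print(max(diffs))
```

Note that this particular run sits well within the old 1e-4 tolerance, which is consistent with the test being flaky rather than consistently failing.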

@noooop
Contributor Author

noooop commented Aug 15, 2025

@DarkLight1337

Can we merge this PR?

@DarkLight1337
Member

The test passed so sure

@vllm-bot vllm-bot merged commit 5406ebf into vllm-project:main Aug 15, 2025
16 checks passed
@noooop noooop deleted the pooling_enforce_eager branch August 18, 2025 06:00
juuice-lee pushed a commit to juuice-lee/vllm-moe.code that referenced this pull request Aug 18, 2025
yiliu30 pushed a commit to yiliu30/vllm-fork that referenced this pull request Aug 19, 2025
divakar-amd pushed a commit to divakar-amd/vllm_upstream that referenced this pull request Aug 20, 2025
Gh0u1L5 pushed a commit to Gh0u1L5/vllm that referenced this pull request Aug 21, 2025
djmmoss pushed a commit to djmmoss/vllm that referenced this pull request Aug 21, 2025
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Duncan Moss <djm.moss@gmail.com>
BoyuanFeng pushed a commit to BoyuanFeng/vllm that referenced this pull request Aug 21, 2025
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Boyuan Feng <boyuan@meta.com>
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
xiao-llm pushed a commit to xiao-llm/vllm that referenced this pull request Aug 28, 2025
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Xiao Yu <xiao.yu@amd.com>
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
dumb0002 pushed a commit to dumb0002/vllm that referenced this pull request Aug 28, 2025
googlercolin pushed a commit to googlercolin/vllm that referenced this pull request Aug 29, 2025
Development

Successfully merging this pull request may close these issues.

[CI Failure][NIGHTLY FIRE DRILL]: Language Models (Extended Pooling)