[Misc] Add override for allreduce fusion thresholds #23639

nvjullin · 2025-08-26T08:58:15Z

Purpose

The long-term plan is being discussed in #22086, but it is nontrivial and will take more time.
In the meantime, this PR adds an override so that we can benchmark the effect of tuning the thresholds and unblock people.

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Julien Lin <jullin@nvidia.com>

gemini-code-assist

Code Review

This pull request introduces an environment variable to override allreduce fusion thresholds, which is a good feature for benchmarking. However, the implementation lacks robust error handling for the user-provided configuration. My review includes suggestions to prevent application crashes from malformed JSON or invalid data types in the environment variable, improving the overall robustness of this new feature.
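As an illustration of the kind of validation this review asks for, here is a minimal sketch. It assumes the override is a JSON object mapping world size to a maximum size in MB, as the comment in `vllm/envs.py` quoted later in this thread describes. The function name `parse_thresholds_mb` is hypothetical and is not taken from the PR.

```python
import json


def parse_thresholds_mb(raw: str) -> dict[int, float]:
    """Parse a JSON object such as '{"2": 32, "8": 0.25}' into {world_size: MB}.

    Hypothetical helper: raises ValueError with a readable message instead of
    letting a malformed override crash with an unrelated traceback.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"Thresholds override is not valid JSON: {e}") from e
    if not isinstance(data, dict):
        raise ValueError("Thresholds override must be a JSON object")

    parsed: dict[int, float] = {}
    for key, value in data.items():
        try:
            world_size = int(key)   # JSON object keys are always strings
            size_mb = float(value)
        except (TypeError, ValueError) as e:
            raise ValueError(
                f"Invalid threshold entry {key!r}: {value!r}") from e
        if world_size <= 0 or size_mb < 0:
            raise ValueError(
                f"Invalid threshold entry {key!r}: {value!r}")
        parsed[world_size] = size_mb
    return parsed
```

With this shape, a typo in the override fails at parse time with a message that names the offending entry.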

vllm/compilation/collective_fusion.py

vllm/envs.py

mgoin

LGTM, thank you!

ProExpertProg · 2025-08-26T14:25:51Z

vllm/envs.py

+    #     { <world size>: <max size in mb> }
+    # Unspecified world sizes will fallback to
+    #     { 2: 64, 4: 1, <everything else>: 0.5 }
+    "VLLM_FLASHINFER_ALLREDUCE_FUSION_THRESHOLDS_MB":


Could we make this a CompilationConfig.PassConfig variable? It would be good for it to remain tunable even after #22086 lands.
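A rough sketch of what a config-based override could look like. The class `PassConfigSketch` and the field name `allreduce_fusion_thresholds_mb` are hypothetical stand-ins, not the actual `PassConfig` API. The defaults are the ones listed in the env var comment above.

```python
from dataclasses import dataclass, field

MiB = 1024 * 1024


@dataclass
class PassConfigSketch:
    # Hypothetical field: {world size: max size in MB}, empty means "use defaults".
    allreduce_fusion_thresholds_mb: dict[int, float] = field(default_factory=dict)

    def max_size_bytes(self, world_size: int) -> int:
        # Defaults taken from the env var comment: {2: 64, 4: 1, else: 0.5}.
        defaults_mb = {2: 64.0, 4: 1.0}
        merged = {**defaults_mb, **self.allreduce_fusion_thresholds_mb}
        return int(merged.get(world_size, 0.5) * MiB)


# Usage: override only the world sizes being benchmarked.
cfg = PassConfigSketch(allreduce_fusion_thresholds_mb={4: 2})
```

Keeping the override on a config object means each engine instance carries its own thresholds rather than sharing process-wide state.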

ProExpertProg · 2025-08-26T14:27:46Z

vllm/compilation/collective_fusion.py

+        _FI_MAX_SIZES.update({
+            int(k): int(float(v) * MiB)
+            for k, v in
+            envs.VLLM_FLASHINFER_ALLREDUCE_FUSION_THRESHOLDS_MB.items()


I think this read will only run at startup, but we generally want to read envs during init in case they change between different LLM instantiations. Putting it in the config and reading it at pass init time would be best.
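To illustrate the difference, here is a minimal sketch of a pass that reads the override in its constructor rather than at module import time. The class name `FusionPassSketch` is hypothetical, and the code reads `os.environ` directly instead of going through `vllm.envs`. The conversion to bytes mirrors the snippet quoted above.

```python
import json
import os

MiB = 1024 * 1024
ENV_NAME = "VLLM_FLASHINFER_ALLREDUCE_FUSION_THRESHOLDS_MB"

# Defaults from the env var comment: {2: 64, 4: 1, <everything else>: 0.5} in MB.
DEFAULT_MAX_SIZES = {2: 64 * MiB, 4: 1 * MiB}
FALLBACK_MAX_SIZE = MiB // 2


class FusionPassSketch:
    def __init__(self, world_size: int):
        # Read the override on every construction, so a value changed between
        # two instantiations in the same process is picked up.
        overrides = json.loads(os.environ.get(ENV_NAME, "{}"))
        max_sizes = dict(DEFAULT_MAX_SIZES)  # copy: never mutate module state
        max_sizes.update(
            {int(k): int(float(v) * MiB) for k, v in overrides.items()})
        self.max_size = max_sizes.get(world_size, FALLBACK_MAX_SIZE)
```

Because the defaults are copied per instance, an override applied for one instantiation does not leak into the next one.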

nvpohanh · 2025-08-27

@ProExpertProg @ilmarkov Should we just replace fi_allreduce_fusion_max_token_num with something like fi_allreduce_fusion_thresholds_mb? Why do we need to check both num_tokens and message size?

https://github.com/vllm-project/vllm/blob/main/vllm/config/compilation.py#L90

cc @nvjullin
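For context on the question above, a small sketch of how a token limit and a message-size limit relate. It assumes the fused allreduce operates on an activation tensor of shape `[num_tokens, hidden_size]`, which is not confirmed in this thread. The function names are hypothetical.

```python
MiB = 1024 * 1024


def allreduce_message_bytes(num_tokens: int, hidden_size: int,
                            element_size: int) -> int:
    """Bytes in a [num_tokens, hidden_size] tensor with the given element size."""
    return num_tokens * hidden_size * element_size


def max_tokens_under_threshold(threshold_mb: float, hidden_size: int,
                               element_size: int) -> int:
    """Largest token count whose message does not exceed threshold_mb."""
    return int(threshold_mb * MiB) // (hidden_size * element_size)


# Example: hidden size 4096 with 2-byte elements (e.g. bf16) and a 1 MB threshold.
tokens = max_tokens_under_threshold(1, hidden_size=4096, element_size=2)
```

Under that assumption, a size threshold already implies a token limit for a given hidden size and dtype, which is the redundancy the question points at.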

ProExpertProg · 2025-08-26T22:38:06Z

@nvjullin I did not realize this was on auto-merge. Can you address the comments in a follow-up, please?

nvjullin added 2 commits August 26, 2025 08:54

added override for allreduce fusion thresholds

4124d74

Signed-off-by: Julien Lin <jullin@nvidia.com>

better comment

d3c8887

Signed-off-by: Julien Lin <jullin@nvidia.com>

nvjullin requested review from ProExpertProg, youkaichao and zou3519 as code owners August 26, 2025 08:58

gemini-code-assist bot reviewed Aug 26, 2025

vllm/compilation/collective_fusion.py (outdated, resolved)

vllm/envs.py (outdated, resolved)

nvjullin changed the title ~~Add override for allreduce fusion thresholds~~ [Misc] Add override for allreduce fusion thresholds Aug 26, 2025

nvjullin added 2 commits August 26, 2025 09:13

added better error message

b2cdd79

Signed-off-by: Julien Lin <jullin@nvidia.com>

allreduce should be in the name

8683d65

Signed-off-by: Julien Lin <jullin@nvidia.com>

mgoin approved these changes Aug 26, 2025

mgoin enabled auto-merge (squash) August 26, 2025 13:53

github-actions bot added the ready label Aug 26, 2025

ProExpertProg reviewed Aug 26, 2025

mgoin merged commit 7ea22e4 into vllm-project:main Aug 26, 2025
45 of 46 checks passed

tc-mb pushed a commit to tc-mb/vllm that referenced this pull request Aug 27, 2025

[Misc] Add override for allreduce fusion thresholds (vllm-project#23639)

c78199f

Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: tc-mb <caitianchi@modelbest.cn>

nvjullin mentioned this pull request Aug 27, 2025

[Misc] Moved override for allreduce fusion thresholds from env var to config #23722 (Open, 5 tasks)

epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025

[Misc] Add override for allreduce fusion thresholds (vllm-project#23639)

a910cb2

Signed-off-by: Julien Lin <jullin@nvidia.com>

xiao-llm pushed a commit to xiao-llm/vllm that referenced this pull request Aug 28, 2025

[Misc] Add override for allreduce fusion thresholds (vllm-project#23639)

b683fbd

Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025

[Misc] Add override for allreduce fusion thresholds (vllm-project#23639)

b90d384

Signed-off-by: Julien Lin <jullin@nvidia.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025

[Misc] Add override for allreduce fusion thresholds (vllm-project#23639)

f83013c

Signed-off-by: Julien Lin <jullin@nvidia.com>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[Misc] Add override for allreduce fusion thresholds (vllm-project#23639)

2588b5f

Signed-off-by: Julien Lin <jullin@nvidia.com>

nvpohanh Aug 27, 2025
