Update to torch==2.6.0 #12721
Conversation
Signed-off-by: mgoin <michael@neuralmagic.com>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Nice, CI looks green
Shall we merge #12393 first? cc: @youkaichao
LGTM. I built vLLM by merging this PR, and it worked perfectly 🚀
Confirmed that this update will break V1 in its current state; we should wait for #12393 at least.
@mgoin can you help review and stamp that PR?
@mgoin Thanks a lot for the update. IPEX CPU w/ PT 2.6 will be released next week. Will update on this as soon as the binary is out. Thanks, -yuan
This pull request has merge conflicts that must be resolved before it can be merged.
When will this PR be merged?
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
LoRA and multi-modal tests should be fixed on main; let's see what errors are left.
There seems to be an import error in …
Yeah, looking into it, it seems that …
Signed-off-by: luka <luka@neuralmagic.com>
Possibly good to go now?? 🤞 🤞 edit: of course not -- I'll fix the pre-commit
Hi, how can I build vLLM using torch 2.5.1 after this PR? Has anyone succeeded?
Can you try …
I am trying: …
Are you getting an error? You might need to downgrade other dependencies as well; that would be my only other guess.
I am building wheels for torch 2.5.1, but I am running into many errors. I hope vLLM could officially provide wheels for torch 2.5.1, since torch 2.6.0 leads to many dependency problems when using vLLM with integrations such as verl or ms-swift.
Could you create a new issue and post the errors? I don't think providing official 2.5.1 wheels is on the roadmap for v0.8.0+. But you're welcome to use an earlier version or cherry-pick the commits you need.
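As a side note for anyone debugging this locally, here is a stdlib-only sketch for inspecting which torch-family versions an installed vLLM build declares before attempting a downgrade. This is purely illustrative and not a command suggested anywhere in this thread:

```python
# Purely illustrative: list the torch-family pins declared by the installed
# vLLM distribution, and compare against the torch that is actually installed.
import importlib.metadata as md

print("vllm:", md.version("vllm"))
for req in md.requires("vllm") or []:
    if req.startswith("torch"):  # matches torch, torchvision, torchaudio pins
        print("  requires:", req)
print("torch installed:", md.version("torch"))
```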
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: luka <luka@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Summary: In torch 2.6.0, torch accidentally changed the default for custom operators to be "requires_contiguous". As a workaround, vLLM added needs_fixed_stride_order to a large number of custom operators. vLLM is currently on torch 2.7.0, which has reverted the default for custom operators back to needs_fixed_stride_order. This PR cleans up the kernel logic by flipping the default back. The other reason I want to flip the default back is that needs_fixed_stride_order is actually buggy, and torch 2.8.0 has better behavior for custom operators with no layout tags set. Also, Kaichao tells me that some backends may not have moved to PyTorch 2.7.0 yet (vllm-project#8932), so I didn't delete the code in this PR.

Test Plan:
- Existing tests
- Ran `pytest tests/compile/test_full_graph.py` (this was the test that originally caused us to add the needs_fixed_stride_order tag, see vllm-project#12721 for context)

Signed-off-by: rzou <zou3519@gmail.com>
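For context, a minimal sketch of what tagging a custom operator with `needs_fixed_stride_order` looks like via `torch.library`. The namespace, schema, and implementation below are hypothetical illustrations, not vLLM's actual registration helper:

```python
# Illustrative only: a toy custom op tagged so torch.compile preserves the
# inputs' original stride order instead of assuming/forcing another layout.
import torch
from torch.library import Library

# Hypothetical namespace; vLLM registers its ops through its own helper code.
my_lib = Library("my_ops", "FRAGMENT")

my_lib.define(
    "scaled_add(Tensor a, Tensor b) -> Tensor",
    tags=(torch.Tag.needs_fixed_stride_order,),
)

def scaled_add_impl(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # A real kernel would depend on the input strides; this stand-in does not.
    return a + 2 * b

my_lib.impl("scaled_add", scaled_add_impl, "CompositeExplicitAutograd")

# Usage: the op is now reachable under torch.ops and carries the layout tag.
out = torch.ops.my_ops.scaled_add(torch.randn(4, 4), torch.randn(4, 4))
```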
Only updates torch for CUDA. Successfully built locally on an H100 CUDA 12.5 system and tested with:
vllm serve meta-llama/Llama-3.1-8B-Instruct
We should upgrade other hardware backends separately. For instance, CPU is blocked by IPEX in Dockerfile.cpu.
FIX #12719
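For readers upgrading locally, a quick offline smoke test one could run after the update. This is illustrative only and not part of this PR's test plan (the PR was validated with `vllm serve` as noted above); the model is the same one used there and requires GPU access and the gated weights:

```python
# Illustrative post-upgrade smoke test, not part of this PR's test plan.
import torch
from vllm import LLM, SamplingParams

# Confirm the environment actually picked up the new torch pin.
assert torch.__version__.startswith("2.6"), torch.__version__

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
outputs = llm.generate(["Hello from torch 2.6!"],
                       SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```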