
[MFM-2025-02-21] Merge main to llama fp8, DeepSeekV3 and PTPC-FP8 #445

Merged
merged 1,120 commits into ROCm:llama_fp8_12062024 on Feb 25, 2025

Conversation


@tjtanaa tjtanaa commented Feb 24, 2025

Please direct your PRs to the upstream vLLM repository (https://github.com/vllm-project/vllm.git).

Accepting PRs into the ROCm fork (https://github.com/ROCm/vllm) requires a clear, previously communicated exception.

maleksan85 and others added 30 commits February 5, 2025 03:58
…lling (vllm-project#12713)

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>
Co-authored-by: Aleksandr Malyshev <maleksan@amd.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…llm-project#12634)

Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
…n ROCm (ROCm#406)

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Merged via CLI script
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
WoosukKwon and others added 26 commits February 16, 2025 09:39
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
…ct#13362)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
…oject#12304)

Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: yan ma <yan.ma@intel.com>
* Enabling ROCm CI on MI250 machines:
- correct build target
- correct queue

Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>

---------

Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
* Optimization for quantized gemm skinny sizes

* lint fix

* Add support for bf16/fp16

* code cleanup

* code cleanup

* lint fix2

* cleanup

* Moved the logic into tuned gemm to preserve API compatibility

---------

Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
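The commit trail above mentions moving the skinny-size fast path "into tuned gemm to preserve API compatibility", i.e. dispatching to a specialized kernel inside the existing entry point so callers see no interface change. A minimal sketch of that pattern, with all names (`tuned_mm`, `_skinny_mm`, the `n <= 8` threshold) being illustrative assumptions rather than vLLM's actual code:

```python
import numpy as np

# Assumed cutoff for "skinny" right-hand matrices; the real tuned value
# in the commits above is not shown here.
SKINNY_N_THRESHOLD = 8


def _skinny_mm(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Stand-in for a specialized skinny-GEMM kernel (here plain matmul)."""
    return a @ b


def _general_mm(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Stand-in for the general tuned GEMM path."""
    return a @ b


def tuned_mm(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Single public entry point: callers are unchanged, the skinny-size
    optimization is selected internally based on the output width."""
    if b.shape[1] <= SKINNY_N_THRESHOLD:
        return _skinny_mm(a, b)
    return _general_mm(a, b)
```

Keeping the dispatch inside the one public function is what lets the optimization land without touching call sites, which is the API-compatibility point the commit message makes.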
* Removing gfx940 and gfx941 targets. These have been deprecated in favor of gfx942 for MI300X

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>

* Remove from custom kernels as well

---------

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
* Advance torch commit to be past pytorch/pytorch#144942 to fix tunable ops

* Make sure to use the submodule commit compatible with the main aiter commit
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
…edLinear layer

Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
@tjtanaa tjtanaa marked this pull request as ready for review February 24, 2025 10:04
@hongxiayang hongxiayang merged commit d7fefdf into ROCm:llama_fp8_12062024 Feb 25, 2025
1 of 2 checks passed