
[Kernel][Hardware][AMD] Custom paged attention kernel for ROCm #8310

Merged
22 commits
8f79316
add kernel folder for hip and update CMake file
charlifu Aug 23, 2024
68230e8
Merge branch 'main' into charlifu/separate_paged_attention_kernel_on_…
charlifu Aug 25, 2024
f6e5a03
Merge branch 'vllm-project:main' into charlifu/separate_paged_attenti…
charlifu Aug 26, 2024
4fcd064
Merge branch 'vllm-project:main' into charlifu/separate_paged_attenti…
charlifu Aug 26, 2024
2007a8f
Merge branch 'vllm-project:main' into charlifu/separate_paged_attenti…
charlifu Aug 27, 2024
19c0de2
Merge branch 'charlifu/separate_paged_attention_kernel_on_rocm' of ht…
charlifu Aug 27, 2024
5f605bc
register custom op
charlifu Aug 27, 2024
04a4ef6
Merge branch 'vllm-project:main' into charlifu/separate_paged_attenti…
charlifu Sep 5, 2024
41ff949
Merge branch 'charlifu/separate_paged_attention_kernel_on_rocm' of ht…
charlifu Sep 5, 2024
9b6ec09
Merge branch 'vllm-project:main' into charlifu/separate_paged_attenti…
charlifu Sep 9, 2024
554804b
add paged attention for rocm
charlifu Sep 9, 2024
a7623ea
Merge branch 'charlifu/separate_paged_attention_kernel_on_rocm' of ht…
charlifu Sep 9, 2024
2763778
enable custom page attn and unit test
charlifu Sep 9, 2024
5951a52
fix v1/v2
charlifu Sep 9, 2024
0f66eb9
add hip back to gitignore
charlifu Sep 10, 2024
79449fe
linting
charlifu Sep 10, 2024
6f40079
remove unneeded code
charlifu Sep 10, 2024
1e99bb1
Update CMakeLists.txt
charlifu Sep 12, 2024
2fc628c
add empty line and remove env
charlifu Sep 12, 2024
f573376
move kernel selection for rocm to rocm_flash_attn.py
charlifu Sep 13, 2024
208c9b3
remove redundant codes
charlifu Sep 13, 2024
4858ebe
Merge branch 'vllm-project:main' into charlifu/separate_paged_attenti…
charlifu Sep 13, 2024
25 changes: 24 additions & 1 deletion CMakeLists.txt
@@ -313,12 +313,35 @@ define_gpu_extension_target(
WITH_SOABI)


if(VLLM_GPU_LANG STREQUAL "HIP")
  #
  # _rocm_C extension
  #
  set(VLLM_ROCM_EXT_SRC
    "csrc/rocm/torch_bindings.cpp"
    "csrc/rocm/attention.cu")

  define_gpu_extension_target(
    _rocm_C
    DESTINATION vllm
    LANGUAGE ${VLLM_GPU_LANG}
    SOURCES ${VLLM_ROCM_EXT_SRC}
    COMPILE_FLAGS ${VLLM_GPU_FLAGS}
    ARCHITECTURES ${VLLM_GPU_ARCHES}
    USE_SABI 3
    WITH_SOABI)
endif()


if(VLLM_GPU_LANG STREQUAL "CUDA" OR VLLM_GPU_LANG STREQUAL "HIP")
  message(STATUS "Enabling C extension.")
  add_dependencies(default _C)

  message(STATUS "Enabling moe extension.")
  add_dependencies(default _moe_C)
endif()

if(VLLM_GPU_LANG STREQUAL "HIP")
  message(STATUS "Enabling rocm extension.")
  add_dependencies(default _rocm_C)
endif()
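Per the commit "move kernel selection for rocm to rocm_flash_attn.py", the custom HIP kernel built by the `_rocm_C` extension above is only used when the request shape fits what the kernel supports; otherwise the generic paged-attention V1/V2 path is used. A minimal sketch of that kind of selection guard is below. The function name, thresholds, and parameter list are illustrative assumptions, not vLLM's actual API.

```python
# Hypothetical sketch of a ROCm custom-paged-attention selection guard.
# All names and limits here are illustrative; the real check lives in
# vllm's rocm_flash_attn.py and may differ.

def use_rocm_custom_paged_attention(kv_cache_dtype: str,
                                    head_size: int,
                                    block_size: int,
                                    gqa_ratio: int,
                                    max_seq_len: int) -> bool:
    """Return True only when every constraint of the custom HIP kernel holds;
    otherwise the caller falls back to the generic V1/V2 kernels."""
    return (kv_cache_dtype == "auto"        # assumed: no fp8 KV-cache support
            and head_size in (64, 128)      # kernel specialized per head size
            and block_size in (16, 32)      # supported paged KV block sizes
            and 1 <= gqa_ratio <= 16        # query heads per KV head
            and max_seq_len <= 32768)       # assumed shared-memory/grid limit
```

For example, a 128-head-size, block-size-16 decode would take the custom kernel, while an fp8 KV cache would fall back:

```python
assert use_rocm_custom_paged_attention("auto", 128, 16, 8, 4096)
assert not use_rocm_custom_paged_attention("fp8", 128, 16, 8, 4096)
```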