Does anyone know whether llama has an implementation that supports kernel fusion when running on NVIDIA GPUs? #313
Didymos056
started this conversation in
General
Replies: 0 comments
Something like the fusion of several operators described here: https://ai.lefebvre-sarrut.eu/2023/07/20/deep-dive-into-kernel-fusion-accelerating-inference-in-llama-v2/ , including RMS and similar ops.