Add KVCache trans for long sequence && tuned comm for faster Addreduce #279
Force-pushed from 4cfb6d9 to 99731f4.
| /** | ||
| * Tensor specially designed for KV Cache | ||
| * Naturally, it could be represented in the shape of [seq_length][batch_size][head_num][head_size] |
Please also update the comments here to match the new layout.
Done. The new layout is disabled by default.
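For readers following the layout discussion, here is a minimal sketch of how a flat offset could be computed for the natural shape quoted in the comment, next to one possible transposed ordering. The struct and function names are hypothetical, and the transposed ordering `[batch_size][head_num][seq_length][head_size]` is an assumption for illustration; the PR excerpt does not state which ordering it actually uses.

```cpp
#include <cstddef>

// Dimensions of a KV cache tensor. Names are illustrative, not taken from the PR.
struct KVShape {
    std::size_t seqLen;
    std::size_t batchSize;
    std::size_t headNum;
    std::size_t headSize;
};

// Flat offset of element (s, b, h, d) in the natural layout
// [seq_length][batch_size][head_num][head_size].
inline std::size_t offsetNatural(const KVShape &sh, std::size_t s, std::size_t b,
                                 std::size_t h, std::size_t d) {
    return ((s * sh.batchSize + b) * sh.headNum + h) * sh.headSize + d;
}

// ASSUMPTION: one possible transposed layout,
// [batch_size][head_num][seq_length][head_size]. The real layout in the PR may differ.
inline std::size_t offsetTransposed(const KVShape &sh, std::size_t s, std::size_t b,
                                    std::size_t h, std::size_t d) {
    return ((b * sh.headNum + h) * sh.seqLen + s) * sh.headSize + d;
}
```

Under this assumed ordering, consecutive sequence positions of a single head sit `headSize` elements apart instead of `batchSize * headNum * headSize` apart, which is the kind of locality a transposed cache would plausibly target for long sequences.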
src/models/env_config.cpp
Outdated
| static int kvTrans = -1; | ||
| if (kvTrans == -1) { | ||
| kvTrans = (getenv("ENABLE_KV_TRANS") ? atoi(getenv("ENABLE_KV_TRANS")) : 0); | ||
| // if (kvTrans == 1) |
Please remove it if it is not used.
Added some comments.
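As a side note on the snippet under review, the read-once pattern can be written without the `-1` sentinel and without calling `getenv` twice. The following is only a sketch: the helper `envInt` and the accessor name `kvTransEnabled` are hypothetical, and the `== 1` comparison is assumed by analogy with the `catMlp` accessor shown in the diff.

```cpp
#include <cstdlib>

// Parse an integer environment variable; return `fallback` when it is unset.
// Hypothetical helper, not part of the PR.
inline int envInt(const char *name, int fallback) {
    const char *value = std::getenv(name);
    return value ? std::atoi(value) : fallback;
}

// Hypothetical accessor: reads ENABLE_KV_TRANS once and caches the result.
// A function-local static is initialized exactly once (thread-safe since C++11).
inline bool kvTransEnabled() {
    static const int kvTrans = envInt("ENABLE_KV_TRANS", 0);
    // ASSUMPTION: "1" enables the transposed KV cache; any other value disables it.
    return kvTrans == 1;
}
```

Because the value is cached on first use, changing `ENABLE_KV_TRANS` after the first call has no effect, which matches the behavior of the sentinel-based version in the diff.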
| return catMlp == 1; | ||
| } | ||
| bool tunedComm() { |
It is not easy to understand what "Tuned communication" means. Could you add a comment?
Added.
Force-pushed from 9dd2a0f to fc3e8ff.
Force-pushed from fc3e8ff to 5da8280.
Transpose KVCache with env var "ENABLE_KV_TRANS" for long sequence
Tune addreduce between shm and ccl for faster comm