
Conversation


@dc3671 dc3671 commented Jul 24, 2023

Problem

After the Llama 2 update in transformers ([Llama2] Add support for Llama 2), autoTP for Llama is broken.
[screenshot: error when running autoTP with Llama 2]

Reason

After digging into the transformers modification, I noticed that there is a new self.num_key_value_heads attribute in the attention module:
[screenshot: the new num_key_value_heads attribute in the transformers attention module]
https://github.com/huggingface/transformers/pull/24891/files#diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51R253
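
For reference, a simplified sketch of what the linked transformers change introduces (paraphrased, not the verbatim transformers code): the Llama 2 attention module adds grouped-query attention, so the k/v projections are sized by num_key_value_heads rather than num_heads, and that attribute must also be divided when the module is tensor-parallelized.

```python
import torch.nn as nn

class LlamaAttentionSketch(nn.Module):
    """Paraphrased sketch of transformers' LlamaAttention after the Llama 2 change."""

    def __init__(self, config):
        super().__init__()
        self.hidden_size = config.hidden_size
        self.num_heads = config.num_attention_heads
        self.head_dim = self.hidden_size // self.num_heads
        # New attribute introduced for Llama 2 grouped-query attention
        self.num_key_value_heads = config.num_key_value_heads
        self.num_key_value_groups = self.num_heads // self.num_key_value_heads

        self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)
        # k/v projections now depend on num_key_value_heads, not num_heads
        self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)
```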

Solution

In DeepSpeed, this part is controlled by a hardcoded name list in update_mp_params of autoTP's replace_wo_policy (see this PR's modification). I added one more key to that list, which fixes the problem.
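
To illustrate the mechanism, here is a simplified sketch of the idea (not the verbatim DeepSpeed code; the real list in deepspeed/module_inject/replace_module.py is longer and may differ across versions): update_mp_params walks a fixed list of attribute names on each replaced module and divides each one by the tensor-parallel size, so num_key_value_heads just needs to appear in that list.

```python
# Simplified sketch of autoTP's update_mp_params in replace_wo_policy
# (illustrative only; attribute names shown are representative).
def update_mp_params(child, mp_size):
    # Hardcoded attribute names that must be divided by the tensor-parallel size
    for param in [
            "n_heads", "num_heads", "num_attention_heads",
            "all_head_size", "embed_dim", "hidden_size",
            "num_key_value_heads",  # new key added by this PR for Llama 2 GQA
    ]:
        if hasattr(child, param):
            param_val = getattr(child, param)
            assert param_val % mp_size == 0, \
                f"{param} ({param_val}) must be divisible by mp_size ({mp_size})"
            setattr(child, param, param_val // mp_size)
```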

@tjruwase @jeffra please kindly review, thanks~

@mrwyattii mrwyattii added this pull request to the merge queue Jul 24, 2023
Merged via the queue into deepspeedai:master with commit f3943cf Jul 24, 2023
@dc3671 dc3671 deleted the fix-llama2-autotp branch August 8, 2023 06:37
blzheng pushed a commit to blzheng/DeepSpeedSYCLSupport that referenced this pull request Aug 15, 2023
delock pushed a commit to delock/DeepSpeedSYCLSupport that referenced this pull request Aug 16, 2023
* add lm_head tensor parallel

* fix conflict

* add embed_out tp

* add llama2 autoTP support in replace_module (deepspeedai#4022)

---------

Co-authored-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Dino Chen <zhenhuan.chen@intel.com>