expand coverage of gpt2 model loading (#271)
twaka authored Jun 27, 2023
1 parent 43710e8 commit 4026a04
1 changed file: vllm/model_executor/models/gpt2.py (4 additions, 2 deletions)
@@ -228,11 +228,13 @@ def load_weights(self, model_name_or_path: str,
                 # GPT-2 ties the weights of the embedding layer and the final
                 # linear layer.
                 continue
-            if ".attn.bias" in name:
+            if ".attn.bias" in name or ".attn.masked_bias" in name:
                 # Skip attention mask.
                 # NOTE: "c_attn.bias" should not be skipped.
                 continue
-            name = "transformer." + name
+
+            if not name.startswith("transformer."):
+                name = "transformer." + name
+
             # The HF's GPT-2 implementation uses Conv1D instead of Linear.
             # Because of this, we need to transpose the weights.
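For context, the checkpoint-name handling that this commit extends can be sketched as a standalone function. This is a minimal illustration of the diff's logic, not vLLM's actual code: the helper name `normalize_gpt2_name` is hypothetical, and in vLLM this logic is inlined inside the model's `load_weights` loop.

```python
def normalize_gpt2_name(name: str):
    """Map an HF GPT-2 checkpoint parameter name to the internal name,
    or return None when the parameter should be skipped entirely.
    Hypothetical helper illustrating the diff above."""
    if "lm_head.weight" in name:
        # GPT-2 ties the embedding and final linear layer weights,
        # so the lm_head copy is redundant.
        return None
    if ".attn.bias" in name or ".attn.masked_bias" in name:
        # Skip the causal attention-mask buffers. Note the leading dot:
        # "c_attn.bias" is a real parameter (underscore, not dot, before
        # "attn") and must not be skipped.
        return None
    if not name.startswith("transformer."):
        # Some checkpoints already carry the "transformer." prefix;
        # only prepend it when it is missing (the point of this commit).
        name = "transformer." + name
    return name
```

The `startswith` guard is what the commit adds: before it, names that already began with `transformer.` would have been double-prefixed and failed to match any model parameter.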