(LLama-2-13b-hf) - Qlora - SFT - RuntimeError: mat1 and mat2 shapes cannot be multiplied (3264x5120 and 1x2560) #202

@aldrinc

Description
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path meta-llama/Llama-2-13b-hf \
    --do_train \
    --dataset oaast_sft \
    --finetuning_type lora \
    --quantization_bit 4 \
    --output_dir /workspace/llama-2-output \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --learning_rate 2e-5 \
    --num_train_epochs 0.5 \
    --plot_loss \
    --fp16

Server startup script

pip install --upgrade huggingface_hub
huggingface-cli login --token $HF_TOKEN
git clone https://github.com/hiyouga/LLaMA-Efficient-Tuning.git
cd LLaMA-Efficient-Tuning
pip install -r requirements.txt
pip install "bitsandbytes>=0.39.0"  # quotes needed, otherwise the shell treats >= as a redirection
pip install scipy
pip install -U git+https://github.com/huggingface/peft.git

An older closed issue suggests that upgrading PEFT (peft-0.5.0.dev0) fixes this, but I still receive the same error after upgrading.

This only happens with Llama-2-13b-hf; I am able to successfully SFT Vicuna and other models.
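For what it's worth, the reported shapes look consistent with a 4-bit quantization problem rather than a LoRA configuration problem: Llama-2-13b's hidden size is 5120, and bitsandbytes packs two 4-bit values per byte, so a packed weight that is fed to a matmul without being dequantized would show up with exactly half the expected elements (2560 = 5120 / 2). This interpretation is an assumption on my part, not confirmed by the maintainers, but the arithmetic lines up:

```python
# Hedged sketch: why the mismatched dimension is exactly half the hidden size.
# Assumption: the "1x2560" operand in the error is a 4-bit-packed weight
# that reached a matmul without being dequantized first.

HIDDEN_SIZE = 5120           # Llama-2-13b hidden dimension (from the model config)
BITS = 4                     # matches --quantization_bit 4
VALUES_PER_BYTE = 8 // BITS  # bitsandbytes stores two 4-bit values per uint8

packed_dim = HIDDEN_SIZE // VALUES_PER_BYTE
print(packed_dim)  # 2560 — matches the "1x2560" side of the error

# The activations (3264x5120) need a weight with 5120 rows, so the matmul
# can only succeed if the library dequantizes the packed weight back to
# its full shape before (or inside) the forward pass.
assert packed_dim == 2560
```

If that reading is right, it would explain why upgrading PEFT was suggested in the older issue: newer PEFT/bitsandbytes versions handle `Params4bit` layers during the LoRA forward pass.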

Labels: solved (This problem has been already solved)