[BUG] <title>LoRA fine-tuning of minicpm-v-2.6 fails with TypeError: __init__() got an unexpected keyword argument 'init_vision' #907

Open
@OnlylLin

Description

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

I was following the Feishu cloud document to LoRA fine-tune minicpm-v-2.6, and running bash finetune_lora.sh raised an error:

(Image: screenshot of the error)

My script is as follows:

#!/bin/bash
GPUS_PER_NODE=8
NNODES=1
NODE_RANK=0
MASTER_ADDR=localhost
MASTER_PORT=6001

MODEL="/home/xit/minicpm-v-2_6"
DATA="/home/xit/dataset/alldata.json"
EVAL_DATA="/home/xit/dataset/alldata.json"
LLM_TYPE="qwen2"

export NCCL_P2P_DISABLE=1 # remove this line on GPUs that support NCCL P2P, such as the A100
export NCCL_IB_DISABLE=1 # remove this line on GPUs such as the A100

DISTRIBUTED_ARGS="
--nproc_per_node $GPUS_PER_NODE
--nnodes $NNODES
--node_rank $NODE_RANK
--master_addr $MASTER_ADDR
--master_port $MASTER_PORT
"

CUDA_VISIBLE_DEVICES="4,5,6,7" torchrun $DISTRIBUTED_ARGS finetune.py \
    --model_name_or_path $MODEL \
    --llm_type $LLM_TYPE \
    --data_path $DATA \
    --eval_data_path $EVAL_DATA \
    --remove_unused_columns false \
    --label_names "labels" \
    --prediction_loss_only false \
    --bf16 false \
    --bf16_full_eval false \
    --fp16 true \
    --fp16_full_eval true \
    --do_train \
    --do_eval \
    --tune_vision true \
    --tune_llm false \
    --use_lora true \
    --lora_target_modules "llm\..*layers\.\d+\.self_attn\.(q_proj|k_proj|v_proj)" \
    --model_max_length 2048 \
    --max_slice_nums 9 \
    --max_steps 10000 \
    --eval_steps 1000 \
    --output_dir "/home/xit/output/loramodel" \
    --logging_dir "/home/xit/output/logging" \
    --logging_strategy "steps" \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "steps" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 10 \
    --learning_rate 1e-6 \
    --weight_decay 0.1 \
    --adam_beta2 0.95 \
    --warmup_ratio 0.01 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --gradient_checkpointing true \
    --deepspeed ds_config_zero2.json \
    --report_to "tensorboard"

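Expected Behavior

1. I hope this error can be resolved.
2. My server has 8 GPUs, and I only want to use GPUs 4, 5, 6, and 7. I added CUDA_VISIBLE_DEVICES="4,5,6,7" in front of torchrun, but I am not sure whether that is correct.
3. Thanks.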
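
On the GPU selection question: CUDA_VISIBLE_DEVICES="4,5,6,7" exposes only four devices to the launched processes (renumbered 0 to 3 inside them), while the script still sets GPUS_PER_NODE=8, so torchrun would try to start eight processes per node. A minimal sketch of keeping the two in sync is shown here, assuming the process count should simply equal the number of visible GPUs. The helper name count_visible_gpus is hypothetical and is not part of finetune_lora.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of finetune_lora.sh): count the GPU ids
# in a comma-separated CUDA_VISIBLE_DEVICES-style string.
count_visible_gpus() {
    devices="$1"
    if [ -z "$devices" ]; then
        echo 0
        return 0
    fi
    old_ifs="$IFS"
    IFS=','
    # Re-split the string on commas into the function's positional parameters.
    set -- $devices
    IFS="$old_ifs"
    echo "$#"
}

# Assumption: use exactly the GPUs named in the issue (4, 5, 6, 7).
export CUDA_VISIBLE_DEVICES="4,5,6,7"
GPUS_PER_NODE=$(count_visible_gpus "$CUDA_VISIBLE_DEVICES")
echo "GPUS_PER_NODE=$GPUS_PER_NODE"
```

With the variable exported once at the top of the script, the CUDA_VISIBLE_DEVICES prefix on the torchrun line becomes redundant, and --nproc_per_node $GPUS_PER_NODE follows the device list automatically. This only addresses the device count; it is not claimed to be related to the init_vision TypeError.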

Steps To Reproduce

1. Created a new conda environment and installed Python 3.9.
2. Ran pip install -r requirements.txt.
3. Ran the following commands:
git clone https://github.com/microsoft/DeepSpeed.git
cd DeepSpeed
DS_BUILD_FUSED_ADAM=1 pip install .
4. pip install peft
5. Ran bash finetune_lora.sh, which then raised the error.

Environment

- OS:ubuntu22
- Python:3.9
- Transformers:4.40.0
- PyTorch:2.4.0
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):12.4
My pip list:
accelerate                        1.2.0
addict                            2.4.0
aiofiles                          23.2.1
aiohappyeyeballs                  2.6.1
aiohttp                           3.11.16
aiosignal                         1.3.2
annotated-types                   0.7.0
anyio                             4.9.0
async-timeout                     5.0.1
attrs                             25.3.0
autoawq                           0.2.7.post2
bitsandbytes                      0.45.5
Brotli                            1.0.9
certifi                           2025.1.31
charset-normalizer                3.3.2
click                             8.1.8
cloudpickle                       3.1.1
cmake                             4.0.0
colorama                          0.4.6
contourpy                         1.3.0
cycler                            0.12.1
datasets                          3.5.0
decord                            0.6.0
deepspeed                         0.16.6+a21e5b9d
dill                              0.3.8
diskcache                         5.6.3
distro                            1.9.0
e                                 1.4.5
editdistance                      0.6.2
einops                            0.7.0
et_xmlfile                        2.0.0
exceptiongroup                    1.2.2
fairscale                         0.4.0
fastapi                           0.115.12
ffmpy                             0.5.0
filelock                          3.13.1
fonttools                         4.57.0
frozenlist                        1.5.0
fsspec                            2024.12.0
gmpy2                             2.2.1
gradio                            4.41.0
gradio_client                     1.3.0
h11                               0.14.0
hjson                             3.1.0
httpcore                          1.0.7
httptools                         0.6.4
httpx                             0.28.1
huggingface-hub                   0.30.1
idna                              3.7
importlib_resources               6.5.2
interegular                       0.3.3
Jinja2                            3.1.6
jiter                             0.9.0
joblib                            1.4.2
jsonlines                         4.0.0
jsonschema                        4.23.0
jsonschema-specifications         2024.10.1
kiwisolver                        1.4.7
lark                              1.2.2
llvmlite                          0.43.0
lm-format-enforcer                0.10.3
lxml                              5.3.2
markdown-it-py                    3.0.0
markdown2                         2.4.10
MarkupSafe                        2.1.5
matplotlib                        3.7.4
mdurl                             0.1.2
mkl_fft                           1.3.11
mkl_random                        1.2.8
mkl-service                       2.4.0
modelscope_studio                 0.4.0.9
more-itertools                    10.1.0
mpmath                            1.3.0
msgpack                           1.1.0
multidict                         6.3.2
multiprocess                      0.70.16
nest-asyncio                      1.6.0
networkx                          3.2.1
ninja                             1.11.1.4
nltk                              3.8.1
numba                             0.60.0
numpy                             1.24.4
nvidia-cublas-cu12                12.4.2.65
nvidia-cuda-cupti-cu12            12.4.99
nvidia-cuda-nvrtc-cu12            12.4.99
nvidia-cuda-runtime-cu12          12.4.99
nvidia-cudnn-cu12                 9.1.0.70
nvidia-cufft-cu12                 11.2.0.44
nvidia-curand-cu12                10.3.5.119
nvidia-cusolver-cu12              11.6.0.99
nvidia-cusparse-cu12              12.3.0.142
nvidia-ml-py                      12.570.86
nvidia-nccl-cu12                  2.20.5
nvidia-nvjitlink-cu12             12.4.99
nvidia-nvtx-cu12                  12.4.99
openai                            1.70.0
opencv-python                     4.11.0.86
opencv-python-headless            4.5.5.64
openpyxl                          3.1.2
orjson                            3.10.16
outlines                          0.0.46
packaging                         23.2
pandas                            2.2.3
peft                              0.11.1
Pillow                            10.1.0
pip                               25.0
portalocker                       3.1.1
prometheus_client                 0.21.1
prometheus-fastapi-instrumentator 7.1.0
propcache                         0.3.1
protobuf                          4.25.0
psutil                            7.0.0
py-cpuinfo                        9.0.0
pyairports                        2.1.1
pyarrow                           19.0.1
pycountry                         24.6.1
pydantic                          2.9.2
pydantic_core                     2.23.4
pydub                             0.25.1
Pygments                          2.19.1
pyparsing                         3.2.3
PySocks                           1.7.1
python-dateutil                   2.9.0.post0
python-dotenv                     1.1.0
python-multipart                  0.0.20
pytz                              2025.2
PyYAML                            6.0.2
pyzmq                             26.4.0
ray                               2.44.1
referencing                       0.36.2
regex                             2024.11.6
requests                          2.32.3
rich                              14.0.0
rpds-py                           0.24.0
ruff                              0.11.5
sacrebleu                         2.3.2
safetensors                       0.5.3
scipy                             1.13.1
seaborn                           0.13.0
semantic-version                  2.10.0
sentencepiece                     0.1.99
setuptools                        75.8.0
shellingham                       1.5.4
shortuuid                         1.0.11
six                               1.17.0
sniffio                           1.3.1
socksio                           1.0.0
starlette                         0.46.1
sympy                             1.13.3
tabulate                          0.9.0
tiktoken                          0.9.0
timm                              0.9.10
tokenizers                        0.19.1
tomlkit                           0.12.0
torch                             2.4.0+cu124
torchaudio                        2.4.0
torchvision                       0.19.0+cu124
tqdm                              4.66.1
transformers                      4.44.0
triton                            3.0.0
typer                             0.15.2
typing_extensions                 4.8.0
typing-inspection                 0.4.0
tzdata                            2025.2
ultralytics                       8.3.104
ultralytics-thop                  2.0.14
urllib3                           2.3.0
uvicorn                           0.24.0.post1
uvloop                            0.21.0
vllm                              0.5.4
vllm-flash-attn                   2.6.1
watchfiles                        1.0.4
websockets                        12.0
wheel                             0.45.1
xformers                          0.0.27.post2
xxhash                            3.5.0
yarl                              1.19.0
zipp                              3.21.0
zstandard                         0.23.0

Anything else?

No response
