[BUG] LoRA fine-tuned model has no effect after loading #1130

Closed
@jackaihfia2334

Description

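I fine-tuned chatglm2-6b with the ChatGLM-Efficient-Tuning project and used that project's export method to produce the merged model, chapi.
I then updated the corresponding entries in config and loaded the model, which produced the warning below. Loading succeeded, but the fine-tuning had no effect. By contrast, the fine-tuned model works as expected when called through ChatGLM-Efficient-Tuning.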

————————————————————————————————————————
warner(
Some weights of the model checkpoint at /data2/model/chapi were not used when initializing ChatGLMForConditionalGeneration: ['lm_head.weight']

  • This IS expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    2023-08-16 14:24:55 | INFO | model_worker | Register to controller
    2023-08-16 14:24:56 | INFO | controller | Register a new worker: http://127.0.0.1:20002
    2023-08-16 14:24:56 | INFO | controller | Register done: http://127.0.0.1:20002, {'model_names': ['chapi'], 'speed': 1, 'queue_length': 0}
    2023-08-16 14:24:56 | INFO | stdout | INFO: 127.0.0.1:52242 - "POST /register_worker HTTP/1.1" 200 OK
    2023-08-16 14:24:56 | ERROR | stderr | INFO: Started server process [1426]
    2023-08-16 14:24:56 | ERROR | stderr | INFO: Waiting for application startup.

————————————————————————————————————————————
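The fine-tuning was on self_cognition, so that the model gives its own name as 查派 (Chapi).
Loading the model directly through AutoModel prints the same warning, yet it produces the expected fine-tuned output: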
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("/data2/model/chapi", trust_remote_code=True)
    model = AutoModel.from_pretrained("/data2/model/chapi", trust_remote_code=True).half().cuda()
    model = model.eval()
    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)

Output: Hello! I am 查派 (Chapi), developed by xxx, designed to provide users with intelligent answers and support.
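
To narrow down the 'lm_head.weight' warning above, one option is to compare the parameter names stored in the checkpoint with the names the model class expects. The following is only a minimal sketch under stated assumptions: the helper name diff_state_dict_keys and the example key names are hypothetical and were not taken from the actual chapi checkpoint. In practice the two lists would come from the loaded checkpoint's keys and from model.state_dict().keys().

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Report which checkpoint keys the model ignores and which model keys are absent.

    Hypothetical helper for illustration; it only compares names, not tensor values.
    """
    ckpt = set(checkpoint_keys)
    model = set(model_keys)
    return {
        "unused": sorted(ckpt - model),   # present in checkpoint, not used by the model
        "missing": sorted(model - ckpt),  # expected by the model, absent from checkpoint
    }


# Hypothetical key names, chosen only to mirror the shape of the warning.
checkpoint_keys = [
    "transformer.embedding.word_embeddings.weight",
    "transformer.output_layer.weight",
    "lm_head.weight",
]
model_keys = [
    "transformer.embedding.word_embeddings.weight",
    "transformer.output_layer.weight",
]

report = diff_state_dict_keys(checkpoint_keys, model_keys)
print(report)
```

A key listed under "unused" is dropped at load time. Whether that matters depends on whether the same values are also stored under a name the model does use, which this name-only comparison cannot tell.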

Labels

bug (Something isn't working)
