
[Question]: Can't load the model for 'gpt-cpm-small-cn-distill' #8230

Closed
lao-xu opened this issue Apr 4, 2024 · 11 comments
Labels: question (Further information is requested), stale

Comments


lao-xu commented Apr 4, 2024

Please describe your question

from paddlenlp.transformers import GPTChineseTokenizer, GPTLMHeadModel

model_name = "gpt-cpm-small-cn-distill"

tokenizer = GPTChineseTokenizer.from_pretrained(model_name)
model = GPTLMHeadModel.from_pretrained(model_name)

Why does this raise "OSError: Can't load the model for 'gpt-cpm-small-cn-distill'. If you were trying to load it from 'https://paddlenlp.bj.bcebos.com/'"?

@lao-xu lao-xu added the question Further information is requested label Apr 4, 2024
w5688414 (Contributor) commented:

This has been fixed; alternatively, try rolling back to version 2.6.
#8253

lao-xu (Author) commented Apr 11, 2024

> This has been fixed; alternatively, try rolling back to version 2.6. #8253

It still doesn't work.

import paddle
from paddlenlp.transformers import GPTChineseTokenizer, GPTLMHeadModel

model_name = "gpt-cpm-small-cn-distill"

tokenizer = GPTChineseTokenizer.from_pretrained(model_name)
model = GPTLMHeadModel.from_pretrained(model_name)
model.eval()

inputs = "花间一壶酒,独酌无相亲。举杯邀明月,"
inputs_ids = tokenizer(inputs)["input_ids"]
inputs_ids = paddle.to_tensor(inputs_ids, dtype="int64").unsqueeze(0)

outputs, _ = model.generate(input_ids=inputs_ids, max_length=10, decode_strategy="greedy_search", use_fast=True)

result = tokenizer.convert_ids_to_string(outputs[0].numpy().tolist())

print("Model input:", inputs)
print("Result:", result)


Traceback (most recent call last)
File /opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py:1581, in PretrainedModel._resolve_model_file_path(cls, pretrained_model_name_or_path, from_hf_hub, from_aistudio, cache_dir, subfolder, config, convert_from_torch, use_safetensors, variant)
1579 if pretrained_model_name_or_path in cls.pretrained_init_configuration:
1580 # fetch the weight url from the pretrained_resource_files_map
-> 1581 resource_file_url = cls.pretrained_resource_files_map["model_state"][
1582 pretrained_model_name_or_path
1583 ]
1584 resolved_archive_file = cached_file(
1585 resource_file_url, _add_variant(PADDLE_WEIGHTS_NAME, variant), **cached_file_kwargs
1586 )

KeyError: 'model_state'

During handling of the above exception, another exception occurred:

OSError Traceback (most recent call last)
Cell In[3], line 6
3 model_name = "gpt-cpm-small-cn-distill"
5 tokenizer = GPTChineseTokenizer.from_pretrained(model_name)
----> 6 model = GPTLMHeadModel.from_pretrained(model_name)
7 model.eval()
9 inputs = "花间一壶酒,独酌无相亲。举杯邀明月,"

File /opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py:2116, in PretrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs)
2113 use_keep_in_fp32_modules = False
2115 # resolve model_weight file
-> 2116 resolved_archive_file, sharded_metadata, is_sharded = cls._resolve_model_file_path(
2117 pretrained_model_name_or_path,
2118 cache_dir=cache_dir,
2119 subfolder=subfolder,
2120 from_hf_hub=from_hf_hub,
2121 from_aistudio=from_aistudio,
2122 config=config,
2123 convert_from_torch=convert_from_torch,
2124 use_safetensors=use_safetensors,
2125 variant=variant,
2126 )
2128 # load pt weights early so that we know which dtype to init the model under
2129 if not is_sharded and state_dict is None:
2130 # Time to load the checkpoint

File /opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py:1638, in PretrainedModel._resolve_model_file_path(cls, pretrained_model_name_or_path, from_hf_hub, from_aistudio, cache_dir, subfolder, config, convert_from_torch, use_safetensors, variant)
1636 logger.info(e)
1637 # For any other exception, we throw a generic error.
-> 1638 raise EnvironmentError(
1639 f"Can't load the model for '{pretrained_model_name_or_path}'. If you were trying to load it"
1640 " from 'https://paddlenlp.bj.bcebos.com/'"
1641 )
1643 if is_local:
1644 logger.info(f"Loading weights file {archive_file}")

OSError: Can't load the model for 'gpt-cpm-small-cn-distill'. If you were trying to load it from 'https://paddlenlp.bj.bcebos.com/'
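The root cause visible in the trace is the inner KeyError: `_resolve_model_file_path` indexes `cls.pretrained_resource_files_map["model_state"]`, and when that key is missing the lookup fails before any download is attempted, then gets converted into the generic "Can't load the model" OSError. A minimal self-contained sketch of that lookup pattern (the map contents below are hypothetical, not PaddleNLP's real data):

```python
# Simplified sketch of the failing lookup; only the lookup pattern
# mirrors the traceback, the map itself is made up for illustration.
pretrained_resource_files_map = {
    # note: no "model_state" key registered for this model class
    "model_config": {
        "gpt-cpm-small-cn-distill": "https://paddlenlp.bj.bcebos.com/..."
    }
}

def resolve_weight_url(resource_map, model_name):
    """Look up the weight URL, surfacing a clear OSError when missing."""
    try:
        return resource_map["model_state"][model_name]
    except KeyError:
        # Mirrors the traceback: the inner KeyError is swallowed and
        # re-raised as the generic "Can't load the model" OSError.
        raise OSError(f"Can't load the model for '{model_name}'.")
```

Under this reading, upgrading the library only helps if the new release actually registers the `model_state` entry for this model, which is what the linked fix was about.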

w5688414 (Contributor) commented:

I tested it several times and had no problems:

[screenshot of a successful run]

lao-xu (Author) commented Apr 12, 2024

I'm running it on AI Studio and keep getting this error. Could the cloud environment have something to do with it?

w5688414 (Contributor) commented:

Check your Paddle version; you could also try installing the develop version.
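Checking versions from the same interpreter that runs the failing script rules out a mixed-environment problem (for example, a notebook kernel importing a different PaddleNLP than the one pip upgraded). A minimal sketch, assuming only that the packages expose the conventional `__version__` attribute:

```python
import importlib

def package_version(pkg):
    """Return the installed version string, "unknown" if the package
    has no __version__ attribute, or None if it is not importable."""
    try:
        mod = importlib.import_module(pkg)
        return getattr(mod, "__version__", "unknown")
    except ImportError:
        return None

# Report the packages involved in the failing script.
for pkg in ("paddle", "paddlenlp"):
    v = package_version(pkg)
    print(pkg, v if v is not None else "not installed")
```

If the printed PaddleNLP version differs from what `pip show paddlenlp` reports, the interpreter is picking up a different installation than the one that was upgraded.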

@w5688414 w5688414 assigned w5688414 and lugimzzz and unassigned lugimzzz Apr 13, 2024
yangquanbiubiu commented:
I'm running it on a Kunlunxin R200 and hit the same problem. Have you solved it?

lao-xu (Author) commented Apr 16, 2024

> I'm running it on a Kunlunxin R200 and hit the same problem. Have you solved it?

No. I upgraded PaddleNLP and it still fails; I don't know why.

w5688414 (Contributor) commented:

Try the following commands:

pip uninstall paddlenlp
git clone https://github.com/PaddlePaddle/PaddleNLP.git
cd PaddleNLP
pip install -e .

github-actions bot commented:

This issue is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added stale and removed stale labels Jun 17, 2024
github-actions bot commented:

This issue is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale label Aug 23, 2024
github-actions bot commented Sep 6, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 6, 2024
Projects: none yet
Development: no branches or pull requests
4 participants