model loading the checkpoint error #20
But when I print model.embeddings.token_type_embeddings, it is Embedding(16, 768).
Which model are you loading?
The pre-trained model chinese_L-12_H-768_A-12.
My code: … The error: …
I'm testing the Chinese model.
In the 'config.json' of chinese_L-12_H-768_A-12, type_vocab_size=2. But even when I change config.type_vocab_size=16, it still errors.
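A quick way to confirm which type_vocab_size a configuration file actually carries is to parse it directly. This is a minimal stdlib sketch; the config contents below are a hypothetical stand-in showing only the field under discussion, not the full file shipped with chinese_L-12_H-768_A-12:

```python
import json
import tempfile

# Hypothetical stand-in for bert_config.json; only the relevant field
# and one neighbor are shown.
config_text = '{"type_vocab_size": 2, "hidden_size": 768}'

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    f.write(config_text)
    path = f.name

# Read the file back, the way a from_json_file-style helper would.
with open(path) as f:
    config = json.load(f)

# This is the value the model on disk was actually trained with;
# the in-memory config must agree with it.
print(config["type_vocab_size"])
```

Checking the file this way makes it obvious whether the mismatch comes from the file itself or from the config object built in code.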
I changed my code: it still errors.
Yes, but I changed it in my code.
I think it's good.
OK, I have the models, I think.
I have no idea. Did my model conversion go wrong?
I am testing that right now. I haven't played with the multi-lingual models yet.
I am also using it for the first time. I am looking forward to your test results.
When I was converting the model: Traceback (most recent call last): …
Are you supplying a config file with …?
I used the 'bert_config.json' of chinese_L-12_H-768_A-12 when I was converting.
OK, I think I found the issue: your BertConfig is not built from the configuration file for some reason, and thus uses the default value of type_vocab_size. This error happens on my system when I use … I will make sure these two ways of initializing the configuration (from parameters or from a json file) cannot be mixed up.
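The mix-up described above can be sketched with a toy config class (hypothetical names; the real BertConfig in pytorch-pretrained-BERT has more fields): constructing from default parameters silently keeps type_vocab_size at its default, while a from_json_file-style constructor picks up the value stored on disk.

```python
import json

class ToyBertConfig:
    """Toy stand-in for BertConfig, supporting both init styles."""

    def __init__(self, type_vocab_size=2, hidden_size=768):
        # Path 1: built from parameters -- class defaults apply.
        self.type_vocab_size = type_vocab_size
        self.hidden_size = hidden_size

    @classmethod
    def from_json_file(cls, path):
        # Path 2: built from a json file -- values on disk apply.
        with open(path) as f:
            return cls(**json.load(f))

# A config file whose type_vocab_size differs from the class default.
with open("toy_config.json", "w") as f:
    json.dump({"type_vocab_size": 16, "hidden_size": 768}, f)

default_cfg = ToyBertConfig()                        # config file not used
file_cfg = ToyBertConfig.from_json_file("toy_config.json")

print(default_cfg.type_vocab_size)  # default value -> shape mismatch on load
print(file_cfg.type_vocab_size)     # file value -> matches the checkpoint
```

If the code path that builds the model ever falls back to the parameter constructor, the embedding table is allocated with the default size and load_state_dict will then report exactly the size mismatch seen in this issue.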
I have the same problem as you. Did you solve it?
RuntimeError: Error(s) in loading state_dict for BertModel:
size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
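One way to localize such an error before calling load_state_dict is to diff the parameter shapes of checkpoint and model. A minimal sketch, using plain tuples of illustrative shapes in place of real torch.Size values so it needs no torch install:

```python
# Shapes as they would appear in state_dict entries; the parameter
# names are real BertModel names, the word-embedding size is illustrative.
checkpoint_shapes = {
    "embeddings.word_embeddings.weight": (21128, 768),
    "embeddings.token_type_embeddings.weight": (16, 768),
}
model_shapes = {
    "embeddings.word_embeddings.weight": (21128, 768),
    "embeddings.token_type_embeddings.weight": (2, 768),
}

# Collect every parameter whose checkpoint shape disagrees with the
# freshly constructed model.
mismatches = [
    name
    for name, shape in checkpoint_shapes.items()
    if model_shapes.get(name) != shape
]
print(mismatches)
```

Against real objects, the same comparison works by iterating over `state_dict().items()` on both sides; it narrows the problem to the one offending table (here the token-type embeddings) instead of failing inside the load call.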