
Deutsch to English Translation Model by Google doesn't work anymore... #7761

Closed
avacaondata opened this issue Oct 13, 2020 · 9 comments
Comments

@avacaondata

Hi, the model at https://huggingface.co/google/bert2bert_L-24_wmt_de_en doesn't work anymore. The library seems to have changed a lot since the model was added, so the relevant classes appear to have been renamed, among other changes.
Can anyone tell me how I could use it with the current library functionality?
Thanks in advance! :)

@thomwolf
Member

Could you post the full error message (and the information asked for in the issue template)?

@LysandreJik
Member

These models were only added in recent months, so they shouldn't have changed much. A fully filled-in issue template would be very helpful here!

@avacaondata
Author

avacaondata commented Oct 14, 2020

This is the error:
ValueError: Unrecognized model identifier: bert-generation. Should contain one of retribert, t5, mobilebert, distilbert, albert, camembert, xlm-roberta, pegasus, marian, mbart, bart, reformer, longformer, roberta, flaubert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm, ctrl, electra, encoder-decoder

I am using exactly the code shown at the link I posted:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/bert2bert_L-24_wmt_de_en", pad_token="<pad>", eos_token="</s>", bos_token="<s>")
model = AutoModelForSeq2SeqLM.from_pretrained("google/bert2bert_L-24_wmt_de_en")

sentence = "Willst du einen Kaffee trinken gehen mit mir?"

input_ids = tokenizer(sentence, return_tensors="pt", add_special_tokens=False).input_ids
output_ids = model.generate(input_ids)[0]
print(tokenizer.decode(output_ids, skip_special_tokens=True))

Transformers version: 3.1.0

@LysandreJik
Member

@patrickvonplaten what should be done here? The BertGeneration model cannot be loaded directly through the AutoModelForSeq2SeqLM auto-model, can it?

@avacaondata
Author

Then how could I load it?

@patrickvonplaten
Contributor

Hey @alexvaca0 - the google/encoder-decoder models were released in transformers 3.2.0, so you will have to update your transformers version :-) It should then work as expected.
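The fix above boils down to a version check. As a minimal sketch (the "3.2.0" minimum is taken from the comment above; the helper names are hypothetical, not part of transformers), you could guard the model load like this:

```python
# Sketch of a pre-flight version check. The "3.2.0" minimum comes from
# the comment above; the helper names are our own illustration.
def version_tuple(version: str) -> tuple:
    """Turn a dotted version string like "3.1.0" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def meets_minimum(installed: str, required: str = "3.2.0") -> bool:
    """True if the installed transformers version is at least `required`."""
    return version_tuple(installed) >= version_tuple(required)

print(meets_minimum("3.1.0"))  # the reporter's version -> False
print(meets_minimum("3.2.0"))  # the release that added these models -> True
```

Note that this naive parser would choke on pre-release suffixes like "3.2.0rc1"; a production check would use packaging.version instead.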

@avacaondata
Author

Ohhh so sorry, my bad :( Thanks a lot for the quick response! :)

@ezekielbarnett

ezekielbarnett commented Dec 4, 2020

I think this may not have been fully resolved? I'm getting a similar error:

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/transformers/modeling_utils.py", line 926, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 527, in load
    with _open_zipfile_reader(f) as opened_zipfile:
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 224, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /opt/conda/conda-bld/pytorch_1579022060824/work/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /opt/conda/conda-bld/pytorch_1579022060824/work/caffe2/serialize/inline_container.cc:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7fd86f9d2627 in /home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: caffe2::serialize::PyTorchStreamReader::init() + 0x1f5b (0x7fd82fbbb9ab in /home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x64 (0x7fd82fbbcbc4 in /home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x6d2146 (0x7fd87067e146 in /home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x28ba06 (0x7fd870237a06 in /home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #37: __libc_start_main + 0xe7 (0x7fd87474cb97 in /lib/x86_64-linux-gnu/libc.so.6)


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "wmt_test.py", line 26, in <module>
    model = AutoModelForSeq2SeqLM.from_pretrained("google/bert2bert_L-24_wmt_de_en").to(device)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/transformers/modeling_auto.py", line 1073, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/transformers/modeling_utils.py", line 929, in from_pretrained
    "Unable to load weights from pytorch checkpoint file. "
OSError: Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

Python 3.7.6, torch==1.4.0, transformers==3.2.0
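The traceback above points at a serialization-format mismatch rather than a transformers issue: newer versions of torch.save write a zip-based archive whose internal version torch 1.4.0's reader rejects (the error itself says "Your PyTorch installation may be too old"). As a small stdlib-only sketch (the function name is ours, not part of torch), you can tell the two on-disk formats apart before calling torch.load:

```python
import io
import zipfile

def is_zip_checkpoint(data: bytes) -> bool:
    """True if the checkpoint bytes form a zip archive (the format that
    newer torch.save produces); legacy checkpoints are raw pickle data."""
    return zipfile.is_zipfile(io.BytesIO(data))

# A zip archive built in memory stands in for a modern checkpoint file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.pkl", b"payload")
print(is_zip_checkpoint(buf.getvalue()))  # -> True

# Legacy torch pickles start with the pickle protocol marker, not "PK".
print(is_zip_checkpoint(b"\x80\x02..."))  # -> False
```

If the checkpoint is a zip archive and torch.load still fails as above, upgrading torch (rather than re-downloading the checkpoint) is the likely fix.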

@LysandreJik
Member

Hi @ezekielbarnett, could you open a new issue and fill in the issue template? A reproducible code example would be particularly helpful here.
