Closed
Description
Hello, I am trying to use NER-German-Large in an environment without an internet connection. I downloaded the model to a local directory from here, but I cannot initialize it, since the corresponding TransformerWordEmbeddings seem to be missing locally.
How do I find out exactly which files are missing, where can I download them, and how do I load and use them in my script?
Here is the code I am running:
from flair.models import SequenceTagger
tagger = SequenceTagger.load("../my_path/ner-german-large/pytorch_model.bin")
Output without internet connection:
Traceback (most recent call last):
File "my_file.py", line 104, in <module>
tagger = SequenceTagger.load("../my_path/ner-german-large/pytorch_model.bin")
File "/opt/conda/lib/python3.8/site-packages/flair/nn/model.py", line 101, in load
state = torch.load(f, map_location='cpu')
File "/opt/conda/lib/python3.8/site-packages/torch/serialization.py", line 592, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/opt/conda/lib/python3.8/site-packages/torch/serialization.py", line 851, in _load
result = unpickler.load()
File "/opt/conda/lib/python3.8/site-packages/flair/embeddings/token.py", line 1332, in __setstate__
embedding = TransformerWordEmbeddings(
File "/opt/conda/lib/python3.8/site-packages/flair/embeddings/token.py", line 838, in __init__
self.tokenizer: PreTrainedTokenizer = AutoTokenizer.from_pretrained(model, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 445, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1672, in from_pretrained
resolved_vocab_files[file_id] = cached_path(
File "/opt/conda/lib/python3.8/site-packages/transformers/file_utils.py", line 1329, in cached_path
output_path = get_from_cache(
File "/opt/conda/lib/python3.8/site-packages/transformers/file_utils.py", line 1552, in get_from_cache
raise ValueError(
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
Loading the model from the local path works fine when an internet connection is active, but I have a hard time reconstructing the additional downloads that happen automatically.
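To reconstruct those downloads, my current idea is to load the model once on a machine with internet access and then inspect the transformers cache. A rough, untested sketch, assuming the default cache location exposed by transformers.file_utils and that my transformers version stores a .json sidecar with the source URL next to each cached blob:

# Sketch: after loading the tagger once WITH internet access, list the source
# URLs of everything transformers cached during that run.
# Assumption: default_cache_path and the {url, etag} sidecar format apply to my version.
import json
import os
from transformers.file_utils import default_cache_path

for name in sorted(os.listdir(default_cache_path)):
    if not name.endswith(".json"):
        continue  # skip the hashed blobs and .lock files, keep only metadata sidecars
    with open(os.path.join(default_cache_path, name)) as f:
        meta = json.load(f)
    print(meta.get("url"))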
Here is the output I get with an active internet connection (it takes just a few seconds):
2021-09-18 14:18:04,138 loading file ./ner-german-large/pytorch_model.bin
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 513/513 [00:00<00:00, 354kB/s]
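The workaround I am considering, sketched below and not yet tested: warm a transformers cache directory on a machine with internet access, copy it to the offline machine, and let transformers resolve everything from that cache. This assumes the tagger's embeddings are based on xlm-roberta-large (as the model card suggests); the cache_dir path is just a placeholder:

# On a machine WITH internet access: pre-download the tokenizer and config of the
# backbone into a dedicated cache directory (assumption: backbone is xlm-roberta-large).
from transformers import AutoConfig, AutoTokenizer

cache_dir = "./hf_cache"  # hypothetical folder, copied to the offline machine afterwards
AutoTokenizer.from_pretrained("xlm-roberta-large", cache_dir=cache_dir)  # tokenizer files
AutoConfig.from_pretrained("xlm-roberta-large", cache_dir=cache_dir)     # config.json

On the offline machine I would then point the TRANSFORMERS_CACHE environment variable at the copied folder (and, if my transformers version supports it, set TRANSFORMERS_OFFLINE=1) before calling SequenceTagger.load, so that AutoTokenizer.from_pretrained finds the files in the cache instead of trying to open a connection.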
Related #651