Cannot load NER model from local file #2445

Closed
@felixvor

Hello, I am trying to use NER-German-Large in an environment without an internet connection. I downloaded the model to a local directory from here, but I cannot initialize it, since the corresponding TransformerWordEmbeddings seem to be missing locally.

How do I find out exactly which files are missing, where can I download them, and how do I load and use them in my script?

Here is the code I am running:

from flair.models import SequenceTagger
tagger = SequenceTagger.load("../my_path/ner-german-large/pytorch_model.bin")

Output without internet connection:

Traceback (most recent call last):
  File "my_file.py", line 104, in <module>
    tagger = SequenceTagger.load("../my_path/ner-german-large/pytorch_model.bin")
  File "/opt/conda/lib/python3.8/site-packages/flair/nn/model.py", line 101, in load
    state = torch.load(f, map_location='cpu')
  File "/opt/conda/lib/python3.8/site-packages/torch/serialization.py", line 592, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/opt/conda/lib/python3.8/site-packages/torch/serialization.py", line 851, in _load
    result = unpickler.load()
  File "/opt/conda/lib/python3.8/site-packages/flair/embeddings/token.py", line 1332, in __setstate__
    embedding = TransformerWordEmbeddings(
  File "/opt/conda/lib/python3.8/site-packages/flair/embeddings/token.py", line 838, in __init__
    self.tokenizer: PreTrainedTokenizer = AutoTokenizer.from_pretrained(model, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 445, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1672, in from_pretrained
    resolved_vocab_files[file_id] = cached_path(
  File "/opt/conda/lib/python3.8/site-packages/transformers/file_utils.py", line 1329, in cached_path
    output_path = get_from_cache(
  File "/opt/conda/lib/python3.8/site-packages/transformers/file_utils.py", line 1552, in get_from_cache
    raise ValueError(
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
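
The traceback shows the failure happening inside `AutoTokenizer.from_pretrained`, which looks for the tokenizer files in the local transformers cache and raises when they are neither cached nor downloadable. One possible workaround is sketched below. It assumes that the transformers cache directory from a connected machine has been copied to the offline machine, and that the installed transformers version honors the `TRANSFORMERS_CACHE` and `TRANSFORMERS_OFFLINE` environment variables; I have not confirmed this for the exact version in the traceback. The helper names `configure_offline` and `load_tagger_offline` are hypothetical.

```python
import os
from pathlib import Path


def configure_offline(cache_dir):
    """Point transformers at a pre-populated cache and disable network lookups.

    Assumption: the installed transformers version reads these variables
    when it is first imported, so call this before importing flair.
    """
    cache_dir = Path(cache_dir).expanduser().resolve()
    if not cache_dir.is_dir():
        raise FileNotFoundError(f"cache directory not found: {cache_dir}")
    os.environ["TRANSFORMERS_CACHE"] = str(cache_dir)
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    return cache_dir


def load_tagger_offline(model_path, cache_dir):
    """Load a flair SequenceTagger using only files already on disk."""
    configure_offline(cache_dir)
    # Imported lazily so the environment variables above are set first.
    from flair.models import SequenceTagger

    return SequenceTagger.load(model_path)
```

It could be called as `load_tagger_offline("../my_path/ner-german-large/pytorch_model.bin", "../my_path/transformers_cache")`, where the second path is only a placeholder for wherever the copied cache lives.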

Loading the model from the local path works fine when the internet connection is active, but I am having a hard time reconstructing the additional downloads that happen automatically.
Here is the output I get with an active internet connection (it takes just a few seconds):

2021-09-18 14:18:04,138 loading file ./ner-german-large/pytorch_model.bin
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 513/513 [00:00<00:00, 354kB/s]
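
One way to find out which files the automatic download adds is to compare the contents of the transformers cache directory before and after loading the model on a connected machine. The following is a minimal sketch using only the standard library. The function names `snapshot` and `new_files` are hypothetical, and the cache location mentioned in the comments (`~/.cache/huggingface/transformers`) is an assumption that may differ between transformers versions and configurations.

```python
from pathlib import Path


def snapshot(cache_dir):
    """Return the set of file paths (relative to cache_dir) currently present."""
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return set()
    return {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()}


def new_files(before, after):
    """Files that appeared between two snapshots, sorted for stable output."""
    return sorted(after - before)


# Intended use on a machine WITH internet access (cache path is an assumption):
#   cache = "~/.cache/huggingface/transformers"
#   before = snapshot(cache)
#   from flair.models import SequenceTagger
#   SequenceTagger.load("./ner-german-large/pytorch_model.bin")
#   print(new_files(before, snapshot(cache)))
```

The files listed this way are the candidates to copy into the same cache location on the offline machine. If nothing new shows up, the files were probably cached already, and clearing or relocating the cache before the run would be needed to see them.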

Related #651


Labels: wontfix (This will not be worked on)
