save_facebook_model() - AssertionError #2853

Closed
@imendibo

Description

Problem description

I am trying to save a trained fastText model using the new save_facebook_model function.
I was unable to do so because an AssertionError is raised at this line:
assert vocab_n == len(model.wv.vocab)
My model's vocabulary has 2000264 entries:

len(model.wv.vocab)
2000264

I tried the same call on a model with a vocabulary of 4500 words and it worked, so I suspect there is a size limitation. However, the error message gives no indication of that.

Steps/code/corpus to reproduce

from gensim.models.fasttext import save_facebook_model

# `model` is the already-trained gensim FastText model (len(model.wv.vocab) == 2000264)
save_facebook_model(model, 'own_fasttext_model_pretrained.bin')

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-201-0a3c1c458b74> in <module>
      2 from gensim.models.fasttext import load_facebook_model, load_facebook_vectors,save_facebook_model
      3 
----> 4 save_facebook_model(model,'own_fasttext_model_pretrained.bin')

/opt/conda/lib/python3.7/site-packages/gensim/models/fasttext.py in save_facebook_model(model, path, encoding, lr_update_rate, word_ngrams)
   1334     """
   1335     fb_fasttext_parameters = {"lr_update_rate": lr_update_rate, "word_ngrams": word_ngrams}
-> 1336     gensim.models._fasttext_bin.save(model, path, fb_fasttext_parameters, encoding)

/opt/conda/lib/python3.7/site-packages/gensim/models/_fasttext_bin.py in save(model, fout, fb_fasttext_parameters, encoding)
    666     if isinstance(fout, str):
    667         with open(fout, "wb") as fout_stream:
--> 668             _save_to_stream(model, fout_stream, fb_fasttext_parameters, encoding)
    669     else:
    670         _save_to_stream(model, fout, fb_fasttext_parameters, encoding)

/opt/conda/lib/python3.7/site-packages/gensim/models/_fasttext_bin.py in _save_to_stream(model, fout, fb_fasttext_parameters, encoding)
    629 
    630     # Save words and ngrams vectors
--> 631     _input_save(fout, model)
    632     fout.write(struct.pack('@?', False))  # Save 'quot_', which is False for unsupervised models
    633 

/opt/conda/lib/python3.7/site-packages/gensim/models/_fasttext_bin.py in _input_save(fout, model)
    573 
    574     assert vocab_dim == ngrams_dim
--> 575     assert vocab_n == len(model.wv.vocab)
    576     assert ngrams_n == model.wv.bucket
    577 

AssertionError: 
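Since the assertion carries no message, a small helper that prints the compared values may make the mismatch easier to diagnose. The sketch below is only an illustration: `check_fasttext_shapes` is a hypothetical name, and it assumes that `vocab_n`, `vocab_dim`, `ngrams_n` and `ngrams_dim` are the shapes of `model.wv.vectors_vocab` and `model.wv.vectors_ngrams`, which the traceback does not show.

```python
def check_fasttext_shapes(wv):
    """Return messages for each check from the traceback that would fail.

    Assumption (not visible in the traceback): vocab_n/vocab_dim and
    ngrams_n/ngrams_dim come from wv.vectors_vocab.shape and
    wv.vectors_ngrams.shape respectively.
    """
    vocab_n, vocab_dim = wv.vectors_vocab.shape
    ngrams_n, ngrams_dim = wv.vectors_ngrams.shape

    problems = []
    # Mirrors: assert vocab_dim == ngrams_dim
    if vocab_dim != ngrams_dim:
        problems.append(
            "vector sizes differ: vocab_dim=%d, ngrams_dim=%d" % (vocab_dim, ngrams_dim)
        )
    # Mirrors: assert vocab_n == len(model.wv.vocab)
    if vocab_n != len(wv.vocab):
        problems.append(
            "vocab rows differ: vocab_n=%d, len(vocab)=%d" % (vocab_n, len(wv.vocab))
        )
    # Mirrors: assert ngrams_n == model.wv.bucket
    if ngrams_n != wv.bucket:
        problems.append(
            "ngram rows differ: ngrams_n=%d, bucket=%d" % (ngrams_n, wv.bucket)
        )
    return problems
```

Calling `check_fasttext_shapes(model.wv)` before `save_facebook_model` should return an empty list when all three checks would pass. Otherwise it lists the actual numbers behind the failing assertion.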

Versions

Python 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:33:48) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Linux-4.9.0-12-amd64-x86_64-with-debian-9.12
Python 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:33:48) 
[GCC 7.3.0]
NumPy 1.18.1
SciPy 1.4.1
gensim 3.8.3
FAST_VERSION 0

Labels: bug
