
Loading word2vec model cannot be done with a reasonable memory capacity #239

Closed · SMMousaviSP opened this issue Aug 21, 2021 · 2 comments

@SMMousaviSP

Hi,
I wanted to do augmentation based on word2vec similarity, so I downloaded the word2vec model as described in the README:

from nlpaug.util.file.download import DownloadUtil

DownloadUtil.download_word2vec(dest_dir='.') # Download word2vec model

A zip file was downloaded and I extracted it. When I tried to load the model with the code below, it took very long and then crashed because it ran out of memory. I also tried this on Google Colab, which gives me 12 GB of RAM, but it failed for the same reason.

import nlpaug.augmenter.word as naw

text = "Sample text to test augmentation"
aug = naw.WordEmbsAug(
    model_type='word2vec', model_path='GoogleNews-vectors-negative300.bin',
    action="substitute")
augmented_text = aug.augment(text)
print("Original:")
print(text)
print("Augmented Text:")
print(augmented_text)

The .bin file is 3.5 GB, so why does loading fail even with 12 GB of memory?
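For context, a rough back-of-the-envelope of the raw in-memory footprint (assuming the standard GoogleNews model of roughly 3 million vocabulary entries with 300-dimensional float32 vectors):

# Rough estimate of the raw vector storage for GoogleNews-vectors-negative300,
# assuming ~3,000,000 vocabulary entries, 300 dimensions, 4-byte floats.
vocab_size = 3_000_000
dims = 300
bytes_per_float = 4
raw_gib = vocab_size * dims * bytes_per_float / 1024 ** 3
print(f"~{raw_gib:.1f} GiB for the vectors alone")  # ~3.4 GiB

So the vectors themselves should fit well within 12 GB; the extra memory seems to be consumed by the loading process itself.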

@makcedward (Owner)

Switched to the gensim package for loading model files, which improved loading speed and memory consumption. You may retry with the latest dev version (pip install gensim git+https://github.com/makcedward/nlpaug.git).
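For reference, a minimal sketch of loading the vectors with gensim directly (the limit argument is optional and shown only as one way to bound memory; it is an illustration, not necessarily what nlpaug does internally):

from gensim.models import KeyedVectors

# Load the pretrained GoogleNews vectors with gensim's binary reader.
# `limit` optionally caps how many vectors are read, which bounds memory use;
# drop it to load the full vocabulary.
wv = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True, limit=500_000)
print(wv.most_similar('augmentation', topn=5))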

@makcedward (Owner)

Enhanced in version 1.1.8.
