Sigmoid-table behavior in FastText, etc code is fishy #2725

Open
@gojomo

Description

Our implementation of FastText training error-backpropagation does some fishy things that deviate from the FB reference implementation.

For example, at..

https://github.com/RaRe-Technologies/gensim/blob/fbc7d0952f1461fb5de3f6423318ae33d87524e3/gensim/models/fasttext_inner.pyx#L338

...we simply skip ahead to the next loop iteration, applying no update at all, whenever an exponent falls outside the desired range. (The same approach appears in the Word2Vec and Doc2Vec Cython code as well.)

However, the seemingly-analogous code in Facebook's FastText instead clips the values to 0.0/1.0 in these cases, allowing backprop to proceed. See:

https://github.com/facebookresearch/fastText/blob/26bcbfc6b288396bd189691768b8c29086c0dab7/src/loss.cc#L52
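To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not a copy of either codebase: the function names, the `alpha` learning-rate parameter, and the use of an exact sigmoid in place of a lookup table are all assumptions made for illustration.

```python
import math

MAX_EXP = 6.0  # assumed bound, using the word2vec-style value discussed in this issue


def sigmoid(x):
    """Exact logistic function, standing in for a table lookup."""
    return 1.0 / (1.0 + math.exp(-x))


def gradient_skip(f, label, alpha):
    """Skip-style: signal 'no update' (None) when f is out of range."""
    if f <= -MAX_EXP or f >= MAX_EXP:
        return None  # the caller would `continue` here
    return (label - sigmoid(f)) * alpha


def gradient_clip(f, label, alpha):
    """Clip-style: saturate the sigmoid to 0.0 / 1.0 and still return a gradient."""
    if f < -MAX_EXP:
        s = 0.0
    elif f > MAX_EXP:
        s = 1.0
    else:
        s = sigmoid(f)
    return (label - s) * alpha
```

With these hypothetical helpers, a confidently wrong prediction such as `f = 10.0` with `label = 0` receives no correction under `gradient_skip`, but the full `-alpha` correction under `gradient_clip`. A confidently right prediction gets a zero gradient under clipping, so the two strategies only differ materially in the wrong-and-saturated case.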

Our deviation from Facebook's practice is suspicious on both correctness & consistency grounds. This simple `continue` does, however, match the behavior we copied long ago from word2vec.c.

Other, perhaps more superficial, differences are that FB's code makes its lookup table 512 slots long instead of 1000, but allows exponents up to ±8 instead of ±6:

https://github.com/facebookresearch/fastText/blob/51e6738d734286251b6ad02e4fdbbcfe5b679382/src/loss.cc#L16
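As a rough way to compare the two parameter choices, the sketch below builds a precomputed sigmoid table for any size/bound pair. The helper names and the truncating index calculation are assumptions for illustration, not a transcription of either implementation.

```python
import math


def build_sigmoid_table(table_size, max_exp):
    """Precompute sigmoid at table_size + 1 evenly spaced points on [-max_exp, max_exp]."""
    step = 2.0 * max_exp / table_size
    return [1.0 / (1.0 + math.exp(-(i * step - max_exp)))
            for i in range(table_size + 1)]


def table_sigmoid(table, table_size, max_exp, x):
    """Look up an approximate sigmoid(x), saturating outside the table's range."""
    if x < -max_exp:
        return 0.0
    if x > max_exp:
        return 1.0
    # Truncate to the nearest grid point at or below x.
    i = int((x + max_exp) * table_size / (2.0 * max_exp))
    return table[i]


# The two parameter sets mentioned in this issue.
w2v_table = build_sigmoid_table(1000, 6.0)  # word2vec-style: 1000 slots, bound 6
fb_table = build_sigmoid_table(512, 8.0)    # FB-style: 512 slots, bound 8
```

Under these assumptions, the 1000-slot table has a finer grid (step 0.012 versus 0.03125) but a narrower range. At the bound, the true sigmoid is about 0.9975 for 6 and about 0.9997 for 8, so the wider range leaves a smaller jump when values saturate to 0.0/1.0.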

Again, our FT implementation seems to have copied our copy-of-word2vec.c choices, instead of the reference FB implementation choices. If anything, it could make more sense to update the word2vec-derived code with these newer choices – as they at least plausibly represent practices improved by experience.
