`bucket` argument in FastText not working as expected? #1765

Hi, this is about the native FastText implementation in gensim.

My understanding of the hashing trick is that if `bucket` is smaller than the total number of subwords (character n-grams), collisions must occur and some subwords will be mapped to the same integer. Am I wrong?
However, that is not what I see on a toy example:

```python
from gensim.models.fasttext import FastText

# Toy corpus: two sentences, four words in total
sent = [['lol', 'dds', 'sdsf'], ['anticonsti']]

# Request only 20 hash buckets for the character n-grams
model = FastText(min_count=1, bucket=20)
model.build_vocab(sentences=sent)
model.train(sentences=sent, epochs=1, report_delay=1.0)

# Inspect the n-gram -> integer index mapping
model.wv.ngrams
```
#### Expected Results
A dictionary mapping each n-gram to an integer between 0 and 19 (`bucket = 20`).
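
To make the expectation concrete, here is a minimal pure-Python sketch of the hashing trick as I understand it. It is not gensim's code: the helper names `char_ngrams`, `fnv1a_32` and `bucket_ids` are hypothetical, and both the FNV-1a hash and the n-gram lengths 3 to 6 are assumptions chosen for illustration.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word` wrapped in '<' and '>'.

    The boundary markers and the 3..6 length range are assumptions.
    """
    wrapped = "<" + word + ">"
    return [
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def fnv1a_32(text):
    """32-bit FNV-1a hash of the UTF-8 bytes of `text` (assumed hash function)."""
    h = 2166136261  # FNV offset basis
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h


def bucket_ids(sentences, bucket):
    """Map every n-gram in the corpus to an index in [0, bucket)."""
    ids = {}
    for sentence in sentences:
        for word in sentence:
            for ngram in char_ngrams(word):
                ids[ngram] = fnv1a_32(ngram) % bucket
    return ids


sent = [['lol', 'dds', 'sdsf'], ['anticonsti']]
ids = bucket_ids(sent, bucket=20)

# Number of distinct n-grams versus number of distinct indices actually used
print(len(ids), "n-grams ->", len(set(ids.values())), "distinct indices")
print("index range:", min(ids.values()), "to", max(ids.values()))
```

With more n-grams than buckets, the modulo step forces several n-grams to share an index, which is the behaviour I expected from `model.wv.ngrams`.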

#### Actual Results
A dictionary mapping each n-gram to an integer between 0 and 55 (there are 56 n-grams here).

#### Versions
>>> import platform; print(platform.platform())
Windows-10-10.0.14393-SP0
>>> import sys; print("Python", sys.version)
Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul  5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.13.3
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.0.0
>>> import gensim; print("gensim", gensim.__version__)
gensim 3.1.0
>>> from gensim.models import word2vec;print("FAST_VERSION", word2vec.FAST_VERSION)
FAST_VERSION 0




