dm_concat mode & corpus with '\0' tokens gives error #684
Open
Description
See https://groups.google.com/d/msg/gensim/8r0GOGif56U/KJ4mmQo6KQAJ
The creation of the null word ignores whether '\0' already occurs in the corpus, so it seems to be clobbering necessary info in the case where '\0' does need to be a predicted word.
As the null word is added as a final step, that step can probably notice that such a word already exists, and perhaps log a warning that the same token is playing two roles (whatever it is in the corpus, plus the special plug null_word value).
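
A minimal sketch of the check suggested above. This is not gensim's actual code: `add_null_word`, the dict-based `vocab`, and the `index2word` list are hypothetical stand-ins for whatever structures the real vocabulary-finalization step uses. The idea is only that an existing '\0' entry is left intact and a warning is logged, instead of the entry being overwritten.

```python
import logging

logger = logging.getLogger(__name__)

NULL_WORD = '\0'  # the special padding token


def add_null_word(vocab, index2word, null_word=NULL_WORD):
    """Give null_word a vocabulary slot without clobbering an existing entry.

    vocab and index2word are hypothetical structures used for illustration:
    vocab maps token -> {"index": int, "count": int}, index2word is a list.
    Returns True if the token was already present in the corpus vocabulary.
    """
    if null_word in vocab:
        # Keep the corpus entry as-is; the token now plays two roles.
        logger.warning(
            "token %r already appears in the corpus; it will also be used "
            "as the padding null word",
            null_word,
        )
        return True
    # Token not seen in the corpus: append a fresh slot for it.
    vocab[null_word] = {"index": len(index2word), "count": 1}
    index2word.append(null_word)
    return False


# Corpus that already contains '\0' as an ordinary token.
vocab = {"cat": {"index": 0, "count": 5}, "\0": {"index": 1, "count": 3}}
index2word = ["cat", "\0"]
collided = add_null_word(vocab, index2word)
```

In this sketch the corpus count for '\0' is preserved, so the token can still be a predicted word. Whether sharing one vector between the two roles is acceptable is a separate design question that the warning only surfaces.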