Discussion:
[gensim:11891] Re: Bug report: rounding error in gensim.models.word2vec.py
Gordon Mohr
2018-12-10 18:12:10 UTC
Thanks for the report! This was observed before, and a fix was applied that
we thought resolved the issue (it solved the one available test case).
See: https://github.com/RaRe-Technologies/gensim/issues/865

Can you say more about the OS and versions of Python & gensim in which you
saw this? How large is your vocabulary, and if requested could you share
the raw sequence of word tallies so someone could reproduce the error
elsewhere?

- Gordon
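
One way to gather those details, as a rough sketch rather than an official gensim helper. The function names environment_report and word_tallies are hypothetical, and word_tallies assumes the model exposes an index2word list and a vocab mapping whose entries have a .count attribute, as the loop quoted in this thread suggests.

```python
import platform
import sys
from importlib import metadata


def environment_report():
    """Collect OS, Python, and gensim version strings for a bug report."""
    try:
        # Reads the installed package metadata without importing gensim.
        gensim_version = metadata.version("gensim")
    except metadata.PackageNotFoundError:
        gensim_version = "not installed"
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "gensim": gensim_version,
    }


def word_tallies(index2word, vocab):
    """Return raw word counts in index order (hypothetical helper).

    Assumes vocab[word].count holds the tally, as in the quoted loop.
    """
    return [vocab[word].count for word in index2word]


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(key, value)
```

The list returned by word_tallies can be written to a text file, one count per line, and attached to the report.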
cumulative += self.vocab[self.index2word[word_index]].count**power / train_words_pow
self.cum_table[word_index] = round(cumulative * domain)
assert self.cum_table[-1] == domain
After the loop, cumulative is supposed to be 1. However, if vocab_size is
big enough (or something like that), floating point arithmetic leaves it
slightly off, so the last table entry no longer matches domain and the
assertion fails. In my case, self.cum_table[-1] equals 2147483646 and domain
equals 2147483647.
Best regards, Matej
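
To help check whether accumulated rounding is really the cause, here is a minimal sketch that replays the loop above on a plain list of counts and compares the running sum against math.fsum. The function name cum_table_drift is hypothetical, power=0.75 is an assumed default, and domain is the value reported above. This is a pure-Python replay, not gensim's own code.

```python
import math


def cum_table_drift(counts, power=0.75, domain=2**31 - 1):
    """Replay the quoted loop on raw counts and report the final drift."""
    # Normalising constant, computed the same way for both sums below.
    train_words_pow = float(sum(c ** power for c in counts))
    cumulative = 0.0
    last_entry = 0
    for c in counts:
        # Naive running sum: each addition can lose a little precision.
        cumulative += c ** power / train_words_pow
        last_entry = round(cumulative * domain)
    # math.fsum tracks the partial sums exactly and rounds once at the end,
    # so it is a reference for where the running sum should have ended.
    reference = math.fsum(c ** power / train_words_pow for c in counts)
    return {
        "cumulative": cumulative,
        "reference": reference,
        "last_entry": last_entry,
        "shortfall": domain - last_entry,  # non-zero means the assert fails
    }


if __name__ == "__main__":
    print(cum_table_drift([5, 3, 2, 1]))
```

Feeding in the real word tallies, in index order, should show whether the shortfall of 1 reproduces outside gensim. If the model stores counts as NumPy types, the result may differ from this replay.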