e***@gmail.com
2017-03-27 13:43:52 UTC
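from gensim import models
from gensim.models.phrases import Phrases, Phraser
from gensim.models.word2vec import Text8Corpus

sentences = Text8Corpus('/home/prakhar/text8')
phrases = Phrases(Text8Corpus('/home/prakhar/text8'), min_count=1, threshold=2)
bigram = Phraser(phrases)
model = models.word2vec.Word2Vec(bigram[sentences], size=200, workers=4, min_count=1)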
The logger output while running this code:
2017-03-27 18:33:23,366 : INFO : training model with 4 workers on 677776 vocabulary and 200 features, using sg=0 hs=0 sample=0.001 negative=5 window=5
2017-03-27 18:33:24,319 : INFO : expecting 1701 sentences, matching count from corpus used for vocabulary survey
2017-03-27 18:33:25,170 : WARNING : train() called with an empty iterator (if not intended, be sure to provide a corpus that offers restartable iteration = an iterable).
Clearly, this is not desirable, as the results show:
model.wv.most_similar(positive=['woman', 'king'], negative=['man'])
[(u'davies_welsh', 0.3605641722679138),
(u'add_ins', 0.3399544656276703),
(u'kings_landing', 0.3140672445297241),
(u'the_cordillera', 0.30870741605758667),
(u'giant_anteater', 0.30382204055786133),
(u'analog_clocks', 0.30148613452911377),
(u'back_together', 0.30050382018089294),
(u'ionych', 0.2958505153656006),
(u'be_true', 0.29267528653144836),
(u'particle_physicists', 0.2917472720146179)]
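
For reference, here is a minimal sketch of what the warning seems to mean by "restartable iteration". It assumes (I have not confirmed this) that bigram[sentences] hands back a one-shot generator, which the vocabulary scan uses up before train() runs. RestartableCorpus and the toy corpus are hypothetical names made up for illustration; they are not part of gensim.

```python
class RestartableCorpus(object):
    """Hypothetical wrapper: applies `transform` to each sentence and
    starts a fresh pass over `corpus` every time it is iterated."""

    def __init__(self, transform, corpus):
        self.transform = transform
        self.corpus = corpus

    def __iter__(self):
        # A new generator is created on every call, so repeated
        # passes (vocabulary scan, then training) all see the data.
        for sentence in self.corpus:
            yield self.transform(sentence)


# Toy stand-in for a tokenized corpus (not the real text8 data).
toy_corpus = [['new', 'york', 'city'], ['king', 'and', 'queen']]

# A plain generator can only be consumed once.
one_shot = (sentence for sentence in toy_corpus)
first_pass = list(one_shot)
second_pass = list(one_shot)  # nothing left on the second pass

# The wrapper yields the same sentences on every pass.
restartable = RestartableCorpus(lambda s: s, toy_corpus)
pass_one = list(restartable)
pass_two = list(restartable)
```

If that assumption holds, something like RestartableCorpus(lambda s: bigram[s], sentences) might be passed to Word2Vec in place of bigram[sentences], relying on bigram[s] returning a token list for a single sentence.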
--
You received this message because you are subscribed to the Google Groups "gensim" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gensim+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.