Discussion:
[gensim:11786] AttributeError when loading GoogleNews-vectors with
h***@diqa-pm.com
2018-11-16 09:37:01 UTC
Hi there,

thank you for the superb gensim-module which we find extremely helpful for
our text analytics projects!

We stumbled upon an issue when we load the pre-trained google news word
vectors:

we are unable to add random word vectors to the map because the
"add"-method is not provided.

Find my code snippet below. Thanks again.
Daniel

# -*- coding: utf-8 -*-
"""
Created on Fri Nov 16 10:46:58 2018

@author: hansch
"""
import pysvn
import numpy as np
import gensim

# Assuming that GoogleNews-vectors-negative300.bin is extracted to the local disk drive:
# URL: https://github.com/mmihaltz/word2vec-GoogleNews-vectors/blob/master/GoogleNews-vectors-negative300.bin.gz
# local path: D:/training-data/word2vec/GoogleNews-vectors-negative300.bin
googleNewsModel = gensim.models.KeyedVectors.load_word2vec_format(
    'D:/training-data/word2vec/GoogleNews-vectors-negative300.bin',
    binary=True)  # type: gensim.models.keyedvectors.Word2VecKeyedVectors

# Add the token and a random word vector to the Google News word vectors.
# The token and the vector are each wrapped in a one-element list;
# list('unknown word') would split the token into single characters.
googleNewsModel.add(['unknown word'], [np.random.random(300)])

#returns error:
# AttributeError: 'Word2VecKeyedVectors' object has no attribute 'add'
--
You received this message because you are subscribed to the Google Groups "Gensim" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gensim+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Gordon Mohr
2018-11-16 19:13:21 UTC
`KeyedVectors` doesn't support dynamic addition/removal of individual
words/word-vectors.

If you need that, you'd have to consult the source and write your own code
to support it. (Making individual additions would be slow and awkward given
its current implementation, but adding words in large batches could make
sense.)
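
To illustrate why batches are the better fit, here is a minimal sketch. It
does not use gensim's own classes: `TinyVectorStore` and `add_batch` are
hypothetical names made up for this example, and the assumption is only that
the vectors live in one contiguous NumPy array, so every growth step copies
the whole array. Adding many words in one call pays that copy once.

```python
import numpy as np


class TinyVectorStore:
    """Hypothetical stand-in for a word-vector table (not gensim's class)."""

    def __init__(self, words, vectors):
        self.words = list(words)
        self.vectors = np.asarray(vectors, dtype=np.float32)
        self.index = {word: i for i, word in enumerate(self.words)}

    def add_batch(self, new_words, new_vectors):
        """Append several words and their vectors with a single array copy."""
        new_vectors = np.asarray(new_vectors, dtype=np.float32)
        if new_vectors.shape != (len(new_words), self.vectors.shape[1]):
            raise ValueError("need one row per new word with matching dimensionality")
        if len(set(new_words)) != len(new_words):
            raise ValueError("new_words contains duplicates")
        already_present = [w for w in new_words if w in self.index]
        if already_present:
            raise ValueError(f"words already present: {already_present}")
        # The whole matrix is reallocated and copied here, once per batch.
        self.vectors = np.vstack([self.vectors, new_vectors])
        for word in new_words:
            self.index[word] = len(self.words)
            self.words.append(word)


# Toy data: two known words with 3-dimensional vectors.
store = TinyVectorStore(["king", "queen"], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
rng = np.random.default_rng(0)
store.add_batch(["unk_a", "unk_b"], rng.random((2, 3)))
```

Calling `add_batch` once per word would work too, but each call would copy
the full matrix again, which is the slow and awkward part with a vocabulary
the size of GoogleNews.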

If you did have a specific batch of words to add, it might be easiest to
just write the existing vectors to a relatively easy-to-handle disk format,
such as via `.save_word2vec_format(filename, binary=False)`, then edit that
file on disk to include your extra words/word-vectors - by both appending
extra lines with the proper values, and changing the leading declaration of
the number of items included. Then re-load the full set.
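
The on-disk edit could be sketched roughly as follows. This assumes the text
format written with `binary=False` has a first line of the form
"<number of vectors> <vector size>" followed by one "word v1 v2 ..." line per
entry, and that the file is UTF-8. The helper name `append_to_word2vec_text`
and the toy file contents are made up for illustration.

```python
import os
import random
import tempfile


def append_to_word2vec_text(src_path, dst_path, new_vectors):
    """Copy a word2vec text file to dst_path, appending extra word vectors.

    new_vectors maps each new word to a sequence of floats. Words must not
    contain whitespace, because the format separates fields with spaces.
    """
    with open(src_path, encoding="utf-8") as src:
        # Leading declaration: "<number of vectors> <vector size>"
        count, dim = (int(field) for field in src.readline().split())
        for word, vector in new_vectors.items():
            if len(word.split()) != 1:
                raise ValueError(f"word must not contain whitespace: {word!r}")
            if len(vector) != dim:
                raise ValueError(f"{word!r}: expected {dim} values, got {len(vector)}")
        with open(dst_path, "w", encoding="utf-8") as dst:
            # Write the header with the updated number of items.
            dst.write(f"{count + len(new_vectors)} {dim}\n")
            # Copy the existing entries unchanged.
            for line in src:
                dst.write(line if line.endswith("\n") else line + "\n")
            # Append the extra words and their vectors.
            for word, vector in new_vectors.items():
                values = " ".join(repr(float(x)) for x in vector)
                dst.write(f"{word} {values}\n")


# Tiny demonstration with a 2-word, 3-dimensional file.
tmp_dir = tempfile.mkdtemp()
src_file = os.path.join(tmp_dir, "vectors.txt")
dst_file = os.path.join(tmp_dir, "vectors-extended.txt")
with open(src_file, "w", encoding="utf-8") as f:
    f.write("2 3\nking 0.1 0.2 0.3\nqueen 0.4 0.5 0.6\n")

random_vector = [random.random() for _ in range(3)]
append_to_word2vec_text(src_file, dst_file, {"unknown_word": random_vector})

with open(dst_file, encoding="utf-8") as f:
    extended_lines = f.read().splitlines()
```

The extended file would then be re-loaded with
`KeyedVectors.load_word2vec_format(dst_file, binary=False)`. Note that a token
such as 'unknown word' contains a space and would not survive this format as
a single entry, hence the underscore in the example.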

- Gordon