Description
Please excuse me for asking this question here, since it's not really an actual issue with gensim.
TL;DR:
I'd like to know how I can get at the word vectors before they are propagated, in order to apply transformations to them while training paragraph/document vectors.
What I'm trying to do is make a modification to `train_cbow_pair` in `gensim.models.word2vec`. However, I struggle a bit to understand what exactly is happening there.
I get that `l1` is the sum of the word vectors in the current context window plus the sum of the document-tag vectors, and that this is what gets passed to `train_cbow_pair`.
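For context, here is a minimal NumPy sketch of how I understand that sum to be composed (this is my own illustration, not gensim's actual code; the array names, sizes, and indices are all assumptions):

```python
import numpy as np

np.random.seed(0)
layer1_size = 4                                    # vector dimensionality (assumed)
word_vectors = np.random.rand(10, layer1_size)     # stand-in for the word-vector array
doctag_vectors = np.random.rand(2, layer1_size)    # stand-in for the doctag-vector array

word2_indexes = [1, 3, 5]   # indices of the context-window words
doctag_indexes = [0]        # indices of the document tags

# sum the context-word vectors and the doctag vectors into one input vector
l1 = np.sum(word_vectors[word2_indexes], axis=0) \
   + np.sum(doctag_vectors[doctag_indexes], axis=0)

# with cbow_mean=1, the sum is averaged over the number of contributing vectors
count = len(word2_indexes) + len(doctag_indexes)
l1_mean = l1 / count

print(l1.shape)  # (4,)
```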
```python
def train_cbow_pair(model, word, input_word_indices, l1, alpha, learn_vectors=True, learn_hidden=True):
    neu1e = np.zeros(l1.shape)

    if model.hs:
        l2a = model.syn1[word.point]  # 2d matrix, codelen x layer1_size
        fa = 1. / (1. + np.exp(-np.dot(l1, l2a.T)))  # propagate hidden -> output
        ga = (1. - word.code - fa) * alpha  # vector of error gradients multiplied by the learning rate
        if learn_hidden:
            model.syn1[word.point] += np.outer(ga, l1)  # learn hidden -> output
        neu1e += np.dot(ga, l2a)
    # ...
```
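To make the shapes concrete: as far as I can tell, `word.point` holds the indices of the inner (non-leaf) Huffman-tree nodes on the path to `word`, and `word.code` its binary left/right code, so the rows of `model.syn1[word.point]` are tree-node output weights rather than word vectors. Here is a self-contained sketch of that hierarchical-softmax step (all sizes and index values below are made up):

```python
import numpy as np

np.random.seed(0)
layer1_size = 4
inner_nodes = 9                                     # number of inner Huffman-tree nodes (assumed)
syn1 = np.zeros((inner_nodes, layer1_size))         # hidden -> output weights, one row per node

point = np.array([0, 2, 5])    # hypothetical word.point: nodes on the path to the word
code = np.array([1, 0, 1])     # hypothetical word.code: left/right decisions on that path
l1 = np.random.rand(layer1_size)                    # combined input vector
alpha = 0.025

l2a = syn1[point]                                   # (codelen, layer1_size); fancy indexing copies
fa = 1. / (1. + np.exp(-np.dot(l1, l2a.T)))         # (codelen,) activations of the path nodes
ga = (1. - code - fa) * alpha                       # (codelen,) scaled error gradients
syn1[point] += np.outer(ga, l1)                     # update the tree-node weights
neu1e = np.dot(ga, l2a)                             # (layer1_size,) error fed back to the input

print(l2a.shape, fa.shape, neu1e.shape)  # (3, 4) (3,) (4,)
```

Note that `syn1[point]` uses advanced indexing and therefore returns a copy, so `l2a` keeps the pre-update weights even after `syn1[point]` is modified.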
Here I'm not sure what I'm looking at. In particular, I struggle to understand the line `l2a = model.syn1[word.point]`: I don't know what `word.point` describes here, and why this input is getting propagated.
Does this provide the word vectors for activating the hidden layer, which would then appear to be `fa`? But that can't actually be the case, since `word` is just the current word of the context window, if I understand it correctly:
```python
# train_document_dm calling train_cbow_pair
def train_document_dm(model, doc_words, doctag_indexes, alpha, work=None, neu1=None,
                      learn_doctags=True, learn_words=True, learn_hidden=True,
                      word_vectors=None, word_locks=None, doctag_vectors=None, doctag_locks=None):
    # ...
    for pos, word in enumerate(word_vocabs):
        # ...
        neu1e = train_cbow_pair(model, word, word2_indexes, l1, alpha,
                                learn_vectors=False, learn_hidden=learn_hidden)
    # ...
```
So what I'd like to know is how I can get at the word vectors before they are propagated, so that I can apply transformations to them beforehand.
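Concretely, the kind of hook I have in mind would sit where `l1` is composed, before `train_cbow_pair` is called. A sketch of what I mean (the `transform` function and all names besides the gensim ones are hypothetical; the arrays are stand-ins):

```python
import numpy as np

np.random.seed(0)

def transform(vecs):
    # hypothetical per-vector transformation applied before propagation,
    # here: L2-normalizing each context-word vector
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-12)

layer1_size = 4
word_vectors = np.random.rand(10, layer1_size)     # stand-in for the word-vector array
doctag_vectors = np.random.rand(2, layer1_size)    # stand-in for the doctag-vector array
word2_indexes = [1, 3, 5]
doctag_indexes = [0]

# transform copies of the raw word vectors first, then build l1 from them
transformed = transform(word_vectors[word2_indexes])
l1 = np.sum(transformed, axis=0) + np.sum(doctag_vectors[doctag_indexes], axis=0)
print(l1.shape)  # (4,)
```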