Many-to-many variable-length sequence labeling (such as POS) #3916
Comments
I ran across this problem as well. I am still not sure why this is the case, or whether it is the desired behaviour, but I did manage to get around it by putting all my output values in separate arrays, i.e.:
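(The snippet that originally followed is not reproduced above; a minimal sketch of that kind of workaround, assuming the padded labels sit in a 2-D integer array `y_pad` of shape (M, N), might look like this.)

```python
import numpy as np

# Toy stand-in for y_pad: one integer tag per token, padded with 0 (values assumed).
y_pad = np.array([[1, 2, 0],
                  [3, 1, 0]])

# Put every output value in its own length-1 array, giving shape (M, N, 1),
# so the targets become a 3-D array with one array per timestep.
y_separate = y_pad.reshape(y_pad.shape + (1,))
print(y_separate.shape)  # (2, 3, 1)
```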
See also #3855, which is about a different variable-length sequence-to-sequence learning problem, but also mentions this issue.
@dieuwkehupkes thanks for the hint! Turns out one-hot encoding is needed. And for people who have similar issues: you can solve the problem by creating the 3-D `y_pad_one_hot` and feeding it into the previous model:

```python
import numpy as np
from keras.utils.np_utils import to_categorical

# y_pad_one_hot.shape: (M, N, nb_classes)
y_pad_one_hot = np.array([to_categorical(sent_label, nb_classes=nb_classes) for sent_label in y_pad])

model.fit(X_pad, y_pad_one_hot)
```

Still need to find the best way to mask the padding, though.
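(On the padding question, one commonly used option, shown here only as a sketch under assumptions rather than something confirmed in this thread, is to pass per-timestep sample weights that zero out the padded positions; this assumes the padding index is 0 and that the model was compiled with `sample_weight_mode='temporal'`.)

```python
import numpy as np

# Assumes X_pad uses 0 as the padding index and the model was compiled with
# model.compile(loss='categorical_crossentropy', optimizer='adam',
#               sample_weight_mode='temporal')
mask = (X_pad != 0).astype('float32')  # shape (M, N): 1 for real tokens, 0 for padding

# Padded timesteps now contribute nothing to the loss.
model.fit(X_pad, y_pad_one_hot, sample_weight=mask)
```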
@shuaiw can you provide the detailed values of `nb_classes` and `num_class`? I encountered the same problem, please help!
@yangxiufengsia `num_class`/`nb_classes` is the number of classes.
@shuaiw If the output is a set of words, `num_class` becomes the vocab_size. Assuming that I am expecting an output of 20 words, a one-hot encoded Y becomes [vocab_size, max_output_words]. Is this correct?
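(As a quick sanity check on the shapes involved, here is a toy example with made-up sizes, not an answer from the thread: `to_categorical` turns a length-20 label sequence into an array with one row per output word, i.e. (max_output_words, vocab_size).)

```python
import numpy as np
from keras.utils.np_utils import to_categorical

vocab_size = 1000                                   # assumed toy value
sent_label = np.random.randint(0, vocab_size, 20)   # 20 output words
one_hot = to_categorical(sent_label, nb_classes=vocab_size)
print(one_hot.shape)  # (20, 1000), i.e. (max_output_words, vocab_size)
```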
I've been following some related threads, such as #395, #2654, and #2403, but I still cannot sort out how to get this to work. The Keras API doc is quite dated, so it's not very helpful for this issue.
So I want to use a pretrained word2vec word representation + a Keras LSTM to do POS tagging.
My first question is: is there a better way to feed in the pretrained vector representations than the `embedding_weights` method mentioned in #853?

Say we embed using the method mentioned in #853 and get an (M+2) by N embedding matrix. We also pad the variable-length sentences. Then we have X_pad and y_pad, where M is the number of sentences in the corpus (in my case 18421) and N is the padded sentence length (the originals vary from 15 to 140, so in this case N=140).
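(The preprocessing code that produces these arrays is not shown here; one way to get padded (M, N) arrays, assuming `X` and `y` are lists of word-index and tag-index sequences, is Keras' `pad_sequences`, with the maxlen value taken from the numbers above.)

```python
from keras.preprocessing.sequence import pad_sequences

# Pad every sentence and its label sequence to length N=140 (0 is the padding index).
X_pad = pad_sequences(X, maxlen=140, padding='post')  # shape (M, 140)
y_pad = pad_sequences(y, maxlen=140, padding='post')  # shape (M, 140)
```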
Here is how I initialized the model:
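(The original code block is not reproduced here; the following is only an illustrative reconstruction of this kind of many-to-many setup in Keras 1.x, with assumed layer sizes and a placeholder `W` standing in for the pretrained embedding matrix, not the exact code from the post.)

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Dense

# Toy placeholders (assumptions, not values from the post):
vocab_size, embedding_dim, nb_classes, maxlen = 5000, 300, 45, 140
W = np.random.rand(vocab_size + 2, embedding_dim)  # stands in for the pretrained word2vec matrix

model = Sequential()
model.add(Embedding(input_dim=W.shape[0], output_dim=W.shape[1], input_length=maxlen,
                    weights=[W], mask_zero=True))            # pretrained weights, padding masked
model.add(LSTM(128, return_sequences=True))                  # emit one output per timestep
model.add(TimeDistributed(Dense(nb_classes, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
```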
When I run `model.fit(X_pad, y_pad)`, I get this error:

I've been stuck here for a while. Any suggestion is appreciated!