If a caption contains '!' and is shorter than max_length, the embeddings of the '!' token and the pad token are exactly the same, because token embedding uses nn.Embedding
Thanks to your code, I am growing every day. Thank you very much.
In every dataloader, the special tokens are initialized as below.
However, I found that [MASK], [UNK], and [PAD] are not actually used in the code, and a problem arises when 0 is simply appended as the pad token, like below.
In the vocab, there is no id reserved for [PAD]; token id 0 is paired with '!'.
So if a caption contains '!' and is shorter than max_length, the embedding of the '!' token and the embedding of the pad token will be exactly the same, because token embedding uses nn.Embedding.
Example
I think there is no way to differentiate between the two captions.
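A minimal sketch of the collision described above (the vocab size, embedding dim, token ids, and max_length here are toy values, not the repo's actual configuration): a caption ending in '!' and a shorter caption padded with 0 produce the exact same id sequence, so nn.Embedding maps them to identical embeddings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(100, 4)  # toy vocab; assume '!' has id 0, as in the issue
max_length = 5

# hypothetical caption "a cat !" -> ids [3, 4, 0], then padded with 0
with_bang = [3, 4, 0, 0, 0]
# hypothetical caption "a cat"   -> ids [3, 4],    then padded with 0
without_bang = [3, 4, 0, 0, 0]

# Both captions collapse to the same padded id sequence,
# so their embedded representations are indistinguishable.
a = emb(torch.tensor(with_bang))
b = emb(torch.tensor(without_bang))
assert torch.equal(a, b)
```

One common remedy is to reserve a dedicated id for [PAD] and pass it as `padding_idx` to nn.Embedding, which also keeps the pad row at zero and excluded from gradient updates.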