This repository was archived by the owner on Aug 18, 2021. It is now read-only.
Issues in your tutorial on Classifying Names with a Character-Level RNN #134
Looking at the diagram in the tutorial and the code you wrote, which is:
```python
import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)

n_hidden = 128
rnn = RNN(n_letters, n_hidden, n_categories)
```
I believe there are two issues here, if you were trying to model a vanilla RNN (an Elman network), whose formulation is:

$$h_t = \sigma_h(W_h x_t + U_h h_{t-1} + b_h)$$
$$y_t = \sigma_y(W_y h_t + b_y)$$
- First, it does not apply a nonlinear transformation (e.g. $\tanh$) when computing the new hidden state,
- and second, it does not use the new hidden state $h_t$ to compute the output; the output is computed from the concatenated input and *previous* hidden state instead.
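To make the difference concrete, here is a minimal sketch (my own illustration, not the tutorial's code) of how the forward pass would look if it followed the Elman formulation: a `tanh` nonlinearity on the new hidden state, and an assumed `h2o` layer that computes the output from $h_t$:

```python
import torch
import torch.nn as nn

class ElmanRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(ElmanRNN, self).__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        # h2o is a hypothetical layer: output depends on h_t, not on combined
        self.h2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        # nonlinearity applied when computing the new hidden state
        hidden = torch.tanh(self.i2h(combined))
        # output computed from the *new* hidden state h_t
        output = self.softmax(self.h2o(hidden))
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)
```

This is only a sketch to show the two changes I am asking about, not a claim about what the tutorial intended.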
So my question is: were you implementing the Elman network, or is this a completely new variation of the RNN?
In case I'm wrong, what am I missing here?
If you could kindly clarify this, I'd appreciate it.