Hello! I have this LSTM:
n_hidden = 128
n_classes = 2
bs = 1

def __init__(self, nl):
    self.nl = nl
    self.rnn = nn.LSTM(1, n_hidden, nl, bidirectional=True)  # dropout=0.3,bidirectional=True)
    self.l_out = nn.Linear(n_hidden*2, n_classes)

def forward(self, input):
    outp, h = self.rnn(input.view(len(input), bs, -1), self.h)
    # self.h = repackage_var(h)

def init_hidden(self, bs):
    self.h = (V(torch.zeros(self.nl*2, bs, n_hidden)),
              V(torch.zeros(self.nl*2, bs, n_hidden)))
In the RNN lecture it was mentioned that we should initialize the hidden layer as the identity matrix. However, in the case of my bidirectional RNN, when I do this:
I get a matrix of size [512, 128] (I am not sure where that 512 is coming from; I would have expected 128 x 2 = 256). How should I initialize the hidden state in this case? Thank you!
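There are 4 components per cell (the input, forget, cell and output gates), and their weight matrices are stacked along the first dimension. So it's
4*hidden = 4*128 = 512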
I am copy-pasting bits from the nn.LSTM doc.
weight_ih_l[k] : the learnable input-hidden weights of the k-th layer (W_ii|W_if|W_ig|W_io), of shape (4*hidden_size, input_size) for k = 0. Otherwise, the shape is (4*hidden_size, num_directions * hidden_size)
weight_hh_l[k] : the learnable hidden-hidden weights of the k-th layer (W_hi|W_hf|W_hg|W_ho), of shape (4*hidden_size, hidden_size)
bias_ih_l[k] : the learnable input-hidden bias of the k-th layer (b_ii|b_if|b_ig|b_io), of shape (4*hidden_size)
bias_hh_l[k] : the learnable hidden-hidden bias of the k-th layer (b_hi|b_hf|b_hg|b_ho), of shape (4*hidden_size)
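Those shapes can be checked by building the same kind of layer and printing what named_parameters() reports. This is only a minimal sketch: it assumes the settings from the question (input size 1, hidden size 128, bidirectional) and a layer count of 2, which is a guess because the thread never gives the value of nl.

```python
import torch
import torch.nn as nn

n_hidden = 128
num_layers = 2  # assumption: the thread never states the value of nl

rnn = nn.LSTM(1, n_hidden, num_layers, bidirectional=True)

# Collect every parameter shape by name.
# A "_reverse" suffix marks the parameters of the backward direction.
shapes = {name: tuple(p.shape) for name, p in rnn.named_parameters()}
for name, shape in shapes.items():
    print(f"{name:28s} {shape}")
```

Every first dimension comes out as 4*128 = 512 because the four gate blocks are stacked row-wise; the factor of 2 for bidirectionality only shows up in the input size of layers after the first.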
Oh I see. Thank you! So is there any good way to initialize this?
If you know which of the weights you want to initialize by name, you can do something like this:
Definitely look into torch.nn.init: it has some other builtin initializers like normal, uniform etc. (Btw, the underscore in eye_ implies an in-place transformation.)
Also, please look up/play with rnn.named_parameters() to programmatically iterate over the weights.
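A loop over named_parameters() could, for instance, pick an initializer from each parameter's name. The sketch below is one possible scheme and not something prescribed in this thread: the helper name init_lstm_, the layer sizes, and the particular ranges and standard deviation are all illustrative assumptions.

```python
import torch
import torch.nn as nn

def init_lstm_(rnn: nn.LSTM) -> None:
    """Hypothetical helper: choose an initializer based on the parameter name."""
    for name, param in rnn.named_parameters():
        if name.startswith("weight_ih"):
            # Input-hidden weights: small uniform values (range is an arbitrary choice).
            nn.init.uniform_(param, -0.1, 0.1)
        elif name.startswith("weight_hh"):
            # Hidden-hidden weights: small normal values (std is an arbitrary choice).
            nn.init.normal_(param, mean=0.0, std=0.05)
        elif name.startswith("bias"):
            # Both bias vectors start at zero.
            nn.init.zeros_(param)

rnn = nn.LSTM(1, 128, 2, bidirectional=True)
init_lstm_(rnn)
```

Matching on the name prefix covers every layer and both directions, since the parameters are named like weight_hh_l0, weight_hh_l0_reverse, weight_hh_l1 and so on.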
I don’t think the eye_ would work, because as far as I understand that creates an identity matrix, which should be a square matrix, while my matrix is 512 x 128. But I will look into torch.nn.init. Thank you!
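One way around the shape mismatch, offered as an assumption rather than anything taken from the lecture, is to treat the 512 x 128 hidden-hidden matrix as four stacked 128 x 128 gate blocks (in the documented order i, f, g, o) and write an identity matrix into each block. As far as I know, nn.init.eye_ accepts any 2-D tensor, so calling it on the whole matrix would run, but it would only put ones in the first (input-gate) block. The helper name identity_init_hh_ and the layer sizes are hypothetical, and whether identity is a sensible choice for every gate is a separate question.

```python
import torch
import torch.nn as nn

def identity_init_hh_(rnn: nn.LSTM) -> None:
    """Hypothetical helper: set each gate block of every weight_hh to the identity."""
    h = rnn.hidden_size
    # Four (h, h) identity matrices stacked row-wise give a (4*h, h) tensor.
    stacked_eye = torch.eye(h).repeat(4, 1)
    with torch.no_grad():
        for name, param in rnn.named_parameters():
            if name.startswith("weight_hh"):
                param.copy_(stacked_eye)

rnn = nn.LSTM(1, 128, 2, bidirectional=True)
identity_init_hh_(rnn)
```

The input-hidden weights and the biases are left at their default initialization here; only the weight_hh parameters of each layer and direction are touched.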