I am trying to train a ConvLSTM model with the fastai API.
After starting training I get the error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
Do I need to do something special with the LSTM layer to be able to train it, such as a Callback that resets the hidden state on each loop?
I am very new to LSTM and recurrent nets in general.
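To make the question concrete: my understanding is that if the LSTM wrapper carries its hidden state from one batch to the next without detaching it, `backward()` tries to walk back into the previous batch's graph, which would explain the error. I suspect something like fastai's `ModelResetter` callback (which I believe just calls `model.reset()`) is also involved, but the wrapper below is only my own sketch, not my actual `LSTM` class:

```python
import torch
import torch.nn as nn

class StatefulLSTM(nn.Module):
    """Hypothetical LSTM wrapper that keeps its hidden state across batches."""
    def __init__(self, input_dim, hidden_dim, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.hidden = None

    def forward(self, x):
        out, self.hidden = self.lstm(x, self.hidden)
        # Detach the state so the next backward() cannot reach the old graph
        self.hidden = tuple(h.detach() for h in self.hidden)
        return out

    def reset(self):
        # What a reset callback would call, e.g. at the start of each epoch
        self.hidden = None

model = StatefulLSTM(8, 16)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(2):  # without the detach above, the 2nd backward() raises
    out = model(torch.randn(4, 5, 8))
    loss = out.mean()
    loss.backward()
    opt.step(); opt.zero_grad()
```

Is that the right mental model, and is a Callback the idiomatic place to call `reset()` in fastai?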
btw, this is my model:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from fastai.vision.all import *  # provides Module

# Encoder and my LSTM wrapper are defined elsewhere
class ConvLSTM(Module):
    def __init__(
        self, future_steps, latent_dim=512, lstm_layers=1, hidden_dim=1024, bidirectional=True, attention=True
    ):
        self.encoder = Encoder(3, latent_dim)
        self.lstm = LSTM(latent_dim, lstm_layers, hidden_dim, bidirectional)
        self.output_layers = nn.Sequential(
            nn.Linear(2 * hidden_dim if bidirectional else hidden_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim, momentum=0.01),
            nn.ReLU(),
            nn.Linear(hidden_dim, future_steps),
        )
        self.attention = attention
        self.attention_layer = nn.Linear(2 * hidden_dim if bidirectional else hidden_dim, 1)

    def reset(self): self.lstm.reset()

    def forward(self, x):
        batch_size, seq_length, c, h, w = x.shape
        x = x.view(batch_size * seq_length, c, h, w)  # fold time into the batch
        x = self.encoder(x)                           # encode each frame independently
        x = x.view(batch_size, seq_length, -1)        # restore the sequence dimension
        x = self.lstm(x)
        if self.attention:
            attention_w = F.softmax(self.attention_layer(x).squeeze(-1), dim=-1)
            x = torch.sum(attention_w.unsqueeze(-1) * x, dim=1)
        else:
            x = x[:, -1]
        return self.output_layers(x)
```
I am encoding the frames of a video with the Encoder and passing the result through the LSTM layer.