I am trying to train a ConvLSTM model with the fastai API.
After starting training I get the error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
Do I need to do something special with the LSTM layer to be able to train it, e.g. a Callback that resets the hidden state on each loop?
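By "a Callback" I mean something along these lines (a plain-Python sketch of the idea, not actual fastai API; the class name and the `before_batch` hook name are my own guesses):

```python
# Sketch of what I mean by "a Callback"; ResetRNNCallback and the
# before_batch hook name are assumptions, not real fastai API.
class ResetRNNCallback:
    def __init__(self, model):
        self.model = model

    def before_batch(self):
        # Clear the LSTM's stored hidden state before each batch
        self.model.reset()
```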
I am very new to LSTM and recurrent nets in general.
btw, this is my model:
```python
class ConvLSTM(Module):
    def __init__(self, future_steps, latent_dim=512, lstm_layers=1,
                 hidden_dim=1024, bidirectional=True, attention=True):
        self.encoder = Encoder(3, latent_dim)
        self.lstm = LSTM(latent_dim, lstm_layers, hidden_dim, bidirectional)
        self.output_layers = nn.Sequential(
            nn.Linear(2 * hidden_dim if bidirectional else hidden_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim, momentum=0.01),
            nn.ReLU(),
            nn.Linear(hidden_dim, future_steps),
        )
        self.attention = attention
        self.attention_layer = nn.Linear(2 * hidden_dim if bidirectional else hidden_dim, 1)

    def reset(self):
        self.lstm.reset()

    def forward(self, x):
        batch_size, seq_length, c, h, w = x.shape
        x = x.view(batch_size * seq_length, c, h, w)  # fold time into the batch dim for the CNN
        x = self.encoder(x)
        x = x.view(batch_size, seq_length, -1)        # unfold back to (batch, seq, features)
        x = self.lstm(x)
        if self.attention:
            attention_w = F.softmax(self.attention_layer(x).squeeze(-1), dim=-1)
            x = torch.sum(attention_w.unsqueeze(-1) * x, dim=1)  # attention-weighted sum over time
        else:
            x = x[:, -1]  # take the last time step
        return self.output_layers(x)
```
I am encoding the frames of a video with the Encoder and passing the resulting feature sequence through the LSTM layer.
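For reference, here is a minimal pure-PyTorch sketch (the names are mine, not from my actual code) of how I suspect a carried-over hidden state causes this error: if the state from batch N is fed into batch N+1 without being detached or reset, the second backward() tries to traverse the first batch's already-freed graph.

```python
import torch
import torch.nn as nn

# Minimal stateful LSTM: it keeps its hidden state across forward calls,
# which ties each new batch's graph to the previous (already-freed) one.
class StatefulLSTM(nn.Module):
    def __init__(self, n_in=8, n_hid=16):
        super().__init__()
        self.lstm = nn.LSTM(n_in, n_hid, batch_first=True)
        self.head = nn.Linear(n_hid, 1)
        self.state = None

    def forward(self, x):
        out, self.state = self.lstm(x, self.state)
        return self.head(out[:, -1])

    def reset(self):
        # Drop the stored state so the next batch starts a fresh graph
        self.state = None

model = StatefulLSTM()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for step in range(2):
    x = torch.randn(4, 5, 8)
    loss = model(x).pow(2).mean()
    loss.backward()   # without model.reset(), this second backward raises
    opt.step()        # "Trying to backward through the graph a second time"
    opt.zero_grad()
    model.reset()     # resetting between batches avoids the error
```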