Hi all,
Is there a discrepancy between language models 4 and 5 in where h.detach() is called?
def forward_LM4(self, x):
    outs = []
    for i in range(sl):
        self.h = self.h + self.i_h(x[:,i])
        self.h = F.relu(self.h_h(self.h))
        outs.append(self.h_o(self.h))
    self.h = self.h.detach()
    return torch.stack(outs, dim=1)

def forward_LM5(self, x):
    res,h = self.rnn(self.i_h(x), self.h)
    self.h = h.detach()
    return self.h_o(res)
In LM4, self.h_o() happens before detach. But in LM5, self.h_o() happens after detach. Does LM5 have a bug? I think the detach should happen after self.h_o().
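One way I could check this is with a small standalone experiment. The sketch below is not the course code: the sizes and the names rnn, h_o, x and h0 are made up for illustration, and it assumes the LM5 layer is a plain nn.RNN with batch_first=True. It follows the same order as forward_LM5 (detach h first, then apply the output layer to res) and then looks at whether gradients still reach the RNN weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
bs, sl, n_hidden, n_out = 2, 3, 4, 5

# Stand-ins for self.rnn and self.h_o (assumed layer types).
rnn = nn.RNN(n_hidden, n_hidden, batch_first=True)
h_o = nn.Linear(n_hidden, n_out)

x = torch.randn(bs, sl, n_hidden)   # stands in for self.i_h(x)
h0 = torch.zeros(1, bs, n_hidden)   # stands in for self.h

# Same order as forward_LM5: detach the final hidden state first,
# then apply the output layer to res.
res, h = rnn(x, h0)
h_detached = h.detach()
out = h_o(res)

out.sum().backward()

# Tensor.detach() returns a new tensor; it does not modify h or res.
res_has_graph = res.requires_grad
h_detached_has_graph = h_detached.requires_grad
rnn_got_grads = all(p.grad is not None for p in rnn.parameters())

print("res still tracks gradients:", res_has_graph)
print("detached h tracks gradients:", h_detached_has_graph)
print("RNN weights received gradients:", rnn_got_grads)
```

If res still tracks gradients and the RNN weights receive them, that would suggest the detach in LM5 only cuts the hidden state carried into the next batch, and that the order relative to self.h_o(res) does not matter there. I may be missing something, though.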
Thanks,
Daniel Lam