Hey guys,
I have a question about the negative log likelihood loss function defined in lesson 6:
```python
import torch.nn.functional as F

def nll_loss_seq(inp, targ):
    sl, bs, nh = inp.size()
    targ = targ.transpose(0, 1).contiguous().view(-1)
    return F.nll_loss(inp.view(-1, nh), targ)
```
The output of the model has the size `torch.Size([8, 512, 85])` (8 time steps, bs = 512, and 85 being the embedding size). So should that last dimension really be `nh` in the loss function, or shouldn't it actually be `n_fac` = 85? Of course it doesn't change the result, since it's only a variable name, but I'm asking for understanding purposes.
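To check what that last dimension has to be, I put together a minimal sketch with dummy tensors (shapes copied from above; the variable names are my own, not from the lesson). `F.nll_loss` expects an input of shape `(N, C)` with `C` classes and a target of `N` class indices, so whatever we call the third dimension, it must be the number of output classes:

```python
import torch
import torch.nn.functional as F

sl, bs, n_out = 8, 512, 85  # time steps, batch size, size of the last dimension

# Fake log-probabilities with the same shape as the model output
inp = torch.randn(sl, bs, n_out).log_softmax(dim=-1)

# Fake targets: one class index per (batch, time step) position
targ = torch.randint(0, n_out, (bs, sl))

# Same reshaping as nll_loss_seq
targ_flat = targ.transpose(0, 1).contiguous().view(-1)   # shape (sl * bs,)
loss = F.nll_loss(inp.view(-1, n_out), targ_flat)        # input shape (sl * bs, n_out)
```

Here `n_out` has to match the number of values each target index can take, i.e. the number of classes the model predicts over.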
What do you think?
Thanks
Fabio