Nll_loss_seq lesson 6 question

(Fabio Graetz) #1

Hey guys,

I have a question concerning the negative log likelihood loss function defined in lesson 6:

import torch.nn.functional as F

def nll_loss_seq(inp, targ):
    sl,bs,nh = inp.size()
    targ = targ.transpose(0,1).contiguous().view(-1)
    return F.nll_loss(inp.view(-1,nh), targ)

The output of the model has the size torch.Size([8, 512, 85]) (8 timesteps, bs = 512 and 85 being the embedding size).

So is it truly nh in the loss function, or shouldn’t it actually be n_fac = 85? Of course it does not change anything, as it is only a variable name, but I am asking for understanding purposes.

What do you think?



(魏璎珞) #2

The notation in the function is slightly confusing, but you are right. Later in the lecture, though, it was explained your way. I had to do a double take as well.


In the example in the lesson, 85 is vocab_size and 42 is n_fac. In nll_loss_seq, the variable named nh actually holds vocab_size; the function flattens the (sl, bs) dimensions into a single dimension of size sl * bs before passing the tensors on to F.nll_loss.
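
To make the shapes concrete, here is a minimal sketch of the same function with nh renamed to vocab_size. The random tensors are only stand-ins for the model output and the targets, and the (bs, sl) layout of targ is an assumption inferred from the transpose in the original function, not something taken from the lesson notebook.

```python
import torch
import torch.nn.functional as F

def nll_loss_seq(inp, targ):
    # inp: (sl, bs, vocab_size) log-probabilities
    # targ: assumed to be (bs, sl) class indices (hence the transpose below)
    sl, bs, vocab_size = inp.size()
    # Reorder targ to (sl, bs) so it lines up with inp, then flatten to (sl * bs,)
    targ = targ.transpose(0, 1).contiguous().view(-1)
    # Flatten inp to (sl * bs, vocab_size); F.nll_loss expects (N, C) and (N,)
    return F.nll_loss(inp.view(-1, vocab_size), targ)

# Stand-in data with the sizes from the question: 8 timesteps, bs = 512, 85 classes
sl, bs, vocab_size = 8, 512, 85
torch.manual_seed(0)
inp = F.log_softmax(torch.randn(sl, bs, vocab_size), dim=-1)
targ = torch.randint(0, vocab_size, (bs, sl))

loss = nll_loss_seq(inp, targ)

# The same flattening done by hand, to inspect the shapes
flat_inp = inp.view(-1, vocab_size)
flat_targ = targ.transpose(0, 1).contiguous().view(-1)
print(flat_inp.shape, flat_targ.shape, loss.dim())
```

So the last dimension of the model output is the number of classes the loss chooses between (the vocabulary), while n_fac only appears inside the model as the embedding width.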