DeepLearning-LecNotes6


(Kobe430am) #21
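import torch.nn.functional as F

def nll_loss_seq(inp, targ):
    # inp: 3-D tensor of log-probabilities, sequence-first (seq_len, batch_size, last_dim)
    sl,bs,nh = inp.size()
    # make targ sequence-first and flatten it so it lines up with inp.view(-1,nh)
    targ = targ.transpose(0,1).contiguous().view(-1)
    return F.nll_loss(inp.view(-1,nh), targ)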

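I’m a bit puzzled: pedagogically speaking, shouldn’t this be sl,bs,vocab = inp.size() instead? Since F.nll_loss expects one column per class, the last dimension of inp would be the vocabulary size rather than the number of hidden units.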
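
To check the shapes, here is a minimal sketch rather than the notebook's actual model. The sizes (8, 4, 256, 85), the random `hidden` tensor standing in for the RNN outputs, and the `l_out` layer are all hypothetical, and I am assuming from the transpose that `targ` arrives as (batch_size, seq_len). Under those assumptions the tensor passed to the loss ends in the vocabulary size, so the renamed variable would look like this:

```python
import torch
import torch.nn.functional as F

def nll_loss_seq(inp, targ):
    # inp: (seq_len, batch_size, vocab) log-probabilities
    sl, bs, vocab = inp.size()
    # targ assumed to be (batch_size, seq_len): make it sequence-first, then flatten
    targ = targ.transpose(0, 1).contiguous().view(-1)
    # F.nll_loss wants (N, C) inputs and (N,) class indices
    return F.nll_loss(inp.view(-1, vocab), targ)

# Hypothetical sizes, chosen only for illustration
sl, bs, n_hidden, vocab_size = 8, 4, 256, 85

hidden = torch.randn(sl, bs, n_hidden)           # stand-in for the RNN outputs
l_out = torch.nn.Linear(n_hidden, vocab_size)    # hypothetical output layer
log_probs = F.log_softmax(l_out(hidden), dim=-1)

targ = torch.randint(0, vocab_size, (bs, sl))    # random character indices
loss = nll_loss_seq(log_probs, targ)
print(log_probs.shape, loss.dim())
```

The hidden size only appears before the output layer; by the time the tensor reaches the loss, its last dimension is `vocab_size`. The name does not change what the function computes, only how it reads.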