My interpretation is that we are feeding torch's `nll_loss` the model output and the true labels. As per my understanding (and from running it manually), I observe the following:
- The model output for one batch is fed to `nll_loss` as input.
- The target labels (0-indexed) are fed to `nll_loss` as target.
- `nll_loss` simply returns `-sum(target * input)`, or equivalently `-sum(input[target])`, which I believe should not be the case, as negative log likelihood is defined as `-sum(y * log(p))`.
Please see the image below.
Here, in run number 67, `nll_loss` simply took the negative of the index-27 value from the input, and in run number 69 it took the negative of the index-11 value. Why is that the case? Why is it not taking the log? I believe this is also why I was getting a negative loss. Both of the inputs in the image above were captured while debugging and running my notebook linked above.
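To make the observation concrete, here is a minimal standalone sketch (my own toy tensors, not the values from my notebook) reproducing the behaviour I'm describing: `F.nll_loss` just negates the input value at the target index, with no log applied inside the call.

```python
import torch
import torch.nn.functional as F

# One "batch" of raw scores over 3 classes (no log applied by me here)
inp = torch.tensor([[0.1, 0.7, 0.2]])
# 0-indexed target label for that batch element
target = torch.tensor([1])

# nll_loss returns the (mean of the) negated value at the target index:
# here that is -inp[0, 1] == -0.7, with no log taken anywhere
loss = F.nll_loss(inp, target)
print(loss)  # tensor(-0.7000)
```

This matches what I see in the debugger: the loss is simply `-input[target]`, which is also why it can come out negative when the inputs are positive.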
@sgugger Could you please help?
Thanks
CC: @jeremy — sorry for the @ mention.
