Would the nll() function in “03_minibatch_training.ipynb” not be better written as:
def nll(input, target): return -input[:, target].mean()
instead of:
def nll(input, target): return -input[range(target.shape[0]), target].mean()
because the latter always takes the first target.shape[0] rows of input, which might not be the rows that the entries of target correspond to?
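
To make the comparison concrete, here is a minimal sketch of what the two indexing expressions return. It uses NumPy rather than PyTorch only to keep it dependency-light; as far as I know, integer-array indexing behaves the same way in both for these two forms. The toy values, and the names log_probs, paired and crossed, are made up for illustration. It assumes input and target come from the same minibatch, so that row i of input belongs to target[i].

```python
import numpy as np

# Hypothetical toy batch: 3 samples, 4 classes.
# Each row holds the log-probabilities predicted for one sample.
log_probs = np.log(np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.1, 0.5],
]))
target = np.array([0, 1, 3])  # correct class for each of the 3 samples

# Form from the notebook: the row indices 0..N-1 are paired element-wise
# with target, so this picks log_probs[i, target[i]] for each sample i.
paired = log_probs[range(target.shape[0]), target]

# Proposed form: the slice keeps every row and selects the columns listed
# in target for each of them, giving an N x N matrix.
crossed = log_probs[:, target]

print(paired.shape)   # one value per sample
print(crossed.shape)  # one value per (sample, target entry) combination

# Only the diagonal of the N x N result matches sample i with its own target.
print(np.allclose(np.diag(crossed), paired))
```

In this sketch the range(...) form yields one log-probability per sample, while the `[:, target]` form also includes every off-diagonal entry, that is, sample i scored against the targets of the other samples, so taking its mean would mix in mismatched pairs.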