While replicating the book's code, I got these results when fitting a learner. What went wrong here?
Hi,
could you share a copy of the Colab notebook that reproduces this behaviour?
K.
Hi K, thanks for responding,
I found the issue after going through the code a few times. I had included the sigmoid in my linear1 function and therefore left it out of the MNIST_loss function, because I did not know the learner requires the sigmoid to be in the loss function.
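
For anyone who lands here with the same symptom, below is a minimal sketch of the arrangement described above: the model returns raw scores and the sigmoid is applied once, inside the loss. It uses plain Python lists instead of tensors so it runs without any libraries. The names `sigmoid`, `weights`, `bias` and the sample numbers are assumptions for illustration, not the original notebook's code.

```python
import math

def sigmoid(x):
    # Squash a raw score into the open interval (0, 1).
    return 1 / (1 + math.exp(-x))

def linear1(xb, weights, bias):
    # Model: return raw scores only, with no sigmoid applied here.
    return [sum(x * w for x, w in zip(row, weights)) + bias for row in xb]

def mnist_loss(predictions, targets):
    # Loss: apply the sigmoid here, then measure the distance from each target.
    probs = [sigmoid(p) for p in predictions]
    return sum(1 - p if t == 1 else p for p, t in zip(probs, targets)) / len(targets)

# Hypothetical batch of two samples with two features each.
xb = [[1.0, 2.0], [-3.0, 0.5]]
targets = [1, 0]
raw = linear1(xb, weights=[0.5, -1.0], bias=0.1)
loss = mnist_loss(raw, targets)

# If the sigmoid also sat inside linear1, any later code that applies it again
# would see values squeezed above 0.5, whatever the raw score was.
double = [sigmoid(sigmoid(r)) for r in raw]
```

Keeping the sigmoid in one place matters because applying it twice maps every score into roughly (0.5, 0.73), so anything that thresholds at 0.5 after a second sigmoid would treat every sample the same way.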