Colab 04_mnist_basics.ipynb `An End-to-End SGD Example` not converging with or without GPU

Hi,

I thought I had posted this about a month ago. Silly me.

Makes the whole exercise a bit of a non-event. Great material though!

The End-to-End SGD Example of 04_mnist_basics.ipynb does not converge if the GPU is enabled on Colab.

GPU enabled: [image]
first pass: [image]
second pass: [image]

GPU disabled: [image]
first pass: [image]
second pass: [image]

Any ideas?

Cheers,
–Peter G

From the published notebook:

[image]

My notebook, not much action here…

for i in range(10): apply_step(params)

160.42279052734375
160.14772033691406
159.87269592285156
159.59768676757812
159.3227081298828
159.04774475097656
158.7728271484375
158.4979248046875
158.22305297851562
157.9481964111328
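
A quick check on the numbers above, as a rough sketch: taking the difference between consecutive printed losses shows how much each step actually moved. The list is copied from the output; the variable names are just for illustration.

```python
# Losses copied from the ten apply_step calls above.
losses = [
    160.42279052734375,
    160.14772033691406,
    159.87269592285156,
    159.59768676757812,
    159.3227081298828,
    159.04774475097656,
    158.7728271484375,
    158.4979248046875,
    158.22305297851562,
    157.9481964111328,
]

# Drop in loss from each step to the next.
deltas = [prev - cur for prev, cur in zip(losses, losses[1:])]

for step, delta in enumerate(deltas, start=1):
    print(f"step {step}: loss fell by {delta:.4f}")

mean_delta = sum(deltas) / len(deltas)
print(f"mean drop per step: {mean_delta:.4f}")
```

The loss falls by a nearly constant amount of about 0.275 per step, so the parameters are moving in the right direction, just in very small steps relative to a loss of about 160.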

After 10 steps without GPU:

The first time I noticed this, before I was diverted for a month, the problem was only evident on the GPU.

I reported this as a bug on GitHub three weeks ago. No response yet :(


It's a reported bug, with a suggested fix, in a fundamental beginner's lesson. Nobody seems to be scanning for these. See: Revert "04_mnist_basics: MSE lacks .sqrt() (#327)" by amritpurshotam · Pull Request #450 · fastai/fastbook · GitHub

I am not sure what the process is for drawing attention to this. I have reported a bug on GitHub and tried on Discord. Maybe it was the wrong Discord channel?

@jeremy

Hi, I think the error in the Colab notebook is this:

def mse(preds, targets): return ((preds-targets)**2).mean().sqrt()

It should be

def mse(preds, targets): return ((preds-targets)**2).mean()

MSE (mean squared error) is not RMSE (root mean squared error), and I think the final sqrt makes the loss unstable (e.g. the gradient of sqrt(x) is not defined at x = 0).
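
To make the effect of the extra sqrt concrete, here is a minimal dependency-free sketch (not the notebook's code). By the chain rule, the gradient of sqrt(MSE) is the gradient of MSE divided by 2*sqrt(MSE), so at the same learning rate every step is smaller by that factor. With an RMSE around 160, as in the output posted earlier in the thread, that factor is about 320. The data, the starting parameters and the learning rate of 1e-5 below are all assumptions chosen for illustration, and the gradients are written out by hand instead of using autograd.

```python
import math

# Assumed toy data: a noise-free quadratic, loosely modelled on the lesson.
time = [float(t) for t in range(20)]
speed = [0.75 * (t - 9.5) ** 2 + 1 for t in time]

def f(t, params):
    """Quadratic model a*t^2 + b*t + c."""
    a, b, c = params
    return a * t**2 + b * t + c

def mse(params):
    return sum((f(t, params) - s) ** 2 for t, s in zip(time, speed)) / len(time)

def rmse(params):
    return math.sqrt(mse(params))

def mse_grad(params):
    # d(MSE)/d(param) = mean(2 * error * d(prediction)/d(param))
    n = len(time)
    errs = [f(t, params) - s for t, s in zip(time, speed)]
    return [
        sum(2 * e * t**2 for e, t in zip(errs, time)) / n,
        sum(2 * e * t for e, t in zip(errs, time)) / n,
        sum(2 * e for e in errs) / n,
    ]

def rmse_grad(params):
    # Chain rule: d(sqrt(L)) = dL / (2 * sqrt(L))
    scale = 1.0 / (2.0 * rmse(params))
    return [g * scale for g in mse_grad(params)]

def run(loss_fn, grad_fn, params, lr=1e-5, steps=10):
    """Plain gradient descent; returns the loss before each step and after the last."""
    params = list(params)
    history = []
    for _ in range(steps):
        history.append(loss_fn(params))
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
    history.append(loss_fn(params))
    return history

start = [-1.0, 0.5, 0.2]  # hypothetical starting parameters

mse_losses = run(mse, mse_grad, start)
rmse_losses = run(rmse, rmse_grad, start)

# Report both runs on the same scale (RMSE) so they are comparable.
print("trained on MSE :", [round(math.sqrt(l), 2) for l in mse_losses])
print("trained on RMSE:", [round(l, 2) for l in rmse_losses])
```

In this sketch both runs start from the same point and use the same learning rate, and only the run trained on plain MSE makes visible progress in ten steps. That points to either dropping the sqrt or using a much larger learning rate with it.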

Thanks for your reply. I was in the same situation and resolved the issue.