I am running the CamVid notebook locally on a 1080Ti GPU and getting nan reported for the validation loss at every epoch. I reduced the LR by 10x and was able to get things working, but I would like to know …
1) Why does this occur?
2) What are the recommended steps to deal with it?
This is your moment, @wgpubs! Time to do some research! You might find something really cool if you can replicate the results and show them to everyone.
Hi Mr. Howard, in lesson2 of course-v3, when I follow the example code in that notebook, I get #na# for valid_loss whether I use a high LR or a low LR … and therefore I can’t get the valid_loss curve.
I found the reason Jeremy pointed out before. He said the main cause is how you set the learning rate. Don’t set it too high! Otherwise, the ball may bounce off into another world and never come back: NaN!
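To make the "ball" picture concrete, here is a toy sketch in plain Python. It is not the fastai training loop and may not be the exact cause of the NaN in the CamVid notebook. It assumes plain gradient descent on the made-up loss f(x) = x**2, and the names `final_loss` and `reduce_until_finite` are hypothetical helpers invented for this illustration. With a learning rate that is too high, every step overshoots further, the value overflows to infinity, and the next update turns it into NaN. Dividing the LR by 10, as in the first post, brings it back to a finite loss.

```python
import math

def final_loss(lr, steps=2000, x0=1.0):
    """Run plain gradient descent on the toy loss f(x) = x**2 and return the last loss."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # each step multiplies x by (1 - 2*lr)
    return x * x

def reduce_until_finite(lr, factor=10.0, max_tries=5):
    """Hypothetical helper: divide lr by `factor` until the toy loss comes out finite."""
    for _ in range(max_tries):
        loss = final_loss(lr)
        if math.isfinite(loss):
            return lr, loss
        lr /= factor
    raise RuntimeError("loss stayed non-finite after all retries")

print(final_loss(1.5))           # too high: |x| doubles each step until it overflows
print(final_loss(0.1))           # small enough: x shrinks toward the minimum at 0
print(reduce_until_finite(1.5))  # the "reduce the LR by 10x" fix on the toy problem
```

On this toy problem any LR above 1.0 diverges, because the step multiplier (1 - 2*lr) then has magnitude greater than 1. Real networks have no such clean threshold, so treat this only as intuition for why lowering the LR can make the NaN go away.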