Reasons for NaN loss on CamVid/Tiramisu/Keras?

When training the Tiramisu on the CamVid dataset, the loss is NaN from the very start of training. How does one go about determining the cause of the NaN value? I suspect either the input data or the Keras 2 fit_generator.

I checked the input and none of the values appear to be NaN. Even training on a single batch leads to a NaN loss.
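For anyone repeating this check, here is a minimal sketch of how one batch could be inspected with NumPy. The helper name `describe_batch` and the random stand-in arrays are made up for illustration; in practice `x` and `y` would be one batch pulled from the generator. It reports the label values as well as the image values, since a pixel check alone says nothing about the masks.

```python
import numpy as np

def describe_batch(x, y):
    """Return basic sanity statistics for one (inputs, labels) batch."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y)
    return {
        "x_all_finite": bool(np.isfinite(x).all()),  # False if any NaN or Inf
        "x_min": float(x.min()),
        "x_max": float(x.max()),
        "label_values": np.unique(y).tolist(),       # distinct class ids present
    }

# Stand-in batch (assumption): 2 RGB images of 8x8 and their integer masks.
x = np.random.rand(2, 8, 8, 3)
y = np.random.randint(0, 12, size=(2, 8, 8))

report = describe_batch(x, y)
print(report)
```

If `x_all_finite` is True and the value range looks sane, the next thing worth looking at is `label_values`.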

Your gradients are exploding. Try using tanh as the activation function.

The issue was actually much simpler: the number of classes was wrong. If the input is valid and the hyperparameters are moderate, then there is a mistake or bug somewhere in the setup.

Hey there. Any chance you could point me to a practical way of fixing NaN losses on CamVid/Tiramisu?
How did you notice that the number of classes was incorrect?

Edit: the label_colors file contained 32 classes, while the Tiramisu was expecting 12. Just for posterity, that’s the thing to fix!
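A rough sketch of how a mismatch like this could be caught before training, assuming the masks have already been converted to integer class ids. The function name `check_label_range` and the tiny example mask are hypothetical, not taken from the notebook:

```python
import numpy as np

def check_label_range(labels, n_classes):
    """Return the label ids that fall outside [0, n_classes)."""
    ids = np.unique(np.asarray(labels))
    return ids[(ids < 0) | (ids >= n_classes)].tolist()

# Made-up mask using ids from a 32-class palette, checked against a 12-class model.
labels = np.array([[0, 3, 11], [17, 31, 5]])
bad = check_label_range(labels, n_classes=12)
print(bad)  # [17, 31]
```

An empty list means every id fits the model's output size; anything else suggests the label file and the network disagree on the number of classes.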
