Segmentation Loss Suddenly Spikes Mid-Training

bluesky314 · December 4, 2019, 3:10pm

I am training a binary human segmentation model, and my loss/accuracy suddenly jumps mid-training for no apparent reason. I initialise my model as:

learn = unet_learner(data, models.resnet34, metrics=metrics, wd=1e-2, loss_func=lossf)

and train in two parts, first frozen and then unfrozen. The loss is plain cross-entropy. It jumps during the first part and stays high for the rest of training. Here is how the logs look:

What could be going wrong?

muellerzr · December 4, 2019, 3:13pm

If I had to guess, you overfit your model; the logs hint at it (train loss higher than val loss). How big is your dataset?

bluesky314 · December 4, 2019, 4:01pm

Yes, but even the train loss spikes. Shouldn't it go very low in that case? The dataset is 6000 images.

digitalspecialists · December 4, 2019, 4:15pm

I’ve had this before and never worked out the exact cause. Lowering the max lr was how I approached it (while keeping min-lr the same).
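To see why lowering the maximum learning rate can help, here is a toy sketch in plain Python. It is not the fastai training loop, and the quadratic loss, the `run_gd` helper and the chosen values are all assumptions made for illustration. On a loss with fixed curvature, gradient descent only shrinks the loss while the learning rate stays below 2 / curvature; above that, every step overshoots and the loss grows.

```python
# Toy illustration (hypothetical, not fastai): gradient descent on the
# 1-D quadratic loss f(w) = 0.5 * curvature * w**2.
# Each update multiplies w by (1 - lr * curvature), so the loss decreases
# only while |1 - lr * curvature| < 1, i.e. while lr < 2 / curvature.

def run_gd(lr, curvature=10.0, w0=1.0, steps=20):
    """Return the loss recorded after each gradient descent step."""
    w = w0
    losses = []
    for _ in range(steps):
        grad = curvature * w          # derivative of 0.5 * curvature * w**2
        w = w - lr * grad
        losses.append(0.5 * curvature * w * w)
    return losses

high = run_gd(lr=0.25)  # above 2 / curvature = 0.2: the loss grows every step
low = run_gd(lr=0.05)   # below the threshold: the loss decays steadily

print("final loss with lr=0.25:", high[-1])
print("final loss with lr=0.05:", low[-1])
```

A real network has no single curvature, so the safe range can change during training, which may be why a rate that works early on can still cause a spike later.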

dmilush · December 4, 2019, 4:37pm

It seems your training diverges, most probably because the learning rate is too high for this problem. Setting a proper learning rate will likely solve this, or at least help diagnose the problem further. If I were you, I would start with the learning rate finder (learn.lr_find(); learn.recorder.plot()). If it is still not clear after that, post the lr_find() results here and we can help further.

bluesky314 · December 5, 2019, 4:29pm

Yes, lowering the learning rate solved the problem. Thanks, everyone.