I am training a binary human-segmentation model and my loss/accuracy suddenly jump mid-training for no apparent reason. I initialise my model as:
and train in two phases, frozen then unfrozen. The loss is plain cross-entropy. The jump happens during the first (frozen) phase and the loss stays high for the rest of training. Here is how the logs look:
It seems your training diverges, most likely due to an improper learning rate for this problem. Setting a proper learning rate should solve this, or at least let you diagnose the problem further. As a start, I would use the learning rate finder (learn.lr_find(); learn.recorder.plot()). If the cause is still not clear after that, you can post the results of lr_find() here and we can help further.
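For intuition on why an oversized learning rate makes the loss jump and then stay high rather than recover: once the step size exceeds the stable range, every gradient update overshoots the minimum and amplifies the error instead of shrinking it. A minimal sketch with a hypothetical quadratic loss (plain Python, not your segmentation model):

```python
# Toy illustration of learning-rate divergence on loss(w) = (w - 3)^2.
# For this loss each update multiplies the error (w - 3) by (1 - 2*lr),
# so lr < 1.0 converges and lr > 1.0 diverges.

def train(lr, steps=30):
    """Run gradient descent from w = 0 and return the final loss."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # d/dw of (w - 3)^2
        w -= lr * grad
    return (w - 3.0) ** 2

low = train(0.1)   # error shrinks by 0.8 per step -> loss near zero
high = train(1.1)  # error grows by 1.2 per step -> loss explodes
print(low, high)
```

The same mechanism applies to a deep network, just with a much harder-to-guess stability threshold, which is exactly what lr_find() probes empirically.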