Loss is always NaN for validation set

Benjamin25 · June 3, 2020, 9:35am

Hi all,

I’m trying to implement object detection with a YOLOv3 architecture (Darknet-53 backbone, etc.) using the fastai library. I’m using the COCO tiny dataset with the default 0.2 random split (split by rand pct) for my validation set. However, I keep getting NaNs for the validation loss even though the training loss looks fine. In the output, the top part shows the last batches of my training set, and the bottom part shows the validation set batches.

Looking deeper, I printed the loss for each mini-batch and realised that the predictions passed to my custom loss function during evaluation are all NaNs! This means they have already become NaNs somewhere in the model’s forward pass. I’ve also tried switching between fastai v1 and fastai v2, and both have the same issue, so I think it’s down to my code or a setting I’ve got wrong.
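To narrow down where in the forward pass the values go bad, one option is to check each layer's output in order and stop at the first one that contains NaN or inf. The sketch below shows the idea in plain Python, with lists of floats standing in for tensors. The helper `first_non_finite_layer` and the toy layers are hypothetical names for illustration, not part of fastai or my notebook. With a real PyTorch model the same check would more likely be attached through forward hooks (`register_forward_hook`) on each module.

```python
import math


def first_non_finite_layer(layers, inputs):
    """Apply each (name, function) pair in order to every input value.

    Returns the name of the first layer whose output contains NaN or inf,
    or None if every intermediate output is finite.
    """
    values = list(inputs)
    for name, layer in layers:
        values = [layer(v) for v in values]
        if any(not math.isfinite(v) for v in values):
            return name
    return None


# Hypothetical toy "model": the log layer yields NaN for non-positive inputs.
toy_layers = [
    ("center", lambda x: x - 1.0),
    ("log", lambda x: math.log(x) if x > 0 else float("nan")),
    ("scale", lambda x: 2.0 * x),
]

clean_result = first_non_finite_layer(toy_layers, [2.0, 3.0])
bad_result = first_non_finite_layer(toy_layers, [2.0, 1.0])
print(clean_result, bad_result)
```

Here the second batch contains a value that becomes 0 after centering, so the log layer is reported as the first place a NaN appears. Running this kind of check on one validation batch should at least show whether the NaNs come from the inputs, an early layer, or the detection head.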

Any ideas on what I could be doing wrong here? I’ve added the link to the notebook here as well. Thanks!!

Benjamin25 · June 6, 2020, 8:58am

I replaced the COCO tiny dataset with the COCO sample dataset and it ran OK, so I’ll use that instead.