NaN values when using Precision in multi-class classification

Hello,

I am seeing NaN values during the early epochs of training when using `prec = Precision(average='macro')` as a metric. I am doing multi-class classification on image data with a resnet34.

In the figure above, the y-axis is the loss and the x-axis is the number of batches processed.
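For reference, the relevant part of my setup looks roughly like this (simplified; `dls` stands in for the DataLoaders I build from my image data):

```python
from fastai.vision.all import *

# Rough sketch of my setup; `dls` stands in for my actual image DataLoaders.
prec = Precision(average='macro')
learn = cnn_learner(dls, resnet34, metrics=[prec])
learn.fine_tune(5)
```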

Why does NaN still appear after the first epoch?

I am assuming I am getting a NaN because there is a division by zero somewhere. This post mentions getting a NaN value when:

> Whenever there is a class in a batch with no images then you would divide by 0 and therefore get nan.

I am confused by this because, from what I can see in the source code, the Precision class only calculates the precision at the end of the epoch. Is this correct?
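If I am reading the source right, the metric only accumulates per batch and does the actual division once at the end of the epoch, which in my head looks something like this simplified sketch (my own illustration of the idea, not the actual library code; `MacroPrecision` and `n_classes` are made-up names):

```python
import torch

# Simplified sketch of an "accumulate per batch, compute at epoch end" metric.
# This is my own illustration of the idea, not the library's implementation.
class MacroPrecision:
    def __init__(self, n_classes):
        self.cm = torch.zeros(n_classes, n_classes)  # rows = target, cols = prediction

    def accumulate(self, logits, targs):
        # Called once per batch: only the confusion matrix is updated here.
        preds = logits.argmax(dim=-1)
        for p, t in zip(preds, targs):
            self.cm[t, p] += 1

    def value(self):
        # Called once at the end of the epoch: this is where the division happens.
        tp = self.cm.diag()              # true positives per class
        predicted = self.cm.sum(dim=0)   # TP + FP per class (times each class was predicted)
        per_class = tp / predicted       # 0/0 -> nan for a class that was never predicted
        return per_class.mean()          # nan propagates into the macro average
```

If that reading is correct, the division can still be 0/0 whenever some class is never predicted anywhere in the whole validation set for that epoch, which would explain why the NaN can survive past the first epoch.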

Can someone help me work through where I am going wrong?

I have seen the per-class precision values sometimes come out as NaN. There are a couple of situations where this can occur (see the sketch right after this list):

  1. Every sample of the class ends up as a false negative (the class is never predicted, so the TP + FP denominator is 0)
  2. There were no samples of the class at all
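Here is a tiny PyTorch illustration of those two cases with made-up numbers (three classes, class 2 being the problematic one):

```python
import torch

# Case 1: class 2 appears in the targets but is never predicted, so every one of
# its samples is a false negative and TP + FP = 0 for that class.
preds = torch.tensor([0, 0, 1, 1])
targs = torch.tensor([0, 2, 1, 2])

# Case 2 (swap in these targets): class 2 has no samples at all and is never
# predicted, so TP = FP = 0 again.
# targs = torch.tensor([0, 1, 1, 0])

n_classes = 3
per_class = torch.empty(n_classes)
for c in range(n_classes):
    pred_c = preds == c
    tp = (pred_c & (targs == c)).sum().float()
    per_class[c] = tp / pred_c.sum()       # class 2: 0 / 0 -> nan

print(per_class)         # tensor([0.5000, 0.5000, nan])
print(per_class.mean())  # tensor(nan) -- the macro average becomes nan too
```

Is this 0/0 what is producing the NaNs I am seeing, or is something else going on?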