Prediction value varies by call (predict_array)

Hello,

I tried to predict some classes with the resnext50 model using the following call sequence:

import numpy as np
import torch

arr = np.stack(imagearrays)             # stack the HWC image arrays into one batch
arr = torch.FloatTensor(arr)
arr = arr.transpose(1, 3).contiguous()  # channels first (note: this also swaps H and W)
for ii in range(10):
    res = learn.predict_array(arr=arr)
    print("R:", res)

I get the following output, which changes slightly with each predict call:
R: [[-0.36225 -1.19109]]
R: [[-0.36265 -1.19016]]
R: [[-0.37965 -1.15233]]
R: [[-0.36384 -1.18745]]
R: [[-0.36067 -1.1947 ]]

I wonder if something is happening behind the scenes. I tried setting the random seed, reloading the model before each call, and calling learn.model(…) directly, but none of it helped.

Does anyone have an idea, or has anyone experienced the same?

Edit: once I add the call “learn.model.eval()”, the output stays constant. But the prediction values are then way off, always strongly predicting a single class.

Thanks,

Christian

As you’ve found, you should call eval() before making predictions. This disables dropout and stops the batch norm layers from learning: instead of per-batch statistics, they use their stored running mean and variance.
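For example, a minimal sketch in plain PyTorch (here `model` stands in for `learn.model` and `arr` for your input tensor):

import torch

model.eval()                 # dropout off, batch norm uses stored running stats
with torch.no_grad():        # no gradients needed for inference
    preds = model(arr)       # repeated calls now give identical outputs
model.train()                # switch back only if you want to continue training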

If your prediction values are way off, then double check to make sure your inputs are correctly normalized and preprocessed. (They need to be preprocessed in the exact same way as the images were when the model was trained.)
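As a sketch of what that means in practice (assuming a model pretrained with the usual ImageNet statistics; substitute whatever your training pipeline actually used):

import numpy as np

mean = np.array([0.485, 0.456, 0.406])   # ImageNet per-channel mean (example values)
std = np.array([0.229, 0.224, 0.225])    # ImageNet per-channel std (example values)

arr = np.stack(imagearrays).astype(np.float32) / 255.0   # HWC uint8 -> float in [0, 1]
arr = (arr - mean) / std                                 # normalize each channel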

Yes, thanks for the hint, but in my case the input was fine. I have since found another piece of advice (https://discuss.pytorch.org/t/model-eval-gives-incorrect-loss-for-model-with-batchnorm-layers/7561/3). It seems the model needs to process some input in forward passes only (without training the weights) so that the batch norm layers can adjust their running mean and standard deviation values; only after that should eval mode be enabled. When I presented a few images as input several times and then enabled eval mode, I got the expected constant results. I don’t know yet whether everything will be fine if I present all the training data before saving, and then later use the model directly in eval mode for prediction.
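In code, the warm-up described in that post looks roughly like this (a sketch; `warmup_loader` is a hypothetical DataLoader yielding representative image batches):

import torch

model.train()                   # BN layers update their running stats in train mode
with torch.no_grad():           # forward passes only; the weights are not changed
    for xb, _ in warmup_loader:
        model(xb)               # each pass nudges the BN running mean/variance
model.eval()                    # now inference uses the settled running stats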

The batch norm layers’ moving averages are updated during training. In the beginning, when you have just started training the model, these averages may be very unstable, so running the model in evaluation mode may give strange results. However, if that is the case, you should see the same strange results every time.
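Concretely, PyTorch maintains these as exponential moving averages, updated on every forward pass in training mode (momentum defaults to 0.1):

# per batch norm layer, per training batch:
running_mean = (1 - momentum) * running_mean + momentum * batch_mean
running_var = (1 - momentum) * running_var + momentum * batch_var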

As training proceeds, the averages should become more stable (as your model is learning more and therefore doesn’t see so many “surprises” anymore). If this does not happen, then your model may have problems learning (learning rate too high, for example).

It’s also possible your batch size is too small, so that the batch norm averages are updated from only a single image or just a few images, making them more unpredictable (especially in the beginning).
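If a bigger batch is not an option, one workaround (not from this thread, just a common mitigation) is to lower the batch norm momentum so each small batch moves the running statistics less:

import torch.nn as nn

for m in model.modules():
    if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
        m.momentum = 0.01    # default is 0.1; smaller values smooth the updates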