Train_loss and valid_loss look very good, but predictions are really bad

Update 3

I suspect the mistake might be in my detector code.
Here is the code I use to predict on images with the trained learner/model.

import requests
import cv2
import numpy as np

bytes = b''
stream = requests.get(url, stream=True)
bytes = bytes + stream.raw.read(1024)  # read the next chunk of the stream (the chunk size is an assumption)
a = bytes.find(b'\xff\xd8')
b = bytes.find(b'\xff\xd9')
if a != -1 and b != -1:
    # a and b are the offsets of the JPEG start and end markers
    jpg = bytes[a:b+2]
    bytes = bytes[b+2:]
    # imdecode returns the pixels in B, G, R channel order
    img = cv2.imdecode(np.frombuffer(jpg, dtype=np.uint8), cv2.IMREAD_COLOR)
    processedImg = Image(pil2tensor(img, np.float32).div_(255))
    predict = learn.predict(processedImg)
    self.objectClass = predict[0].obj

I also read that the imdecode() method returns the image in B, G, R channel order.

Could the problem be that the channel order differs between training and detection?
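For what it's worth, cv2.imdecode does return channels in B, G, R order, while fastai v1 (as far as I know) opens training images through PIL in RGB order, so the snippet above would hand the model frames with red and blue swapped. A minimal sketch of the swap follows. It uses a tiny hand-made array in place of a decoded frame, and the helper name bgr_to_rgb is made up for illustration.

```python
import numpy as np

def bgr_to_rgb(img_bgr):
    """Reverse the last (channel) axis of an H x W x 3 array."""
    # ascontiguousarray makes a real copy instead of a negative-stride view,
    # which some tensor conversions do not accept.
    return np.ascontiguousarray(img_bgr[..., ::-1])

# A 1 x 2 "image" in BGR order: one pure-blue pixel, one pure-red pixel.
frame_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
frame_rgb = bgr_to_rgb(frame_bgr)

print(frame_rgb.tolist())  # blue is now last in the first pixel, red is first in the second
```

With OpenCV at hand, cv2.cvtColor(img, cv2.COLOR_BGR2RGB) does the same job. Either version would go right after imdecode and before pil2tensor.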

Update 2

Why is there a turnaround at the end? I have highlighted it with the red circle.

Update 1

After some research, I found that someone suggested turning off shuffling.
But that was for Keras + TensorFlow.
Do I need to set shuffle=False in fastai?

Hi there,

I’m new to fastai, ML and Python. I trained my “Birds Or Not-Birds” model, and the train_loss, valid_loss and error_rate all improved. When I trained for only 3 epochs, the model worked (meaning it could tell whether or not there were birds in an image). When I increased training to 30 epochs, all the metrics looked very good, but the model no longer recognizes anything: whatever image I give it, it always returns Not-Birds.

here is the training output:

here are the plots of learn.recorder

Here is my code:

from fastai.vision import *
from fastai.metrics import error_rate
from fastai.callbacks import EarlyStoppingCallback,SaveModelCallback
from datetime import datetime as dt
from functools import partial

path_img = '/minidata'
train_folder = 'train'
valid_folder = 'validation'

tunedTransform = partial(get_transforms, max_zoom=1.5)

data = ImageDataBunch.from_folder(path=path_img, train=train_folder, valid=valid_folder, ds_tfms=tunedTransform(),
                                  size=(299, 450), bs=40, classes=['birds', 'others'])
data = data.normalize(imagenet_stats)

learn = cnn_learner(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(30, max_lr=slice(5e-5,5e-4))


Is it overfitting, even though the graphs look very good?

Could someone point out where I went wrong?

What exactly do you mean?

The model I trained is for recognizing birds or not-birds.
When I trained it for 30 epochs, the model got very good metrics, e.g. train_loss and valid_loss, but it does not recognize birds any more: whatever I present to the model, it always returns Not-Birds as the prediction.

If the validation error rate is small, it means the trained model is predicting well on that set. So I suspect that the error is in the way you are presenting an image to the model.

What I suggest is: run get_preds on the validation set. Pick one specific image that is predicted correctly by the model as birds. Then try predicting directly on that image. If the result is not-birds, you’ll have some clues about where to look for the bug.

Step 1

“run get preds on the validation set.”

My understanding of this sentence is: use the trained model to call
model.predict(all images in validation set).
Am I correct?

Step 2

Pick one specific image that is predicted correctly by the model as birds.

For example, say there are 100 images in my validation set, and the 3rd image, called b3.jpg, is predicted correctly as birds.

Step 3

Then try predicting directly on that image.

then you want me to do this:


If the result is not-birds, you’ll have some clues about where to look for the bug.

I do not understand what you mean. Let me restate what you asked me to do.

You want me to predict on my validation set, which includes b3.jpg, as in Step 1,
and then you want me to predict on the single image b3.jpg again?

I cannot see how this helps.

Could you please elaborate on your thinking?

Thank you.

You verify that function1(data1) gives the correct answer. Now you know that the model works correctly for data1.

Next, you apply function2(data1). You expect function2 to do exactly the same as function1, but instead it gives the wrong answer. Now you have a specific example where function2 fails. So you unpack function2(data1), studying the docs and tracing the code, to find out why it gives the wrong result.
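In code, the idea is to push the same inputs through both paths and stop at the first one where they disagree. The sketch below uses toy stand-ins, not fastai calls: toy_model, validation_path and camera_path are hypothetical names, and camera_path deliberately contains a preprocessing bug to show what a mismatch looks like.

```python
def first_disagreement(items, path_a, path_b):
    """Return (item, a_result, b_result) for the first item on which the
    two prediction paths differ, or None if they always agree."""
    for item in items:
        a, b = path_a(item), path_b(item)
        if a != b:
            return item, a, b
    return None

# Toy stand-in for a trained model: a "pixel" is an (r, g, b) tuple and
# it is labelled "birds" when red is stronger than blue.
def toy_model(rgb):
    r, g, b = rgb
    return "birds" if r > b else "others"

def validation_path(pixel_rgb):
    # function1: the path that is known to give the right answer.
    return toy_model(pixel_rgb)

def camera_path(pixel_rgb):
    # function2: same model, but the channels arrive in reverse order.
    return toy_model(pixel_rgb[::-1])

samples = [(10, 10, 10), (200, 50, 20), (30, 40, 90)]
mismatch = first_disagreement(samples, validation_path, camera_path)
print(mismatch)
```

Once a concrete failing input is in hand, the two paths can be compared step by step (decoded array, tensor values, final prediction) until the first difference shows up.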

P.S. See the docs for Learner.get_preds().
Learner.predict does not take a filename. Refer to the examples in the fastai docs for how to use it.

Hi @Pomo, thanks for your update.

I was using pseudocode, so don’t take the code in my reply as real code. I used it to show how I understood your reply.

OK, now to respond to your latest reply.
In my case, do you mean that function1() and function2() do the same thing, which is predicting?
When predicting images, what I have done is:

  1. load my saved trained model
    learn = load_learner(modelPath, exportedModel1)

  2. do the prediction
    prediction = learn.predict(processedImg)

But training the model uses a totally different function, fit_one_cycle().

learn.predict() and fit_one_cycle() serve very different purposes, so I cannot compare the two to find out where I went wrong.

As you can see in my post, I have pasted in all my training code. I believe that something is wrong in this code.

fit_one_cycle() does do a prediction on the validation set after each epoch. That is how it can print the validation error rate.

I do not see anything wrong with your training code offhand. In fact it looks from the output like it is training perfectly well. Therefore I suspect that the cause of your prediction error lies elsewhere. Immediately after training you should be able to predict on any single image from the validation set and get exactly the same result as at the end of the last epoch of training.

In these cases, there is no substitute for understanding and checking what each specific element of the problem should be, and seeing where it deviates from expectations. In other words, debugging the details.

I am sorry that I cannot help you further.

Thanks @Pomo,
I can see the training went very well, but in real-world testing, e.g. when I held a bird image in front of my camera, the trained model could not detect the bird.

Do you reckon I should not use images on my mobile screen to test it?
But it’s hard to test in my backyard, as I would have to wait for a bird to land there to see whether it works or not.

Hi @franva,

The error rate on your validation data has been going down consistently. That implies that the model has been recognizing images in the validation set.

First, I would suggest going through the images in the validation set that the model gets wrong. If nothing obviously wrong jumps out, then I suggest going through the data pipeline. The model has been trained and validated on certain types of images (with a certain size, for example).
Try to make sure that the image size of the real-world examples is not wildly different from that of the training and validation sets.

For example, if you’ve trained and validated your model on images of size 200x200 and then ask the model to predict on an image of size 2000x2000, I think the pipeline will resize the image, and the resized image may be difficult for the model to predict.

Hi @chatuur,

Thanks for your help.

Why is there a turn around at the end?? I have highlighted it by the red circle. You can see it in the Update 2 in the post.
What does it indicate?
I feel it is a problem, how can we fix it?

As for “trained again one size, but used in predicting in a different size”, I think it’s not a problem. Please correct me if I am wrong. I think normalization will take care of the different-size issue.

data = data.normalize(imagenet_stats)

Won’t it?

  1. The turn around at the end:
    I can’t pinpoint the reason behind this. What I can suggest is to try running lr_find again once the loss starts being erratic.

  2. “trained again one size, but used in predicting in a different size"
    As far as my understanding goes, the answer is no. I am speaking without looking at the source code, so pardon me if my understanding of the normalize function is not accurate. However, let me elaborate on the scenario. The input size of the model is fixed. Say my model is trained on images of size 200x200. If I test it on an image of size 2000x2000, the image in question gets shrunk to fit. In the reverse situation, an input image of size 20x20 gets blown up to 200x200.

Now, the model has never seen such images, right?
For example, when the smaller image is blown up, details are lost and the image becomes much more pixelated. So the model might not know how to classify such images.

I hope I’ve managed to convey my point.
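A side note on normalize: as far as I know, in fastai v1 it only shifts and scales pixel values per channel using the given mean and standard deviation, while resizing is driven by the size argument passed when the data is built. The numpy sketch below is a stand-in for that arithmetic, not fastai's own implementation (the helper name normalize_chw is made up), and it shows that the image shape is left untouched.

```python
import numpy as np

# Per-channel RGB mean and std; these are the values fastai ships as imagenet_stats.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_chw(img):
    """Normalize a float image laid out as channels x height x width."""
    # Broadcasting applies one mean/std pair to every pixel of each channel.
    return (img - MEAN[:, None, None]) / STD[:, None, None]

small = np.full((3, 20, 20), 0.5, dtype=np.float32)
large = np.full((3, 200, 300), 0.5, dtype=np.float32)

print(normalize_chw(small).shape)
print(normalize_chw(large).shape)
```

Both outputs keep the shape of their input, so a size mismatch between training and prediction would have to be handled elsewhere in the pipeline.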

Yes, I get your point about “trained in one size but used in different size”.

But if this is the cause, then there is no way I can fix it.

Why? The bird images I collected are about 150 x 100, and from my training code you can see that I resized them to 299 * 450.
When using the trained model, the camera input images have a much higher resolution: 2048 x 1080.

I can never find millions of bird images at such a high resolution, so that would make detection impossible. However, in many other projects, e.g. MNIST, the training images are very small, yet the inputs at recognition time can be much bigger.

I think what I missed is that I didn’t resize the images before feeding them into the trained model when detecting birds.

Do you know how to do it?

Also, please see my Update 3. I found another possible cause.
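On the resizing question: one thing worth checking is the aspect ratio, since a 2048 x 1080 camera frame is noticeably wider than the training shape. A rough sketch of computing a centered crop that matches the training aspect ratio before resizing is below. It assumes size=(299, 450) means height 299 and width 450, and the helper name center_crop_box is made up for illustration.

```python
def center_crop_box(src_w, src_h, target_w, target_h):
    """Return (left, top, right, bottom) of the largest centered crop of the
    source that has the target aspect ratio."""
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = src_h * target_w // target_h
    else:
        # Source is taller (or the same shape): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = src_w * target_h // target_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h

# Camera frame of 2048 x 1080, training shape of 450 wide by 299 high.
box = center_crop_box(2048, 1080, 450, 299)
print(box)
```

The box can be used to slice a numpy frame as frame[top:bottom, left:right], and the crop can then be resized to the training shape, for example with cv2.resize(cropped, (450, 299)), which takes (width, height).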