I have been using the 2019 version of fastai to train a vision model, export it, import it with load_learner, and run predictions on the same validation set I used during training, and I am getting significantly different results. Is there an example somewhere in the fastai2 library that demonstrates how this process should be done, so that you get essentially the same results on your validation set as you did during training?
The current code I am using:
learn = load_learner(".", "my_model.pkl", test=ImageList.from_df(df, path=path))
preds, y = learn.get_preds(DatasetType.Test)
My predictions are definitely worse on the validation set than what they were during training. Am I missing something? The only thing I am doing that might be causing problems is how I am normalizing the data during training.
data = ImageDataBunch(train_dl=train_dl, valid_dl=val_dl).normalize()
I notice people usually have .normalize(imagenet_stats)
Do I need to normalize using imagenet stats as well? Do I need to do some other kind of processing after load_learner to make things match up? If there is a good example in the new fastai2 library I would be happy to switch over to that.
Your y’s here aren’t actually your real y’s; in v1 the test set is always unlabeled, which could be part of your issue. How are you measuring accuracy?
I’m following that up with this code:
import torch
from tqdm import tqdm_notebook

predictions = []
for i, (pred, gt_class) in tqdm_notebook(list(enumerate(zip(preds, y)))):
    pred_probability, predicted_class = torch.topk(pred, 1)
    predictions.append((i, predicted_class, pred_probability))

rows = []
for ds_idx, predicted_class, probability in predictions:
    img = learn.data.test_ds.items[ds_idx]
    rows.append((img, predicted_class, probability))
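Since the test-set y’s are dummies in v1, accuracy has to be computed against the real labels joined back by filename. A minimal pure-Python sketch of that idea; the filenames, labels, and rows below are made up purely for illustration:

```python
# Hypothetical data for illustration only: filename -> true class index,
# plus (filename, predicted_class, probability) rows like those built above.
true_labels = {"img_001.jpg": 0, "img_002.jpg": 1, "img_003.jpg": 1}
rows = [
    ("img_001.jpg", 0, 0.91),  # correct
    ("img_002.jpg", 0, 0.55),  # wrong
    ("img_003.jpg", 1, 0.87),  # correct
]

# Accuracy = fraction of rows whose predicted class matches the true label.
correct = sum(1 for fname, pred_class, _prob in rows
              if true_labels[fname] == pred_class)
accuracy = correct / len(rows)
print(accuracy)  # 2 of 3 correct here
```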
So I’m not using those y values. My predictions are clearly not just random guesses, but they are definitely not what I was getting while training. My best guess is that the problem has something to do with normalization or resizing the images. I would just love to see an example that demonstrates how this process should be done and clearly produces the same results with the deployed model, either in the 2019 version or the latest one. Deploying these models to “production” seems to be a lower priority than other topics.
I have seen a lot of people asking about this so it appears to be problematic in the 2019 version. I was hoping deployment was improved somewhat in the latest version. I was also hoping someone might have some insight into that. If I should be using some other method to reliably deploy my models (and get similar results) I would love to hear any advice anyone may have.
Your code snippet shows DatasetType.Test, but your question is about the validation set. You should use DatasetType.Valid (e.g. preds, y = learn.get_preds(ds_type=DatasetType.Valid)) to access the validation set, which is what is used to show the validation metrics during training.
It does a prediction on everything in the data frame. I have a column called valid so I know which images were part of the validation set while training. So this isn’t an issue.
You have to resize/normalize the same way before you submit an item to the model. The model was trained using a very specific size and normalization, so a “tree” looks very different to the model than it does raw. Try this and see how your results turn out. As a side note, the normalization is done using imagenet stats because that is how the pretrained weights were originally trained. Also, I believe there is some benefit to using those particular normalization values that Google will be able to expand on.
Awesome! How do you resize/normalize the image before you submit to the model?
Honestly, I would just use fast.ai to load the image and transform it, or use PyTorch transforms (they have normalization and resize). If you want to do it by hand, I created a quick snippet to give you an idea of what is happening.
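To give a sense of what the by-hand version involves, here is a minimal numpy sketch, assuming imagenet stats and an already-resized 224 px RGB input (the resize itself would typically be done with PIL or the library’s own transforms):

```python
import numpy as np

# Channel means/stds the imagenet-pretrained weights were trained with.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_uint8):
    """img_uint8: HxWx3 uint8 array, already resized to the training size."""
    x = img_uint8.astype(np.float32) / 255.0   # scale pixels to [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD     # channel-wise normalization
    return x.transpose(2, 0, 1)                # HWC -> CHW, as the model expects

img = np.full((224, 224, 3), 128, dtype=np.uint8)  # dummy mid-gray image
batch = preprocess(img)[None]                      # add batch dim -> (1, 3, 224, 224)
```

The key point is that these exact means/stds (and the training image size) must match what the model saw while training, otherwise predictions will degrade exactly as described above.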
Cool! Thanks! I have been looking at the latest version of fastai and it appears that this is more intuitive now. I should probably focus my efforts on learning the new stuff.