Just completed the first lesson and went through the forum to figure out how to call predict on a single image.
After reading through a few different forum posts, my takeaway is that I need to run the following code:
img = open_image('sample-image.jpg')
Do I need to “normalize” the sample-image.jpg and also resize it to 224?
Other people have run this code with resizing but without normalizing. I'm confused because the model was trained on normalized images, so I'd expect the new test image to be normalized as well.
Thanks for the help.
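For reference, the full snippet I'm trying to get working looks like this (assuming a learner exported as 'export.pkl' in the working directory; the exact calls are my guess from the docs, so correct me if I'm wrong):

```python
from fastai.vision import load_learner, open_image

# load_learner should restore the model together with the transforms it was
# trained with, so (if I understand correctly) predict handles resizing and
# normalization itself.
learn = load_learner('.', 'export.pkl')
img = open_image('sample-image.jpg')
pred_class, pred_idx, probs = learn.predict(img)
```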
I'm also interested in this. I see a lot of people, especially in Keras/TensorFlow/PyTorch projects, doing data augmentation or pre-processing before sending images to their pre-trained model in production. Does fastai do that automatically when using load_learner and predict?
I tried different approaches: resizing manually to 224 (the same size as my training set), different resize_method values, but I'm struggling to find the right approach, and the results differ a lot. I couldn't find a way to normalize (if necessary) my single image with imagenet_stats before submitting it to my model.
Also interested, tungsten’s hypothesis seems to make sense.
Not sure about normalization (curious to know too), but in my experiments I found that you do need to resize the image you use for prediction so it is the same size as your training/validation set.
TL;DR: You don't need to normalize your input because fast.ai does it for you.
When you call DataBunch.normalize(norm_params):

def normalize(self, stats:Collection[Tensor]=None, do_x:bool=True, do_y:bool=False)->None:
    "Add normalize transform using `stats` (defaults to `DataBunch.batch_stats`)"
    if getattr(self,'norm',False): raise Exception('Can not call normalize twice')
    if stats is None: self.stats = self.batch_stats()
    else: self.stats = stats
    self.norm,self.denorm = normalize_funcs(*self.stats, do_x=do_x, do_y=do_y)
    self.add_tfm(self.norm)

the last line, self.add_tfm(self.norm), adds the normalization transform. So any time you get an image from the batch, it's normalized.
If you try to use your model without fast.ai (remember that under the hood it is a standard PyTorch model), you need to normalize your input manually (and probably reshape it to "size").
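For that raw-PyTorch route, the manual step is just a per-channel subtract-and-divide with the ImageNet stats. A minimal numpy sketch (the helper name is mine, not a fastai API):

```python
import numpy as np

# Standard ImageNet per-channel statistics, reshaped to broadcast over CxHxW.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)

def imagenet_normalize(chw):
    """Normalize a CxHxW float image in [0, 1] with the ImageNet stats."""
    return (chw - IMAGENET_MEAN) / IMAGENET_STD

x = np.full((3, 224, 224), 0.5, dtype=np.float32)  # dummy mid-grey image
y = imagenet_normalize(x)
# channel 0 becomes (0.5 - 0.485) / 0.229
```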
NOTE ON TFMS ORDER: the order of transforms is determined by the 'order' property, not just their position in the original array - i.e. resize is always the last one.
What kind of resize_methods are you using? crop, pad, squish? Squish seems to give good results to me, but I’m not sure if it’s the correct way to do.
img = img.apply_tfms(tfms=get_transforms()[0], size=224, resize_method=ResizeMethod.SQUISH)  # SQUISH == 3
Since I am dealing with spectrogram images (visual representations of sound), I cannot use any transforms that could distort the time/frequency representation (like squish). So I do it the old-fashioned way: resizing images outside the model while keeping the proportions of my original image size. I also deal with rectangular images, so when I resize I try to keep the 1.36 proportion.
I think you can also resize proportionally with tfms, but I don't know how, so I just resize images outside of the model, basically preparing them for the test set. If someone knows an easy resize tfms transform that makes an image smaller while keeping proportions, I'd love to learn it too.
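Not a tfms answer, but since I'm resizing outside the model anyway: PIL's thumbnail does a proportional shrink for you (it preserves the aspect ratio and never upscales). A small sketch of what I do, assuming Pillow is installed (the function name is mine):

```python
from PIL import Image

def shrink_keep_ratio(img, max_side=224):
    """Shrink so the longest side is at most max_side, preserving proportions."""
    out = img.copy()  # thumbnail modifies the image in place, so copy first
    out.thumbnail((max_side, max_side), Image.LANCZOS)
    return out

im = Image.new('RGB', (680, 500))   # a 1.36:1 rectangle, like my spectrograms
small = shrink_keep_ratio(im)       # longest side is now 224, ratio preserved
```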
Does anyone have a code snippet to normalize a single fastai image with imagenet stats?
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
for i in range(3):
    img.data[i, :, :] -= mean[i]
    img.data[i, :, :] /= std[i]
I worked out the resize part:

img_path = 'static/sss.jpg'
img = open_image(img_path)

but I'm still looking for how to normalize one single image before calling predict on it.
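In case it helps, here's a self-contained way to prep one image outside fastai, combining the resize and the ImageNet normalization mentioned earlier in the thread (plain Pillow + numpy; the function name is mine):

```python
import numpy as np
from PIL import Image

def prep_image(path, size=224):
    """Resize, scale to [0, 1], move to CxHxW, and normalize with ImageNet stats."""
    im = Image.open(path).convert('RGB').resize((size, size))
    x = np.asarray(im, dtype=np.float32) / 255.0   # HxWxC in [0, 1]
    x = x.transpose(2, 0, 1)                       # CxHxW, as PyTorch expects
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)
    return (x - mean) / std

# x = prep_image('static/sss.jpg')  # array of shape (3, 224, 224)
```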