How to get an empty ConvLearner for single image prediction?

Update: posting the fastai-compliant way to predict the class of a custom image.

Ok, finally, I think here is a more or less “canonical” approach to generate a prediction using fastai standard classes and methods:

img = open_image(filename)
losses = img.predict(learn)
learn.data.classes[losses.argmax()]

Original Post

I’m training a model on some dataset like this:

data = ImageDataBunch.from_name_func(..., size=224)  # dataset creation goes here
data.normalize(imagenet_stats)
learner = ConvLearner(data, models.resnet34, metrics=[error_rate])
learner.fit_one_cycle(1)

Now I would like to run my model on a custom image. For example, let’s pretend that I have an image in my local file system. I read this image and convert it into a tensor of the appropriate shape and type:

img_path = 'path/to/the/image.png'
pil_image = PIL.Image.open(img_path).convert('RGB').resize((224, 224))
x = torch.tensor(np.asarray(pil_image), dtype=torch.float)
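h, w, c = x.size()  # np.asarray on a PIL image gives (height, width, channels)
x = x.permute(2, 0, 1).to(default_device)  # move channels first; view() would only relabel the shape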
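
A side note on the layout conversion: np.asarray on a PIL image yields a (height, width, channels) array, while the model expects (channels, height, width). Changing only the shape keeps the flat element order, so values from different channels end up mixed together; moving the axis (permute(2, 0, 1) in PyTorch) is what actually groups each channel. Below is a minimal pure-Python sketch on a 2x2 image. The helpers hwc_to_chw and reinterpret_as_chw are hypothetical names used only for illustration, not fastai or PyTorch functions.

```python
# A tiny 2x2 RGB "image" in (height, width, channels) layout.
# Each innermost list is one pixel: [red, green, blue].
hwc = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]


def hwc_to_chw(img):
    """Move the channel axis first (the effect of permute(2, 0, 1))."""
    h, w, c = len(img), len(img[0]), len(img[0][0])
    return [[[img[y][x][ch] for x in range(w)] for y in range(h)]
            for ch in range(c)]


def reinterpret_as_chw(img):
    """Keep the flat element order and only change the shape
    (the effect of view(c, h, w) on an HWC tensor)."""
    h, w, c = len(img), len(img[0]), len(img[0][0])
    flat = [v for row in img for pixel in row for v in pixel]
    return [[flat[(ch * h + y) * w:(ch * h + y) * w + w] for y in range(h)]
            for ch in range(c)]


red_correct = hwc_to_chw(hwc)[0]            # only red values: 1, 4, 7, 10
red_scrambled = reinterpret_as_chw(hwc)[0]  # red, green and blue mixed together
print(red_correct)
print(red_scrambled)
```

The first "channel" of the reshaped version contains red, green, and blue values of neighbouring pixels, so the model would see a scrambled image even though the shape looks right.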

Finally, I feed the image into the model:

preds = learner.model(x[None])

However, the last step gives me an error:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   1364         size = list(input.size())
   1365         if reduce(mul, size[2:], size[0]) == 1:
-> 1366             raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
   1367     return torch.batch_norm(
   1368         input, weight, bias, running_mean, running_var,

ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

Could anyone advise, what is the correct way to prepare new data before feeding into the model? I mean, I would like to do something similar to model.predict(X) from scikit-learn, or keras.

I guess I need to apply normalization as well, but I think the source of the error is probably something else.

I guess I’ve fixed it =)

learner.model.eval()
learner.model(x[None].to('cuda'))

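One should first switch the model to evaluation mode, because in the testing phase the BN layers should use their stored running statistics rather than statistics computed from the current batch, which cannot be done for a batch of one.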
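
To make the difference between the two modes concrete, here is a minimal pure-Python sketch of batch normalization for a single channel. It is a simplified stand-in, not PyTorch's implementation: the function name batch_norm_1ch and the example statistics are assumptions for illustration, and the learnable scale and shift are omitted.

```python
import math


def batch_norm_1ch(values, running_mean, running_var, training, eps=1e-5):
    """Normalize one channel's values across a batch.

    training=True  -> use the mean/variance of the batch itself.
    training=False -> use the running statistics stored during training.
    """
    if training:
        if len(values) < 2:
            # A single value has no meaningful batch variance.
            raise ValueError(
                "Expected more than 1 value per channel when training")
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        mean, var = running_mean, running_var
    return [(v - mean) / math.sqrt(var + eps) for v in values]


single_sample = [0.7]  # one image -> one value per channel

try:
    batch_norm_1ch(single_sample, 0.5, 0.04, training=True)
    train_mode_ok = True
except ValueError:
    train_mode_ok = False

# In evaluation mode the stored statistics are used, so a batch of one works.
eval_output = batch_norm_1ch(single_sample, 0.5, 0.04, training=False)
print(train_mode_ok, eval_output)
```

Calling learner.model.eval() switches the real BN layers to the second behaviour, which is why the error disappears.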

Hi,
I am working on the MNIST dataset using
path = untar_data(URLs.MNIST_SAMPLE);

I want to run my model against a custom image. How do I upload the image?
I am using Salamander - Cloud GPU.

Any help would be appreciated.
Thanks

You can use the scp utility for this. Or, if you prefer a graphical interface, there are options such as FileZilla or Transmit, depending on your OS.

Don’t you think the image should be normalized using the ImageNet statistics before getting a prediction?

Yes, you’re right, I forgot to update my solution. Here is the final snippet I am using:

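from dataclasses import dataclass

import torch
import torch.nn.functional as F
from torch import FloatTensor
from fastai.vision import *  # assumed to provide ConvLearner, normalize, open_image, as in the course notebooks


@dataclass
class ConvPredictor:

    learner: ConvLearner
    mean: FloatTensor
    std: FloatTensor

    def __post_init__(self):
        device = self.learner.data.device
        # Keep the normalization statistics on the same device as the model.
        self.mean, self.std = [torch.tensor(x).to(device) for x in (self.mean, self.std)]
        # Evaluation mode, so BN layers accept a batch of one image.
        self.learner.model.eval()

    def predict(self, x):
        out = self.predict_logits(x)
        best_index = F.softmax(out, dim=1).argmax()  # softmax over the class dimension
        return self.learner.data.classes[best_index]

    def predict_logits(self, x):
        x = x.to(self.learner.data.device)
        x = normalize(x, self.mean, self.std)
        out = self.learner.model(x[None])  # add a batch dimension
        return out

    def predict_from_file(self, filename):
        data = open_image(filename).data
        return self.predict(data)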

And then:

img = open_image(fnames[6])
predictor = ConvPredictor(learn, *imagenet_stats)
predictor.predict(img.data)

Here is a link to my version of pets notebook, if interested.

Great! One final thought. I think the image should be normalized not with statistics computed from the data, but with the official ImageNet statistics, as in cell 11: data.normalize(imagenet_stats)

Correct, I am using *imagenet_stats from the lecture to normalize the inputs before prediction. I just decided to keep them as mean and std attributes in case different statistics were used to normalize the data.
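
For reference, the normalization being discussed is just a per-channel (x - mean) / std on a channels-first image with values in [0, 1]. Here is a minimal pure-Python sketch. The mean and std values are the standard ImageNet statistics that fastai exposes as imagenet_stats; the function name normalize_chw is a hypothetical helper for illustration and is not the fastai normalize function.

```python
# Standard ImageNet per-channel statistics (RGB order).
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_chw(img, mean, std):
    """Apply (value - mean) / std to every pixel, channel by channel.

    img is a nested list in (channels, height, width) layout with
    values already scaled to the [0, 1] range.
    """
    return [
        [[(v - m) / s for v in row] for row in channel]
        for channel, m, s in zip(img, mean, std)
    ]


# A 1x1 image whose pixel equals the ImageNet mean maps to zero.
mean_pixel = [[[0.485]], [[0.456]], [[0.406]]]
print(normalize_chw(mean_pixel, IMAGENET_MEAN, IMAGENET_STD))

# A pure white 1x1 image maps to positive values in every channel.
white_pixel = [[[1.0]], [[1.0]], [[1.0]]]
white_norm = normalize_chw(white_pixel, IMAGENET_MEAN, IMAGENET_STD)
print(white_norm)
```

Whatever statistics were used to normalize the training data must be applied again at prediction time, which is why keeping mean and std as attributes is convenient.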

I got correct predictions; however, I think we need to resize the image (I got a CUDA out-of-memory error)…

I suggest this implementation. It switches the model in learn mode and resizes the image parametrically

Great code! Maybe you can get the transforms from learn.data.valid_ds.tfms and apply them to the images you are trying to predict.

It might be nice to try to update the code in your top post to be more idiomatic fastai code. Have a look at the examples in the applications->vision section of the docs for ideas. Let me know if you need any help with this.

I think the transformations are applied at training time for augmentation. I would not re-apply them at evaluation time.

Not necessarily. If you check valid_ds.tfms, it is different from train_ds.tfms… Some transformations are still required to handle your data at evaluation time, e.g. resizing… Here you are doing the resize when you load your image.

My fault! I will look at it! :slight_smile:

Sure, not a problem, I’ll update my transformations with a more standard approach. Will let you know if I fail to bring everything into compliance with the library =)

@gianferrarif @evan.xiong Guys, I am going to update the post message soon to incorporate your remarks for future reference.

Also, I think I will focus on deserializing the model and the transformations, in order not to depend on a learner, and on creating a totally standalone package for evaluation (potentially on CPU only), to be wrapped in a Flask API. @devforfu @evan.xiong if you are cool with that, it can be our project for the MOOC, maintained for NLP and tabular use cases as well. @jeremy please share your thoughts if you have any.

Note that there is an Image.predict method that takes a learner object now:

img.predict(learn)

It applies the valid transforms and normalization to the image, then returns the logits (soon the probabilities but we haven’t finished with that yet).

Also note in your code @gianferrarif that you don’t need to pass the logits to softmax before taking argmax (the maximum before and after softmax are the same).
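
The point about softmax can be checked with a few lines of plain Python: softmax is strictly increasing in each logit, so the position of the largest value does not change. This is a minimal sketch with made-up logits; softmax and argmax here are small local helpers, not the PyTorch functions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


logits = [1.2, -0.3, 3.4, 0.5]  # made-up model outputs for 4 classes
probs = softmax(logits)

# Same winning class index with or without softmax.
print(argmax(logits), argmax(probs))
```

So softmax is only needed when the actual probabilities are of interest, not for picking the predicted class.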

Just seen this method in the repo a few minutes ago :smiley:
Will update the post accordingly.

BTW using the open_image function is the best way to get an image with fastai.
