How to get an empty ConvLearner for single image prediction?

devforfu · October 23, 2018, 7:59am

Update: posting the fastai-compliant way to predict the class of a custom image.

Ok, finally, I think here is a more or less “canonical” approach to generate a prediction using fastai standard classes and methods:

img = open_image(filename)
losses = img.predict(learn)
learn.data.classes[losses.argmax()]

Original Post

I’m training a model on some dataset like this:

data = ImageDataBunch.from_name_func(..., size=224)  # dataset creation goes here
data.normalize(imagenet_stats)
learner = ConvLearner(data, models.resnet34, metrics=[error_rate])
learner.fit_one_cycle(1)

Now I would like to run my model on a custom image. For example, suppose I have an image on my local file system. I read this image and convert it into a tensor of the appropriate shape and type:

img_path = 'path/to/the/image.png'
pil_image = PIL.Image.open(img_path).convert('RGB').resize((224, 224))
x = torch.tensor(np.asarray(pil_image), dtype=torch.float)
x = x.permute(2, 0, 1).to(default_device)  # (h, w, c) -> (c, h, w); view() would not move the channel axis
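
A note on the layout conversion in this step: np.asarray on a PIL image yields an array in (height, width, channels) order, while the model expects channels first. Moving the channel axis needs a real axis reorder (permute); a plain view only reinterprets the same flat memory. Below is a minimal sketch with a made-up 2x3 "image" standing in for the real one (the tiny tensor and the variable names are illustrative assumptions, not part of the original post):

```python
import torch

# Stand-in for np.asarray(pil_image): a tiny image in (height, width, channels) layout.
h, w, c = 2, 3, 3
hwc = torch.arange(h * w * c, dtype=torch.float).reshape(h, w, c)

# view() only reinterprets the flat memory; it does not move the channel axis.
viewed = hwc.view(c, h, w)

# permute() actually reorders the axes to (channels, height, width).
chw = hwc.permute(2, 0, 1).contiguous()

first_channel = hwc[:, :, 0]  # channel 0 of every pixel
print(torch.equal(chw[0], first_channel))     # True
print(torch.equal(viewed[0], first_channel))  # False
```

With view, the first "channel" ends up holding the first few pixels with their channel values interleaved, so the model would see a scrambled image even though the shape looks right.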

Finally, I feed the image into the model:

preds = learner.model(x[None])

However, the last step gives me an error:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   1364         size = list(input.size())
   1365         if reduce(mul, size[2:], size[0]) == 1:
-> 1366             raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
   1367     return torch.batch_norm(
   1368         input, weight, bias, running_mean, running_var,

ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

Could anyone advise on the correct way to prepare new data before feeding it into the model? I would like to do something similar to model.predict(X) from scikit-learn or Keras.

I guess I need to apply normalization as well, but I think the source of the error is probably something else.

devforfu · October 23, 2018, 8:10am

I guess I’ve fixed it =)

learner.model.eval()
learner.model(x[None].to('cuda'))

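The model has to be switched to evaluation mode first: in training mode the BN layers compute statistics from the current batch, which fails for a single sample, while in evaluation mode they use their stored running statistics.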
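
For reference, the error and the fix can be reproduced without the full learner. This is only a sketch, assuming a single freshly initialised nn.BatchNorm1d layer as a stand-in for the BN layers in the model's head; the layer size and the random input are arbitrary:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)        # 4 features, freshly initialised
single = torch.randn(1, 4)    # a "batch" holding one sample

# Training mode (the default): statistics come from the current batch,
# which is not possible with one value per channel.
bn.train()
try:
    bn(single)
    train_error = None
except ValueError as err:
    train_error = str(err)

# Evaluation mode: the stored running mean/variance are used instead.
bn.eval()
with torch.no_grad():
    out = bn(single)

print(train_error)
print(out.shape)
```

Calling learner.model.eval() does the same switch for every BN (and dropout) layer in the model at once.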

Bhuvana_ka · October 25, 2018, 6:35am

Hi,
I am working on the MNIST dataset using
path = untar_data(URLs.MNIST_SAMPLE);

I want to run my model against a custom image. How do I upload the image?
I am using Salamander - Cloud GPU.

Any help would be appreciated.
Thanks

devforfu · October 25, 2018, 7:40am

You can use the scp utility for this. Or, if you prefer a graphical interface, there are options such as FileZilla or Transmit, depending on your OS.

gianferrarif · October 25, 2018, 8:52am

Don’t you think the image should be normalized using the ImageNet statistics before getting a prediction?

devforfu · October 25, 2018, 9:21am

Yes, you’re right, I forgot to update my solution. Here is the final snippet I am using:

@dataclass
class ConvPredictor:
    
    learner: ConvLearner
    mean: FloatTensor
    std: FloatTensor
        
    def __post_init__(self):
        device = self.learner.data.device
        self.mean, self.std = [torch.tensor(x).to(device) for x in (self.mean, self.std)]
        
    def predict(self, x):
        out = self.predict_logits(x)
        best_index = F.softmax(out, dim=1).argmax()  # dim=1: softmax over the class axis
        return self.learner.data.classes[best_index]
    
    def predict_logits(self, x):
        x = x.to(self.learner.data.device)
        x = normalize(x, self.mean, self.std)
        out = self.learner.model(x[None])
        return out
    
    def predict_from_file(self, filename):
        x = open_image(filename).data  # image tensor; the original assigned it to an unused name
        return self.predict(x)

And then:

img = open_image(fnames[6])
predictor = ConvPredictor(learn, *imagenet_stats)
predictor.predict(img.data)

Here is a link to my version of pets notebook, if interested.

gianferrarif · October 25, 2018, 9:31am

Great! One final thought. I think the image should be normalized not with statistics computed from the data, but with the official ImageNet statistics, as in cell 11: data.normalize(imagenet_stats)

devforfu · October 25, 2018, 10:35am

Correct, I am using *imagenet_stats from the lecture to normalize the inputs before prediction. I just decided to keep them as mean and std attributes in case different statistics were used to normalize the data.
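
To make the normalization step concrete, here is a minimal sketch of per-channel normalization with the ImageNet statistics. The helper name normalize_image and the gray test image are made up for illustration; the sketch assumes a (channels, height, width) tensor already scaled to [0, 1], which appears to be what open_image(...).data provides, so a tensor built directly from PIL pixel values (0 to 255) would need dividing by 255 first:

```python
import torch

# ImageNet channel statistics (RGB), the values behind imagenet_stats.
imagenet_mean = torch.tensor([0.485, 0.456, 0.406])
imagenet_std = torch.tensor([0.229, 0.224, 0.225])

def normalize_image(x, mean, std):
    """Normalize a (channels, height, width) tensor with values in [0, 1]."""
    # Broadcast the per-channel statistics over height and width.
    return (x - mean[:, None, None]) / std[:, None, None]

# Hypothetical input: a mid-gray 224x224 image already scaled to [0, 1].
x = torch.full((3, 224, 224), 0.5)
x_norm = normalize_image(x, imagenet_mean, imagenet_std)
print(x_norm[:, 0, 0])
```

Keeping mean and std as parameters, as in the ConvPredictor class, means the same code works if the training data was normalized with other statistics.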

gianferrarif · October 25, 2018, 2:49pm

I got correct predictions; however, I think we need to resize the image (I got a CUDA out-of-memory error)…

gianferrarif · October 25, 2018, 4:56pm

I suggest this implementation. It switches the model into learn mode and resizes the image parametrically.

evan.xiong · October 26, 2018, 3:38am

Great code! Maybe you can get the transforms from learn.data.valid_ds.tfms and apply them to the images you are trying to predict.

jeremy · October 26, 2018, 4:07am

It might be nice to try to update the code in your top post to be more idiomatic fastai code. Have a look at the examples in the applications->vision section of the docs for ideas. Let me know if you need any help with this.

gianferrarif · October 26, 2018, 6:50am

I think the transformations are applied at training time for augmentation. I would not re-apply them at evaluation time.

evan.xiong · October 26, 2018, 6:54am

Not necessarily: if you check valid_ds.tfms, it is different from train_ds.tfms. Some transformations are still required to handle your data at evaluation time, e.g. resize. Here you are doing the resize when loading your image.

gianferrarif · October 26, 2018, 6:56am

My fault! I will look at it!

devforfu · October 26, 2018, 10:33am

Sure, not a problem, I’ll update my transformations with a more standard approach. I’ll let you know if I fail to bring everything into compliance with the library =)

@gianferrarif @evan.xiong Guys, I am going to update the post message soon to incorporate your remarks for future reference.

gianferrarif · October 26, 2018, 11:15am

Also, I think I will focus on deserializing the model and the transformations, so as not to depend on a learner, and on creating a totally standalone package for evaluation (potentially on CPU only), to be wrapped in a Flask API. @devforfu @evan.xiong if you are cool with that, it can be our project for the MOOC, maintained for NLP and tabular use cases as well. @jeremy please share your thoughts if you have any.

sgugger · October 26, 2018, 1:28pm

Note that there is an Image.predict method that takes a learner object now:

img.predict(learn)

It applies the valid transforms and normalization to the image, then returns the logits (soon the probabilities but we haven’t finished with that yet).

Also note in your code @gianferrarif that you don’t need to pass the logits to softmax before taking argmax (the maximum before and after softmax are the same).
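
A quick way to convince yourself of this: softmax is strictly increasing in each logit, so it preserves the ordering of the values and therefore the position of the maximum. A small library-free sketch with made-up logits (the softmax and argmax helpers below are written out by hand for illustration only):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)

logits = [1.2, -0.3, 4.5, 0.7]  # made-up logits for four classes
probs = softmax(logits)
print(argmax(logits), argmax(probs))  # 2 2
```

Softmax is still needed when you want the probabilities themselves, just not for picking the predicted class.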

devforfu · October 26, 2018, 1:30pm

Just saw this method in the repo a few minutes ago.
Will update the post accordingly.

jeremy · October 26, 2018, 1:55pm

BTW using the open_image function is the best way to get an image with fastai.