How to get an empty ConvLearner for single image prediction?

OK, so it just boils down to this:

The only thing I need to understand is how to restore a learner from serialization without having to create a DataBunch first.

I would like to create a bare-minimum runtime package for model serving (potentially on a CPU-only box), and I find it cumbersome to have to pass in the original datasets.
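
A minimal sketch of what I have in mind, using plain PyTorch rather than the fastai API: rebuild the exact architecture that was trained (for a ConvLearner that is the create_body/create_head nn.Sequential discussed later in this thread), then load the checkpoint with map_location so CUDA tensors are remapped to CPU. build_trained_architecture and the path below are placeholders:

import torch

model = build_trained_architecture()  # hypothetical helper; must match the trained architecture exactly
state = torch.load('models/stage-2-34.pth', map_location='cpu')  # remap CUDA tensors to CPU
model.load_state_dict(state)
model.eval()  # switch batch norm / dropout to inference behavior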

F

Just published a Medium blog post that uses our snippets! https://medium.com/@francesco.gianferraripini/buon-appetito-a-fast-ai-spin-on-italian-food-ee14631bbdb6

Will this method work on a batch of images? I would like to classify “bs” images at a time, since I have a lot of classifications to run with the trained model. BTW, I assume that I can drop the argmax and the index into classes to get the probabilities for all the classes. Thanks.
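
In case it helps, a sketch of the batched variant I have in mind (assuming the 224×224 preprocessing and the learn object used elsewhere in this thread):

import torch

# image_tensors: a list of bs preprocessed (3, 224, 224) float tensors
batch = torch.stack(image_tensors)           # -> (bs, 3, 224, 224)
learn.model.eval()
with torch.no_grad():
    logits = learn.model(batch)
probs = torch.softmax(logits, dim=1)         # drop the argmax to keep all class probabilities
top_classes = [learn.data.classes[i] for i in probs.argmax(dim=1)]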

Here we can find a standalone way to restore a model in a CPU-only environment: https://forums.fast.ai/t/how-to-get-an-empty-convlearner-for-single-image-prediction/28245

I’ve merged 2 threads into one - so don’t be confused if the discussion above seems a little bifurcated! :slight_smile:

I set up a notebook with all three approaches to single image prediction
(GitHub link, go to the header “Single image prediction with ResNet34”):
1.) Set up a full data object with ConvLearner
2.) Set up an empty data object with ConvLearner
3.) Set up the model without a learner

However, with approaches 2 & 3 I get the wrong result, i.e., they always predict class #4 (pos. 3).

The class order shouldn’t be mixed up, as I load the same weights for all three approaches. Or did I miss something somewhere?

Has somebody else encountered the same strange behavior?

Edit: Renamed and cleaned up notebook and updated link in this post.

I think it generally would be helpful to have a topic on the forums dedicated to creating APIs and “productionizing” models in different environments, across the various domains…

Here you go:

Update: posting the fastai-compliant way to predict the class of a custom image.

Ok, finally, I think this is a more or less “canonical” approach to generating a prediction using standard fastai classes and methods:

img = open_image(filename)                # load the file as a fastai Image
losses = img.predict(learn)               # per-class scores from the model
learn.data.classes[losses.argmax()]       # map the top score back to its class name
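
And to answer the probabilities question above: assuming the returned values are unnormalized scores, something like this should give the per-class probabilities (torch is assumed imported):

probs = torch.softmax(losses, dim=-1)     # scores -> probabilities
dict(zip(learn.data.classes, probs.tolist()))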

Original Post

I’m training a model on some dataset like this:

data = ImageDataBunch.from_name_func(..., size=224)  # dataset creation goes here
data.normalize(imagenet_stats)
learner = ConvLearner(data, models.resnet34, metrics=[error_rate])
learner.fit_one_cycle(1)

Now I would like to run my model on some custom image. For example, let’s pretend that I have an image in my local file system. I read this image and convert it into a tensor of the appropriate shape and type:

img_path = 'path/to/the/image.png'
pil_image = PIL.Image.open(img_path).convert('RGB').resize((224, 224))
x = torch.tensor(np.asarray(pil_image), dtype=torch.float)
w, h, c = x.size()
x = x.view(c, w, h).to(default_device)

Finally, I feed the image into the model:

preds = learner.model(x[None])

However, the last step gives me an error:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   1364         size = list(input.size())
   1365         if reduce(mul, size[2:], size[0]) == 1:
-> 1366             raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
   1367     return torch.batch_norm(
   1368         input, weight, bias, running_mean, running_var,

ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

Could anyone advise what the correct way is to prepare new data before feeding it into the model? I mean, I would like to do something similar to model.predict(X) from scikit-learn or Keras.

I guess I need to apply normalization as well, but I think the source of the error is probably something else.
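
For what it’s worth, a hedged sketch of what seems to fix this: the traceback points at batch norm running in training mode on a batch of size 1, so the model needs to be put in eval mode; the tensor layout also needs permute rather than view, plus imagenet_stats normalization. Variable names are taken from the snippet above, and the constants are the usual ImageNet values:

import numpy as np
import PIL
import torch

pil_image = PIL.Image.open(img_path).convert('RGB').resize((224, 224))
x = torch.tensor(np.asarray(pil_image), dtype=torch.float) / 255.0  # HWC in [0, 1]
x = x.permute(2, 0, 1)                       # -> CHW; view() would scramble the pixels
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
x = (x - mean) / std                         # imagenet_stats normalization

learner.model.eval()  # batch norm fails on a 1-image batch in training mode
with torch.no_grad():
    preds = learner.model(x[None].to(next(learner.model.parameters()).device))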

Is there a better, less verbose perhaps, way than this …

img = open_image(PATH/'valid/in/112.jpg')
train_tfms, val_tfms = get_transforms()
# note: list.append returns None, so concatenate the normalize step instead
img = apply_tfms(val_tfms + [normalize_funcs(*imagenet_stats)], img, size=224)

backbone = create_body(arch(models.resnet34), -2)
head = create_head(num_features(backbone) * 2, 2)
m = nn.Sequential(backbone, head)
m.load_state_dict(torch.load(PATH/'models/stage-2-34.pth'))

m.eval()
log_probs = m(img.data.unsqueeze_(0))

preds = torch.argmax(log_probs, dim=1)
print(preds)
# => tensor([1])

This works, but I can’t help imagining that there is a better way (especially when it comes to the 3 lines above that build the model, which are essentially pulled out of ConvLearner).

btw, I really love the open_image function … makes grabbing the image and using it as a tensor so easy :slight_smile:

Hoping this thread can be used by folks to submit what they believe is the best way to run a single example through their models (I highly suspect that mine is not the one).

Not sure why you don’t want to use ConvLearner. What is wrong with saving the ConvLearner? If you want to run it on a single image, load the image as (1, 3, 224, 224) and run the forward pass on the model: convlearn.model(V(x))!

Because a DataBunch object is required to create a ConvLearner, and you cannot load a saved model until the learner has been created.

Edit: Due to the merge of the posts, see my last post above:
https://forums.fast.ai/t/how-to-get-an-empty-convlearner-for-single-image-prediction/28245/46

I tried your approach, and in my case it seems to always predict the same class (class #3).

When I load the ConvLearner model, I can predict correctly with Image.predict().

You can find my notebook here: https://github.com/MicPie/fastai_course_v3/blob/master/L1-stonefly_activiations.ipynb
The prediction part starts with the heading “Analysis of ResNet34 activations for a specific images (work in progress)”.

I would be curious whether somebody else has encountered this strange error too.

You can see here someone that created a ConvLearner with some fake data object.
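
For reference, a sketch of that fake-data-object approach: if I recall correctly, fastai v1 added ImageDataBunch.single_from_classes for exactly this (check the exact signature in your version; it was later deprecated in favor of load_learner). The path and class list below are placeholders:

classes = ['class_0', 'class_1', 'class_2', 'class_3']  # must match the training order
data = ImageDataBunch.single_from_classes(
    path, classes, ds_tfms=get_transforms(), size=224).normalize(imagenet_stats)
learn = ConvLearner(data, models.resnet34)
learn.load('stage-2-34')  # reads path/'models/stage-2-34.pth'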

BTW since you’re using fastai, you must be on py36+, so you can make it even simpler:

f"/{c}_1.jpg"

(@simonw I’m sure you’re aware of this already - this message is for other folks reading who are new to Python.)

@devforfu @simonw @MicPie @wgpubs @gianferrarif I’ve added a first attempt at a single image prediction method here:

You’ll need to use the master version of fastai to run this. (It also shows use of the fastai.data_block API that we’ll be showing on Tuesday.)

Thanks Jeremy … I’m going to spend more time tomorrow looking at this.

Having thought about this architecturally today, it still feels off that we have to create a Learner or a DataBunch for real-time inference with a trained model. When I think of Learner and DataBunch classes, I think of things that are all about training a model … not making predictions in a production system.

I’m wondering if it may be better to separate the building of the actual NN from its use by a Learner for training. Something maybe like this …

def create_cnn_arch(pretrained_model, weights=None, n_classes=2):
    has_weights = weights is not None

    # download pretrained weights only when we don't have our own
    backbone = create_body(pretrained_model(pretrained=not has_weights), -2)
    head = create_head(num_features(backbone) * 2, n_classes)
    m = nn.Sequential(backbone, head)

    if has_weights: m.load_state_dict(torch.load(weights))
    return m, get_transforms()[1] + [normalize_funcs(*imagenet_stats)]


img = open_image(PATH/'valid/in/112.jpg')
model, tfms = create_cnn_arch(models.resnet34, PATH/'models/stage-2-34.pth')

img.predict(model, tfms, size=224)

The idea would be that ConvLearner (or whatever it is named now) would use create_cnn_arch as well.

Just throwing this out here. I know it’s all pseudo-code, but hopefully what I’m trying to get at makes sense.

-wg

Hm, looks interesting! Something like the array-transforming pipelines from scikit-learn, or perhaps LINQ. Looking forward to learning more about this thing and how it relates to the older APIs. As I can see, it is a kind of convenience wrapper that splits the data bunch building process into small chunks instead of passing everything into a single constructor.

I don’t see why - I think they work great for inference. Try it and see!

Otherwise, you’ll need to replicate the same load/transform/etc pipeline yourself, and now you’ve got the same steps in two places to keep up to date.

I see what you are saying w/r/t the Learner … it does make sense to be able to use it so that you don’t have to replicate building the architecture or transforms in two places.

However, I still don’t like the idea of requiring a DataBunch for real-time inference on a single example, given that this class is defined as something that will “… bind together a train_dl , a valid_dl and optionally a test_dl , ensures they are on device and apply to them tfms as batch are drawn.” Its only purpose at inference time is to provide the number of classes … and the fact that we are essentially creating a “dummy” instance to make this work gives me the sense it is the wrong approach.

Also, this code:


data = (InputList.from_folder(path)
        .label_from_re(r'^(.*)_\d+.jpg$')
        .random_split_by_pct(0.2)
        .datasets(ImageClassificationDataset)
        .transform(tfms, size=224)
        .databunch(bs=bs)
        .normalize(imagenet_stats))

… seems to assume I’m going to have a filesystem and also a bunch of images from which to build a DataBunch. But what if I’m running on something like AWS Lambda or any other serverless infrastructure? And why should this be necessary when all I really need is a Learner that knows how to build the model and apply the transforms?

I’d propose that Learner be updated so that a DataBunch is optional, and that functions like create_cnn include an optional n_classes argument that can be used to create the architecture for a DataBunch-less instance. Something like …

learn = create_cnn(models.resnet34, data=None, pretrained=False, tfms=tfms, size=224, n_classes=2)

Then we could simply load our learned weights for this model in the usual way …

learn.load('stage2-34.pth')

From here we can do prediction on a single example as such:

img = open_image(PATH/'valid/in/112.jpg')
img.predict(learn)

What do you all think? As far as I can tell, this approach is more straightforward, doesn’t require us monkeying around with a DataBunch unnecessarily, and accomplishes the mission of not having to replicate code.
