How to get an empty ConvLearner for single image prediction?

I’ve merged 2 threads into one - so don’t be confused if the discussion above seems a little bifurcated! 🙂

2 Likes

I set up a notebook with all three approaches for single image prediction:
(GitHub link, go to header “Single image prediction with ResNet34”):
1.) Setup full data object with ConvLearner
2.) Setup empty data object with ConvLearner
3.) Setup model without learner

However, with approaches 2 & 3 I get the wrong result, i.e., they always predict class #4 (pos. 3).

The class order shouldn’t be mixed up, as I load the same weights for all three approaches - or did I miss something somewhere?

Has somebody encountered the same strange behavior?

Edit: Renamed and cleaned up notebook and updated link in this post.

I think it generally would be helpful to have a topic on the forums dedicated to creating APIs and “productionalizing” the models on different environments, in the various domains…

Here you go:

Update: posting the fastai-compliant way to predict a custom image’s class.

Ok, finally, I think this is a more or less “canonical” approach to generating a prediction using standard fastai classes and methods:

img = open_image(filename)              # load the file as a fastai Image
losses = img.predict(learn)             # run it through the learner's model
learn.data.classes[losses.argmax()]     # map the top score back to its class name
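
Despite the name, losses here is just the model’s raw output scores; if you want probabilities you can softmax them yourself (a minimal sketch, assuming the head returns unnormalized logits):

probs = torch.softmax(losses, dim=-1)   # turn raw scores into probabilities
print(probs.max().item())               # confidence of the predicted class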

Original Post

I’m training a model on some dataset like this:

data = ImageDataBunch.from_name_func(..., size=224)  # dataset creation goes here
data.normalize(imagenet_stats)
learner = ConvLearner(data, models.resnet34, metrics=[error_rate])
learner.fit_one_cycle(1)

Now I would like to run my model on some custom image. For example, let’s pretend that I have an image in my local file system. I read this image and convert it into a tensor of the appropriate shape and type:

img_path = 'path/to/the/image.png'
pil_image = PIL.Image.open(img_path).convert('RGB').resize((224, 224))
x = torch.tensor(np.asarray(pil_image), dtype=torch.float)
x = x.permute(2, 0, 1).to(default_device)  # HWC -> CHW (note: view() would scramble the pixels here)

Finally, I feed the image into the model:

preds = learner.model(x[None])  # x[None] adds a batch dimension

However, the last step gives me an error:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   1364         size = list(input.size())
   1365         if reduce(mul, size[2:], size[0]) == 1:
-> 1366             raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
   1367     return torch.batch_norm(
   1368         input, weight, bias, running_mean, running_var,

ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

Could anyone advise what is the correct way to prepare new data before feeding it into the model? I mean, I would like to do something similar to model.predict(X) from scikit-learn or Keras.

I guess I need to apply normalization as well, but I think the source of the error is probably something else.
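
(For what it’s worth, this particular ValueError comes from a BatchNorm layer being run in training mode on a batch of size 1; switching the model to eval mode before the forward pass avoids it. A minimal sketch, reusing x and learner from above:)

learner.model.eval()                # put BatchNorm/Dropout into inference mode
with torch.no_grad():               # no gradients needed for prediction
    preds = learner.model(x[None])  # add a batch dimension and run the model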

5 Likes

Is there a better, perhaps less verbose, way than this …

img = open_image(PATH/'valid/in/112.jpg')
train_tfms, val_tfms = get_transforms()
val_tfms = val_tfms + [normalize_funcs(*imagenet_stats)]  # list.append returns None, so concatenate instead
img = apply_tfms(val_tfms, img, size=224)

backbone = create_body(models.resnet34(pretrained=False), -2)  # no pretrained weights needed - we load our own below
head = create_head(num_features(backbone) * 2, 2)
m = nn.Sequential(backbone, head)
m.load_state_dict(torch.load(PATH/'models/stage-2-34.pth'))

m.eval()
log_probs = m(img.data.unsqueeze_(0))

preds = torch.argmax(log_probs, dim=1)
print(preds)
# => tensor([1])

This works, but I can’t help imagining that there is a better way (especially when it comes to the three lines above that build the model, which are essentially pulled out of ConvLearner).

btw, I really love the open_image function … makes grabbing the image and using it as a tensor so easy 🙂

Hoping this thread can be used by folks to submit what they believe is the best way to run a single example through their models (I highly suspect that mine isn’t it).

6 Likes

Not sure why you don’t want to use ConvLearner - what is wrong with saving it? If you want to run it on a single image, load the image as a (1, 3, 224, 224) tensor and then run the forward pass on the model: convlearner.model(x)!

Because a DataBunch object is required to create a ConvLearner and you cannot load a saved model until it is created.

Edit: Due to the merge of the posts see my last post above:
https://forums.fast.ai/t/how-to-get-an-empty-convlearner-for-single-image-prediction/28245/46

I tried your approach and in my case it seems to always predict the same class (class #3).

When I load the ConvLearner’s model I can predict correctly with Image.predict().

You can find my notebook here: https://github.com/MicPie/fastai_course_v3/blob/master/L1-stonefly_activiations.ipynb
The prediction part starts with the heading “Analysis of ResNet34 activations for a specific images (work in progress)”.

I would be curious whether somebody else has encountered this strange behavior too.

You can see here someone who created a ConvLearner with a fake data object.

BTW since you’re using fastai, you must be on py36+, so you can make it even simpler:

f"/{c}_1.jpg"

(@simonw I’m sure you’re aware of this already - this message is for other folks reading who are new to Python.)
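
For comparison, the same string on pre-3.6 Pythons needs str.format:

"/{}_1.jpg".format(c)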

@devforfu @simonw @MicPie @wgpubs @gianferrarif I’ve added a first attempt at a single image prediction method here:

You’ll need to use the master version of fastai to run this. (It also shows use of the fastai.data_block API that we’ll be showing on Tuesday.)

10 Likes

Thanks Jeremy … I’m going to spend more time tomorrow looking at this.

Having thought about this architecturally today, it still feels off that we have to create a Learner or a DataBunch for real-time inference with a trained model. When I think of Learner and DataBunch classes, I think of things that are all about training a model … not making predictions in a production system.

I’m wondering if it may be better to separate the building of the actual NN from its use by a Learner for training. Something maybe like this …

def create_cnn_arch(pretrained_model, n_classes=2, weights=None):
    has_weights = weights is not None

    backbone = create_body(pretrained_model(pretrained=not has_weights), -2)
    head = create_head(num_features(backbone) * 2, n_classes)
    m = nn.Sequential(backbone, head)

    if has_weights: m.load_state_dict(torch.load(weights))
    val_tfms = get_transforms()[1] + [normalize_funcs(*imagenet_stats)]
    return m, val_tfms


img = open_image(PATH/'valid/in/112.jpg')
model, tfms = create_cnn_arch(models.resnet34, weights=PATH/'models/stage-2-34.pth')

img.predict(model, tfms, size=224)

The idea would be that ConvLearner (or whatever it is named now) would use create_cnn_arch as well.

Just throwing this out there. I know it’s all pseudo-code, but hopefully it makes sense what I’m trying to get at.

-wg

2 Likes

Hm, looks interesting! Something like the array-transforming pipelines from scikit-learn, or perhaps LINQ. Looking forward to knowing more about this thing and how it relates to the older APIs. As far as I can see, it is a kind of convenience wrapper that splits the data bunch building process into small chunks instead of passing everything into a single constructor.

I don’t see why - I think they work great for inference. Try it and see!

Otherwise, you’ll need to replicate the same load/transform/etc pipeline yourself, and now you’ve got the same steps in two places to keep up to date.
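
To make that concrete, here’s a minimal sketch of the kind of hand-rolled preprocessing you’d otherwise have to maintain (assuming a 224px model normalized with imagenet_stats; the filename is hypothetical):

from PIL import Image
import numpy as np
import torch

mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)  # imagenet_stats channel means
std  = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)  # imagenet_stats channel stds

pil = Image.open('some_image.jpg').convert('RGB').resize((224, 224))
x = torch.from_numpy(np.asarray(pil)).float() / 255       # uint8 HWC -> float in [0, 1]
x = (x.permute(2, 0, 1) - mean) / std                     # HWC -> CHW, then normalize

Every change to the fastai pipeline (resize strategy, stats, cropping) would then have to be mirrored here by hand.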

I see what you are saying w/r/t the Learner … it does make sense to be able to use it so that you don’t have to replicate building the architecture or transforms in two places.

However, I still don’t like the idea of requiring a DataBunch for real-time inference on a single example, given that this class is defined as something that will “… bind together a train_dl , a valid_dl and optionally a test_dl , ensures they are on device and apply to them tfms as batch are drawn.” Its only purpose at inference time is to provide the number of classes … and the fact that we are essentially creating a “dummy” instance to make this work gives me the sense it is the wrong approach.

Also, this code:

data = (InputList.from_folder(path)
        .label_from_re(r'^(.*)_\d+.jpg$')
        .random_split_by_pct(0.2)
        .datasets(ImageClassificationDataset)
        .transform(tfms, size=224)
        .databunch(bs=bs)
        .normalize(imagenet_stats))

… seems to assume I’m going to have a filesystem and also a bunch of images from which to build a DataBunch. But what if I’m running on something like AWS Lambda or any other serverless infrastructure? And why should this be necessary when all I really need is a Learner that is aware of how to build the model and do the transforms?

I’d propose that Learner be updated so that a DataBunch is optional, and that functions like create_cnn include an optional n_classes argument that can be used to create the architecture for a DataBunch-less instance. Something like …

learn = create_cnn(models.resnet34, data=None, pretrained=False, tfms=tfms, size=224, n_classes=2)

Then we could simply load our learned weights for this model in the usual way …

learn.load('stage2-34.pth')

From here we can run a prediction on a single example like so:

img = open_image(PATH/'valid/in/112.jpg')
img.predict(learn)

What do you all think? As far as I can tell, this approach is more straightforward, doesn’t require us to monkey around with a DataBunch unnecessarily, and accomplishes the mission of not having to replicate code.

1 Like

@wgpubs I think you’re misunderstanding. The code you quoted is used only to train the model, not for inference.

Also, the docs you’re quoting from have not yet been updated for the new single image API.

Ok I think I got you now.

So …

data2 = ImageDataBunch.single_from_classes(path, data.classes, tfms=tfms, size=224).normalize(imagenet_stats)
learn = create_cnn(data2, models.resnet34)
learn.load('one-epoch')

… could be re-written as …

data2 = ImageDataBunch.single_from_classes(path, ['dog','cat'], tfms=tfms, size=224).normalize(imagenet_stats)
learn = create_cnn(data2, models.resnet34)
learn.load('one-epoch')
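
And then a single-image prediction follows the same pattern as the snippet at the top of the thread (a sketch, with a hypothetical image path):

img = open_image('path/to/some/image.jpg')   # hypothetical path
losses = img.predict(learn)                  # forward pass through the loaded model
print(learn.data.classes[losses.argmax()])   # e.g. 'dog' or 'cat'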

I think I was getting hung up on your use of the data variable.

Also, what is this line for … data2.valid_ds.set_item(img)?

Also, also, is this going to play nicely with multilabel datasets? Or is the idea to tackle that when it comes up later?

1 Like

That’s me experimenting - I’m thinking that’s what I’ll use internally soon…

I got you.

I’m still partial to something more akin to what I’ve proposed … using ImageDataBunch.single_from_classes to get a DataBunch seems strange to me, API-wise.

Nevertheless, I imagine either approach can be generalized to other applications (text, tabular, etc.) if we change things up a bit. This should work with either what you are suggesting or what I have.

For example, something like this for image classification …

learn = create_cnn(models.resnet34, data=None, pretrained=False, tfms=tfms, size=224, n_classes=2)
learn.load('stage2-34.pth')

img = open_image(PATH/'valid/in/112.jpg')
input = learn.prepare_example(img)
learn.predict(input)

… and something like this for tabular data based classification …

learn = get_tabular_learner(data=None, layers=[200,100], tfms=tfms, n_classes=2)
learn.load('stage-2.pth')

df = pd.read_csv(path/'adult.csv')
input = learn.prepare_example(df)
learn.predict(input)

prepare_example would be an abstract method on Learner that each Learner subclass could implement to apply whatever transforms, etc., are needed to a single example in order to run it through its model.
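
Sketching the shape of that hook (hypothetical names throughout - this isn’t existing fastai code):

class Learner:
    def prepare_example(self, raw):
        # each subclass turns one raw input into a model-ready batch of size 1
        raise NotImplementedError

class ConvLearner(Learner):
    def prepare_example(self, img):
        x = apply_tfms(self.tfms, img, size=self.size)  # same transforms as validation
        return x.data.unsqueeze(0)                      # add the batch dimension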