How do we use our model against a specific image?

So we’ve trained a model and we like what we have.

How do we take an image from our file system and use the framework to get a prediction for it?

I’m assuming we have to transform it into something appropriate and then pass it into learn.model() … but I’m not sure what transforms are needed, or if this is even the right approach.

Thanks - wg

4 Likes

Tried this …

import torchvision
import PIL

# ImageNet channel statistics - the model was pretrained on ImageNet
normalize = torchvision.transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)
preprocess = torchvision.transforms.Compose([
   torchvision.transforms.Scale(256),       # renamed Resize in later torchvision
   torchvision.transforms.CenterCrop(224),
   torchvision.transforms.ToTensor(),
   normalize
])

from torch.autograd import Variable

img_pil = PIL.Image.open(f'{PATH}/valid/in-n-out/2.jpg')
img_tensor = preprocess(img_pil)
img_tensor.unsqueeze_(0)  # add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224)

img_variable = Variable(img_tensor)
print(img_variable.size())

fc_out = learn.model(img_variable)

Throws an exception on the last line:

RuntimeError: running_mean should contain 3 elements not 1024

1 Like

Great question. Do a git pull, and then try (this is from the dog breeds dataset):
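Something along these lines (reconstructed from the snippets quoted later in this thread; the image path is a hypothetical stand-in):

trn_tfms, val_tfms = tfms_from_model(arch, sz)  # (train tfms, validation tfms) for this architecture/size
im = trn_tfms(PIL.Image.open(f'{PATH}/valid/some_class/some_image.jpg'))  # hypothetical path
preds = learn.predict_array(im[None])  # im[None] adds a batch dimension
np.argmax(preds)  # index of the predicted class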

Hopefully that also gives you some insight into what’s going on behind the scenes in our data classes…

12 Likes

OK I’ve simplified it quite a bit now 🙂

25 Likes

Hi @jeremy, can we use the learn.load command to load a saved model and then use the code above to get a prediction, so we don’t have to re-run all the training code?

And secondly, I notice that the save command saves the model as a .h5 file. Is there a way to save it as a .pb file instead?

Thanks

1 Like

Trying to write this myself based on what I’m seeing but still get an exception:

xforms, _ = tfms_from_model(arch, sz)
im = xforms(PIL.Image.open(f'{PATH}/valid/in-n-out/2.jpg'))
preds = to_np(learn.model(V(T(im[None]).cuda())))

Exception: running_mean should contain 3 elements not 1024

The code looks identical to yours so I’m not understanding what is going on.

Thanks.

Oi vey …

Read the code a little more closely and figured out that learn.model is different from learn.models.model.

Revised code works:

from PIL import Image

xforms, _ = tfms_from_model(arch, sz)
im = xforms(Image.open(f'{PATH}/valid/in-n-out/2.jpg'))
preds = to_np(learn.models.model(V(T(im[None]).cuda())))  # learn.models.model, not learn.model
np.argmax(preds, axis=1)

A little more long-winded, but this would eliminate any dependency on the fast.ai framework in production (which might prove helpful if trying to put this in an Android or iOS app at some point):

import numpy as np
import torchvision
from PIL import Image
from torch.autograd import Variable

torch_model = learn.models.model
torch_model.eval()  # put batchnorm/dropout into inference mode before predicting
img = Image.open(f'{PATH}/valid/in-n-out/2.jpg')

# ImageNet channel statistics, matching the pretrained backbone
normalize = torchvision.transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)
preprocess = torchvision.transforms.Compose([
   torchvision.transforms.Scale(256),       # renamed Resize in later torchvision
   torchvision.transforms.CenterCrop(224),
   torchvision.transforms.ToTensor(),
   normalize
])

img_tensor = preprocess(img).unsqueeze_(0)  # (1, 3, 224, 224)
img_variable = Variable(img_tensor.cuda())

log_probs = torch_model(img_variable)
preds = np.argmax(log_probs.cpu().data.numpy(), axis=1)

print(preds)
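
To get the label rather than the index, you can look it up in the classes list on the fast.ai data object (assuming data is the ImageClassifierData used during training):

print(data.classes[preds[0]])  # e.g. 'in-n-out'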
6 Likes

Actually my code isn’t quite right - you should use the 2nd return val from tfms_from_model. The first one includes data augmentation, e.g. for the training set. For predictions you don’t want that, so use the 2nd one.

2 Likes

What’s a .pb file?

Absolutely!

1 Like

Yah, I was just seeing that. Revised my code using fast.ai transforms and helpers as such:

trn_tfms, val_tfms = tfms_from_model(arch, sz)
im = val_tfms(Image.open(f'{PATH}/valid/in-n-out/2.jpg'))  # validation tfms: no augmentation

log_probs = to_np(learn.models.model(V(T(im[None]).cuda())))
preds = np.argmax(log_probs, axis=1)

print(preds)
6 Likes

I have a question regarding image size. If we are using an RGB image of size 224, which is also a requirement of ResNet32, does that mean the input feature vector for that image will be of size 224 * 224 * 3 = 150528? And that many inputs go into the CNN?

224x224 is not a requirement of any model we use. In fact, last lesson we trained a model with multiple image sizes!
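
The pattern from last lesson looks roughly like this - a sketch, where get_data is a hypothetical helper (like the one in the lesson notebooks) that builds the data object at a given size:

def get_data(sz, bs):
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    return ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs)

learn = ConvLearner.pretrained(arch, get_data(224, bs))
learn.fit(1e-2, 3)                 # train at 224x224
learn.set_data(get_data(299, bs))  # same model, larger inputs
learn.fit(1e-2, 3)                 # continue training at 299x299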

1 Like

A silly question which I cannot hold back: if a dataset can have images of different sizes, then how is the input size of a NN decided? On the basis of what? Is that why we are supposed to bring all images to the same size while training? If so, I may have answered my former question.

The re-sizing happens as part of the transformations; that is why you have to pass the size parameter (e.g. sz) into them.
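
For example, reusing the names from earlier in this thread - whatever size goes in, the validation transforms produce the size you asked for:

trn_tfms, val_tfms = tfms_from_model(arch, sz)  # sz sets the output size
im = val_tfms(Image.open(f'{PATH}/valid/in-n-out/2.jpg'))
print(im.shape)  # (3, sz, sz), regardless of the original image size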

I’m getting this error and would appreciate any suggestions. The error states it’s missing one argument but I don’t know how to fix it:


TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      1 trn_tfms, val_tfms = tfms_from_model(arch, sz)
----> 2 im = trn_tfms(PIL.Image.open(f'data/test/320.jpg'))
      3 preds = learn.predict_array(im[None])
      4 np.argmax(preds)

TypeError: __call__() missing 1 required positional argument: 'y'

And this is the code I am using:
trn_tfms, val_tfms = tfms_from_model(arch, sz)
im = trn_tfms(PIL.Image.open(f'data/test/320.jpg'))
preds = learn.predict_array(im[None])
np.argmax(preds)

Make sure you get the latest code cuz I was getting the same error before I did.

Also, you may want to look at the most recent examples in this thread that use the validation tfms.

@wgpubs

I did try both - same error. I updated to the latest lesson1.ipynb but still get the same error.


TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      1 trn_tfms, val_tfms = tfms_from_model(arch, sz)
----> 2 im = val_tfms(PIL.Image.open(f'data/test/320.jpg'))
      3 preds = learn.predict_array(im[None])
      4 np.argmax(preds)

TypeError: __call__() missing 1 required positional argument: 'y'

I got it from this article:

and I quote:

Protocol buffers

Protocol Buffers, often abbreviated Protobufs, is the format used by TF to store and transfer data efficiently.
To recapitulate, you can use Protobufs as:
An uncompressed, human-friendly text format with the extension .pbtxt
A compressed, machine-friendly binary format with the extension .pb, or no extension at all

I am semi-familiar with Android software and am able to transfer already-trained models to my phone, but in order to use my own trained model I need to be able to save it in that format. I have been able to save it using TensorFlow, but it doesn’t work on my phone and it’s difficult to debug why.

With the in-class lessons and forums I have been able to understand your code far better, so I would be able to debug more effectively - hence the need to save in that format.