Making multiple predictions for images not saved on disk

I have the following situation. I trained a unet_learner for a superresolution task, and then exported it with

learn.export()

I want to perform predictions on new images. At the moment, this is the way I proceed:

  • I load the learner with learn = load_learner(model_dir, '1b.pkl')
  • My data is images of shape 128*128 with 3 channels. I loop through my list of numpy arrays of shape 3 * 128 * 128 and call learner.predict(a)[0].data.numpy().astype(np.float32) on each one (sketched below).
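
For reference, the loop I run at the moment looks roughly like this (my_arrays is just a placeholder name for my list of arrays; this is a sketch, not my exact script):

import numpy as np
from fastai.basic_train import load_learner

learn = load_learner(model_dir, '1b.pkl')
preds = []
for a in my_arrays:  # each a is a numpy array of shape 3 * 128 * 128
    preds.append(learn.predict(a)[0].data.numpy().astype(np.float32))
out = np.stack(preds)  # shape n_img * 3 * 128 * 128, but computed one image at a time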

My problem is that this is very slow, since predictions are effectively made with a batch size of 1. I would like to make my predictions on multiple images at once. Let's say I have n_img images to predict; I would like to be able to call a function predict2 on an array of shape n_img * 3 * 128 * 128 that would return a tensor or a numpy array of the same shape, with the predictions computed on the images in parallel.

My reading of this topic: Making predictions in v1 makes me want to put my images in a test set and use the get_preds method to achieve that. My problem here is that my images are not stored on disk: I directly get an array of shape n_img * 3 * 128 * 128. Theoretically, I could save this array as images on disk and use ImageImageList.from_folder to load them, but this seems pretty convoluted (and unnecessary, and therefore slower).

So I guess my questions are the following:

  • How can I make an ImageImageList from an array of shape n_img * 3 * 128 * 128 without saving this array to disk?
  • Or, if it is possible another way, how can I get predictions on an array of shape n_img * 3 * 128 * 128 that are computed in parallel rather than one by one?

At some point, you need to learn to use PyTorch directly :wink:
If you have an array of images that can become a batch, then put it in a tensor, normalize it the same way you normalized your training data, and call:

import torch.nn.functional as F

act = learn.model(x)  # learn.model is a plain PyTorch module, so call it directly on the batch
preds = F.softmax(act, dim=1)
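
For the super-resolution batch in the question, a minimal sketch along those lines could look like this (learn is the loaded learner from above; the ImageNet stats and the batch size of 16 are assumptions, so substitute whatever normalization and batch size you actually used in training). There is no softmax step here, since for a super-resolution unet the raw model output is already the predicted image:

import torch

model = learn.model.eval()                    # switch off dropout / batchnorm updates
device = next(model.parameters()).device      # run on whatever device the model is on

x = torch.from_numpy(arr).float()             # arr: n_img * 3 * 128 * 128, assumed scaled to [0, 1]
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)   # ImageNet stats (assumption)
std  = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
x = (x - mean) / std                          # same normalization as the training data

outs = []
with torch.no_grad():                         # no gradients needed for inference
    for batch in x.split(16):                 # chunk so the whole array need not fit on the GPU
        outs.append(model(batch.to(device)).cpu())
preds = torch.cat(outs)                       # n_img * 3 * 128 * 128 tensor of predictions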

Well, I guess it seemed natural to me to be able to do a prediction on an array of size n_img*3*128*128 with fastai with not much more trouble than a prediction on an array of size 3*128*128. Moreover, the fact that fastai makes it so easy to run on either GPU or CPU, depending on what's available, always makes me want to try harder to get my stuff to work "only" through fastai.

Thanks for your solution, this should speed up my model a lot :slight_smile:


Hi @sgugger, is there a way of doing batch prediction directly from an exported model? (Important: without specific knowledge of the transforms/normalization used, like we have in predict.)

Images are being fed from memory only, using model.predict(Image(img_tensor[0])), but this goes image by image; there should be a better way, right?

thanks!

You can load your data as a test set, as shown in the inference tutorial (end of the section on classification).

Thanks @sgugger, the section quoted below?

But how do I produce an ImageList from in-memory images? Can I load it from an array of tensors or something similar?

(Does it need to be a custom ImageList? See e.g. Load the whole dataset in RAM or Custom ImageList for "virtual" image patches/crops.)

https://docs.fast.ai/tutorial.inference.html#A-classification-problem

" You can also do inference on a larger set of data by adding a test set . This is done by passing an ItemList to load_learner .

learn = load_learner(mnist, test=ImageList.from_folder(mnist/‘test’))

preds,y = learn.get_preds(ds_type=DatasetType.Test)
"