Simplest way to get batch prediction?

Is there a way to predict images in batches? learn.predict() takes too long if the number of images to predict is large.


I am exploring the library myself, so I'm not completely sure, but I remember reading this; maybe it helps @ssaxena7 (sorry for the @).

It could be made to depend on n_batch as a parameter the user passes, but the code is written for a single batch, since it calls get_preds() with n_batch explicitly set to 1.

But I believe this won’t run over the whole test set, only over a small batch of it whose size depends on bs (as far as I have understood the code block; again, not very sure).


I found the best way is to create a DataBunch with the test folder and then call learn.get_preds(is_test=True) to get predictions.


Four months later, does anyone know of a more complete approach to this issue and/or is it fully documented anywhere?

I am training a text classifier model on one machine, then using load_learner on a different machine, and ideally I would like to make batch predictions, as I have 100s of documents at a time.

However, I am not sure how to attach a TextClasDataBunch to my learner so that I can call get_preds, or whatever I need to use, to make batch predictions.


Not sure if it is any help, but I needed to classify some images, and I wrote a simple loop.
I loaded the data using the ImageDataBunch (in your case you gotta use the TextClasDataBunch), and, as I needed only the validation set from the original training, I just used the valid_ds attribute. Then I ran:
img_arr = data.valid_ds.x #images, whereas y is the label array
preds = [model.predict(i) for i in img_arr]

I haven’t done anything with text, that’s why I didn’t get into the specifics.
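
For what it's worth, here is a rough sketch of why the loop above gets slow: it makes one model call per item, whereas a batched approach makes one call per chunk. This is plain Python and not fastai's own code; predict_in_batches and fake_predict_batch are names made up for the illustration, and the fake function only stands in for a real batched forward pass.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def predict_in_batches(
    items: Sequence[T],
    predict_batch: Callable[[Sequence[T]], List[R]],
    bs: int = 64,
) -> List[R]:
    """Call predict_batch on chunks of at most bs items, keeping input order."""
    if bs < 1:
        raise ValueError("bs must be at least 1")
    results: List[R] = []
    for start in range(0, len(items), bs):
        chunk = items[start:start + bs]
        results.extend(predict_batch(chunk))
    return results


# Hypothetical stand-in for a model: doubles each input and records
# the size of every chunk it was called with.
calls: List[int] = []


def fake_predict_batch(chunk: Sequence[int]) -> List[int]:
    calls.append(len(chunk))
    return [x * 2 for x in chunk]


outputs = predict_in_batches(list(range(10)), fake_predict_batch, bs=4)
print(outputs)  # one result per input, in the original order
print(calls)    # three calls instead of ten
```

With a real model the saving comes from the per-call overhead and from the GPU processing a whole chunk at once, which is roughly what get_preds does for you.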


Why are you using .predict(), which is meant for single images, as opposed to .get_preds(), which is meant for batches of images?


From what I understood, get_preds only works on the datasets used for training and validation/testing. But I guess I could set the data property on the exported model and call .get_preds().

So you are using an exported model? Are you using load_learner? If so, it has an option to add a test set (the test argument). Adding the dataset to the data attribute should also work.


Yes, I am using an exported model (with load_learner) for some tests, but I also have the non-exported one.

Thanks for clarifying this! :grin:

I went for the “dirty” solution while there was an elegant one.

Thanks again!

There is a nice way to do this now, documented here:
learn = load_learner(mnist, test=ImageList.from_folder(mnist/'test'))
preds,y = learn.get_preds(ds_type=DatasetType.Test)


How do we map the filename to the y here?
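
One possible approach, as a sketch rather than a confirmed answer: the test set is not shuffled, so the rows of preds should come back in the same order as the items of the test dataset, and filenames can be paired with predictions by position. As far as I know, the y returned for a test set is only a placeholder, since test items have no labels. The helper name pair_predictions is made up for this example, and using learn.data.test_ds.items for the file paths and learn.data.classes for the class names is my assumption about where fastai v1 keeps them. The toy data below stands in for real outputs so the snippet runs without fastai.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def pair_predictions(
    paths: Sequence,
    probs: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> List[Dict[str, object]]:
    """Pair each file with its most likely class, assuming matching order."""
    if len(paths) != len(probs):
        raise ValueError("paths and probs must have the same length")
    rows: List[Dict[str, object]] = []
    for path, row in zip(paths, probs):
        row = [float(p) for p in row]
        best = max(range(len(row)), key=row.__getitem__)
        rows.append({"file": Path(path).name, "label": classes[best], "prob": row[best]})
    return rows


# Toy stand-ins for the real values (all hypothetical):
#   paths   ~ learn.data.test_ds.items
#   probs   ~ preds from learn.get_preds(ds_type=DatasetType.Test)
#   classes ~ learn.data.classes
paths = [Path("test/img_001.png"), Path("test/img_002.png")]
probs = [[0.9, 0.1], [0.2, 0.8]]
classes = ["3", "7"]

rows = pair_predictions(paths, probs, classes)
for r in rows:
    print(r["file"], r["label"], r["prob"])
```

For a text classifier the batches are sorted by length internally, so I believe you need to pass ordered=True to get_preds for the positions to line up; worth checking against your version.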