Get_preds() is as slow as predict()

NavneetSajwan · June 25, 2020, 5:31am

I have to run inference on 6 images at a time. So, instead of using predict(), I created a batch of 6 images and tried get_preds() and pred_batch(). There is not much difference in time. How is this possible?

NavneetSajwan · June 25, 2020, 5:43am

With a GPU, these are the results:

Danielvs · June 25, 2020, 9:01am

@muellerzr has done some great work on inference that covers the inner workings of the different inference approaches. This thread might be a good starting point for learning more about how their speeds compare.

NavneetSajwan · June 26, 2020, 8:18am

I saw the thread. My project is based on fastai v1, and he has shown results in fastai v2.

muellerzr · June 26, 2020, 9:03am

The concepts can still apply to v1 (some of them, at least), since it's the same underlying PyTorch.

NavneetSajwan · June 26, 2020, 9:11am

As a last resort, I made predictions directly using learn.model(image_batch) and, surprisingly, even that took around 18 seconds on CPU. @muellerzr
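For reference, a direct forward pass like the one described above can be timed roughly as follows. This is only a minimal sketch in plain PyTorch: the small convolutional `model` is a hypothetical stand-in for `learn.model`, and the batch shape (6 RGB images of 64x64) is an assumption. It calls `eval()` and wraps the call in `torch.no_grad()` so that autograd bookkeeping is not included in the measurement.

```python
import time

import torch
import torch.nn as nn

# Hypothetical stand-in for learn.model (the thread's model is a U-Net
# with a resnet34 encoder, which is far larger than this).
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 2, kernel_size=3, padding=1),
)
model.eval()  # inference mode for layers such as dropout and batchnorm

# Assumed input: a batch of 6 RGB images, 64x64 pixels each.
image_batch = torch.randn(6, 3, 64, 64)

start = time.perf_counter()
with torch.no_grad():  # skip gradient tracking during inference
    preds = model(image_batch)
elapsed = time.perf_counter() - start

print(f"forward pass on {image_batch.shape[0]} images: {elapsed:.4f}s")
print(preds.shape)
```

If a call like this is already slow, the time is being spent inside the model itself rather than in any fastai wrapper around it.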

muellerzr · June 26, 2020, 9:17am

Then that’s basically your bottleneck. What’s your model? You could also try converting it to TorchScript:

https://forums.fast.ai/t/speeding-up-fastai2-inference-and-a-few-things-learned/66179/25
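A TorchScript conversion by tracing might look roughly like the sketch below. The tiny `model` here is a hypothetical stand-in, not the model from this thread, and the input shape is assumed; whether a fastai U-Net traces cleanly is a separate question. The sketch traces the model with an example input, saves and reloads it through an in-memory buffer, and checks that the reloaded module agrees with the eager one.

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in model; replace with the real nn.Module to try it.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 2, kernel_size=3, padding=1),
)
model.eval()

# Tracing records the operations executed for this example input.
example = torch.randn(1, 3, 64, 64)  # assumed input shape
with torch.no_grad():
    traced = torch.jit.trace(model, example)

# Save to an in-memory buffer and load it back (a file path also works).
buffer = io.BytesIO()
torch.jit.save(traced, buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)

with torch.no_grad():
    eager_out = model(example)
    traced_out = loaded(example)

# Compare eager and TorchScript outputs on the same input.
outputs_match = torch.allclose(eager_out, traced_out, atol=1e-5)
print("outputs match:", outputs_match)
```

Tracing only captures the code path taken for the example input, so models with data-dependent control flow may need `torch.jit.script` instead or may not convert at all.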

NavneetSajwan · June 26, 2020, 9:18am

It’s a fastai image segmentation model. Basically a U-Net with a resnet34 encoder.

muellerzr · June 26, 2020, 9:21am

That’s unsurprising on the CPU, then. It’s also a very large model that won’t convert to TorchScript well, IIRC.

NavneetSajwan · June 26, 2020, 9:26am

Actually, I am surprised that there is no significant difference in time between passing the images as a batch and passing them one at a time in a loop.
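One way to make that comparison reproducible is a small timing harness like the sketch below. It uses only the standard library; `best_of`, `compare_batch_vs_loop`, and `fake_predict` are hypothetical names introduced for illustration, and `fake_predict` is a dummy compute-bound workload standing in for a real model call. When the cost per image is dominated by computation that scales with the number of images, the loop-to-batch ratio is expected to stay close to 1.

```python
import time


def best_of(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn()."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times)


def compare_batch_vs_loop(predict_fn, items, repeats=3):
    """Time predict_fn on all items at once versus one item per call.

    predict_fn is assumed to accept a list of items.
    """
    batch_secs = best_of(lambda: predict_fn(items), repeats)
    loop_secs = best_of(lambda: [predict_fn([x]) for x in items], repeats)
    return {"batch": batch_secs, "loop": loop_secs, "ratio": loop_secs / batch_secs}


def fake_predict(items):
    """Dummy workload: the same fixed amount of computation for every item."""
    return [sum(i * i for i in range(20_000)) + x for x in items]


result = compare_batch_vs_loop(fake_predict, list(range(6)))
print(result)
```

With a real model, `predict_fn` could wrap the batched call (for example, stacking the images and calling the model under `torch.no_grad()`), which is an assumption about the setup rather than something shown in this thread.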

muellerzr · June 26, 2020, 9:29am

I may look into this shortly this morning and see what I find too (I haven’t looked into anything CPU-specific yet).

NavneetSajwan · June 26, 2020, 9:30am

Thank you. Waiting for your response.

muellerzr · June 26, 2020, 9:41am

There’s not really much I can say about that. Segmentation models (at least fastai’s) need a GPU to run efficiently. The timings I got with a very basic loop (in either v1 or v2) were sometimes better and sometimes worse than what you have right now, so it is a model bottleneck, as I said earlier.