Fastai v2 models in production

What’s the recommended way to deploy fastai models in production while avoiding all the dataloader and dataset overhead?

So far this has been pretty easy:

conts = torch.from_numpy(learn.dls.valid_ds[cont_names].values).float().cuda()
preds1 = learn.model.eval()(None, conts).detach().cpu().numpy()  # None = no categorical inputs

But I’m not sure how to get the trained procs in order to apply the same preprocessing to new data?

I can’t find a good way to go from a numpy array to making predictions for a dataset, then finding those same indices in the array and merging the predictions with the original data (not just the training inputs). If we go the dataloader route and do preds, targs = learn.get_preds(ds), there are no indices there.

Thanks!


You should use the test DataLoader with your items, i.e.:

dl = learn.dls.test_dl(items)

And then get predictions.
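For the tabular case in your first post, that would look roughly like this (a sketch; df_new is just a placeholder for whatever new DataFrame you want predictions on):

dl = learn.dls.test_dl(df_new)      # applies the same procs (Categorify, Normalize, ...) used in training
preds, _ = learn.get_preds(dl=dl)   # predictions come back in the order of df_new's rows

Since the order is preserved, you can merge the predictions straight back onto df_new by position.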


Hi @feribg,

In case it helps, I just deployed a fastai2-based web API that derotates pictures (detects whether a picture is rotated and, if so, sets it back straight). I don’t think it is super-useful, but it may be interesting as a deployment template because:

  • It uses FastAPI behind the scenes, which provides automated doc generation and many other features that I quite frankly don’t really understand :laughing: (a rough sketch of what such an endpoint looks like is shown after this list)
  • It provides both img2class and img2img APIs
  • There is a notebook in the repo which shows how to call the API in various contexts (I strangely struggled quite a lot with this part).
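To give an idea of the pattern (a minimal sketch, not the actual code from the repo; the learner path, route name, and response fields are placeholders):

from fastapi import FastAPI, File, UploadFile
from fastai.vision.all import load_learner

app = FastAPI()
learn = load_learner("export.pkl")   # placeholder path to the exported Learner

@app.post("/predict")
def predict(file: UploadFile = File(...)):
    img_bytes = file.file.read()               # raw bytes from the upload
    pred, _, probs = learn.predict(img_bytes)  # predict applies the item transforms itself
    return {"class": str(pred), "confidence": float(probs.max())}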

There is also a web UI, but it currently only does classification (it does not derotate the pictures). I started looking into how to do an img2img web app, but I struggled with it since I have no experience with HTML and JS. If anyone has a good way to do it, please let me know!

Hope you will find it helpful :).
Seb


Thanks @sebderhy, but I don’t actually see how you apply the dataset transformation steps. I see the main piece here, but it just gets a raw byte array from the file upload, right?

learn.predict applies any augmentation done to the validation set.

Got it, thanks @muellerzr. So how does it all work when the input to predict doesn’t seem to have a batch dimension:

    img_bytes = (file.file.read())
    pred = learn.predict(img_bytes)

Does it also add a unit axis to the data passed in?

Go explore predict’s source code :wink:

It makes a batch of 1
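Roughly, calling predict is equivalent to doing this yourself (a simplified sketch; item stands for whatever you’d pass to predict, and the real source also handles decoding the inputs and a few edge cases):

# what learn.predict(item) does, roughly: wrap the item in a test
# DataLoader (so it becomes a batch of 1) and run get_preds on it
dl = learn.dls.test_dl([item], num_workers=0)
preds, _, dec_preds = learn.get_preds(dl=dl, with_decoded=True)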

Thanks, yep, I think I’ll need to get the source and dive deeper; there are a lot of indirections.

I have a doubt: why are unets so heavy?
I have to segment some images that are 3000x1500, and I can only feed one at a time (on my 8GB card).
Is that heavy? I suppose that down the U we have something like 512x3000x1500/2**4 activations, so it is quite heavy. Any idea how to “lighten this up” for production?
Also, any tip for removing the Resize transform from the pipeline more elegantly than doing:
learn.dls.after_item.fs.pop(1)
where 1 is the position of Resize?
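One alternative (an untested sketch; assumes fastai.vision.all is imported so L and Resize are in scope) would be to filter the pipeline by type instead of hard-coding the index:

# keep every item transform except Resize, wherever it sits in the pipeline
learn.dls.after_item.fs = L(t for t in learn.dls.after_item.fs if not isinstance(t, Resize))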

I don’t think so

Perhaps break the image up into 4 or 5 quadrants (or more), run them as a batch, then bring it back together?

Don’t know. One image eats up 7.9GB for an xresnet34-based unet.

Yeah, it does have a very large memory usage. I’d try the break-up method with tiles as small as you can reasonably get. Since no augmentations besides normalization should be applied, I think you’d be okay.
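Something like this rough sketch of the tiling idea (assumes the input is already a normalized CxHxW tensor whose height and width divide evenly by the grid, the model is on the GPU, and the raw output is a per-pixel map; tile borders may show seams):

import torch

def predict_tiled(learn, img, rows=2, cols=2):
    # split a CxHxW tensor into a grid of tiles, run them as one batch,
    # then stitch the per-tile outputs back into a full-size prediction
    c, h, w = img.shape
    th, tw = h // rows, w // cols
    tiles = [img[:, i*th:(i+1)*th, j*tw:(j+1)*tw]
             for i in range(rows) for j in range(cols)]
    with torch.no_grad():
        out = learn.model.eval()(torch.stack(tiles).cuda()).cpu()
    # out has shape (rows*cols, n_classes, th, tw); reassemble row by row
    stitched_rows = [torch.cat(list(out[i*cols:(i+1)*cols]), dim=-1)
                     for i in range(rows)]
    return torch.cat(stitched_rows, dim=-2)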

Probably the solution comes from something like this: https://devblogs.nvidia.com/speeding-up-deep-learning-inference-using-tensorrt/


Hello Zach! Thanks for your responses on the forums; I have found them very helpful. I am new to fastai and got stuck with get_preds. Could you offer some help please?

I have built a model classifying pictures; my end goal is to label all 20K pictures I have for further learning.
At the moment I have a list of image paths (images), and I iterate over it to label everything, as follows:

labels = []
for idx, image in enumerate(images):
    label, _, _ = learn.predict(image)   # predict returns (decoded class, class index, probabilities)
    labels.append(label)

This has been very slow. I want to use get_preds, but I couldn’t work out how to go from the list of image paths to predictions.
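My best guess so far, based on the test_dl suggestion earlier in this thread, is something like the following (untested):

dl = learn.dls.test_dl(images)                          # images is the list of paths from above
preds, _, decoded = learn.get_preds(dl=dl, with_decoded=True)
labels = [learn.dls.vocab[int(i)] for i in decoded]     # map predicted indices back to class names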

TIA!

Lu

@feribg @muellerzr Given the large image sizes and the complexity of img2img problems, why can’t we try creating a web app that performs GPU inference? I have usually seen web apps doing only CPU inference for img2img problems. I am personally trying to create a web app using uvicorn + starlette for GPU inference, but I am indeed facing a lot of challenges with that.