Productionizing models thread

A bit of a late comment, but this is how we use ULMFiT in production: https://github.com/inspirehep/inspire-classifier
It’s still based on fastai v0.7 (which we might change to v1 soon). We deploy the whole thing to OpenShift and use a REST API to send text data and get the classification scores back. It’s slow because the OpenShift instance is CPU-only, but we are trying to work around that.

Of course, I would appreciate any feedback and comments, especially on whether we can do better.

Hello there.

I’d also like to know how to convert a ULMFiT model with TorchScript or ONNX.

What are the benefits of using something like Starlette with async and await rather than e.g. Flask? My understanding of what async and await do is pretty fuzzy, and my team is more comfortable with Flask, so I’m wondering under what circumstances it makes sense to use Starlette.

I think it’s just more lightweight. If you are comfortable with Flask, go with Flask. I am currently experimenting with Responder, which is built on top of Starlette, just because I love the APIs Kenneth Reitz designs.

There’s also some potential performance advantage to using async and await, no? I’d like to understand how that performance advantage works. (We will need fairly high throughput for a small number of consumers.)
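
Roughly speaking, `async`/`await` buys you concurrency during I/O: while one request is awaiting its body or a downstream call, the event loop can serve other requests. The model call itself is CPU-bound and still blocks, so the benefit shows up in request handling rather than in the prediction. A minimal Starlette sketch, assuming a fastai v1 learner exported as `export.pkl` (the path and endpoint are illustrative, not from this thread):

```python
from pathlib import Path

from fastai.basic_train import load_learner
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route

learn = load_learner(Path("models"), "export.pkl")  # assumed location

async def predict(request):
    # Reading the request body is I/O: while this coroutine awaits it,
    # the event loop is free to handle other requests.
    payload = await request.json()
    # The prediction itself is CPU-bound and blocks this worker.
    pred_class, pred_idx, probs = learn.predict(payload["text"])
    return JSONResponse({"class": str(pred_class), "scores": probs.tolist()})

app = Starlette(routes=[Route("/predict", predict, methods=["POST"])])
```

You run it with an ASGI server such as `uvicorn your_module:app`. Flask is WSGI-based and typically gets concurrency from multiple worker processes or threads (e.g. under gunicorn), which is also a perfectly workable setup.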

I have a nice little Docker image and web app that is ready to do inference on fastai vision models.

Deployment is as easy as mounting your export.pkl and overriding some environment variables.

I currently have it hosting two mini projects (on a single $10 VPS from DigitalOcean):

Note: if you try this, fastai models trained with <=1.0.44 won’t work on fastai >=1.0.46, so you need to be sure to use the right version of the Docker image. Additionally, Docker Hub’s automated build system is broken at the moment; I have been building and pushing manually, but you may want to build your own.

I have an existing .pth file trained for image captioning. I want to deploy it in Flask just to caption images. How can I do this?
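
A minimal sketch of what that could look like. Heavy assumptions here: the way the model is reconstructed from the `.pth` file and the `generate_caption` call are hypothetical placeholders for whatever your captioning code already does; only the Flask plumbing and `torch.load` are standard.

```python
import io

import torch
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

# Hypothetical: how you rebuild the model depends on how the .pth was saved.
# If it only holds a state_dict, you need the original model class and then
# model.load_state_dict(torch.load(...)) instead of this one-liner.
model = torch.load("captioning_model.pth", map_location="cpu")
model.eval()

@app.route("/caption", methods=["POST"])
def caption():
    # Expect the image as a multipart upload under the "image" field.
    file = request.files["image"]
    img = Image.open(io.BytesIO(file.read())).convert("RGB")
    with torch.no_grad():
        # Hypothetical call: replace with your model's preprocessing + decoding.
        text = model.generate_caption(img)
    return jsonify({"caption": text})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```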

I have just published an AWS Lambda deployment guide here. Feel free to check it out and give feedback.

Hello @nok.
Can you please tell me how you pushed this app into production?
I know where the .ml domain comes from; I have a .tk one myself.
But how are you doing the predictions and everything, with a UI, on a .ml domain?
Are you forwarding it to another website?
I have been stuck on production for my idea for weeks. A little help would be appreciated.
Thanks :)

Hello everyone. I have to productionize a PyTorch BERT question-answering model. CPU inference is very slow for me, because for every query the model needs to evaluate 30 samples, and out of those 30 results I pick the answer with the maximum score. A GPU would be too costly for me to use for inference.

Can I leverage multi-core CPU inference for this?
If yes, what is the best practice for doing so?
If no, is there a cloud option that bills me only for the GPU queries I make and not for continuously running a GPU instance?
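
One direction worth trying, sketched under assumptions (a Hugging Face `transformers` QA model with the v4-style API, and 30 independent passages per question): batch all 30 samples into a single forward pass and let PyTorch’s intra-op threading use all CPU cores, instead of running 30 separate forward passes.

```python
import os

import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Pin intra-op parallelism to the available cores (often the default,
# but worth setting explicitly in a server process).
torch.set_num_threads(os.cpu_count())

name = "bert-large-uncased-whole-word-masking-finetuned-squad"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForQuestionAnswering.from_pretrained(name).eval()

def best_answer(question, passages):
    # One batched forward pass over all candidate passages.
    inputs = tokenizer([question] * len(passages), passages,
                       padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # Score each passage by its best start/end logits, keep the maximum.
    start_scores, start_idx = out.start_logits.max(dim=1)
    end_scores, end_idx = out.end_logits.max(dim=1)
    best = int((start_scores + end_scores).argmax())
    s, e = int(start_idx[best]), int(end_idx[best])
    span = inputs["input_ids"][best][s:e + 1]
    return tokenizer.decode(span, skip_special_tokens=True)
```

Dynamic quantization (`torch.quantization.quantize_dynamic`) is another common CPU speed-up for BERT-sized models, at a small accuracy cost.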

Ditto. Relevant discussion of the error in PyTorch: https://github.com/pytorch/pytorch/issues/7904

Hi, is there any special fastai installation for inference only (possibly a CPU-specific fastai) where I would just use the learn.predict(img) method or so?

I guess the best I can get is pip3 install --no-deps fastai, but I just wanted to confirm?
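
For what it’s worth, `--no-deps` means PyTorch, torchvision, and fastai’s other dependencies must already be installed (e.g. the CPU-only build of torch); fastai itself has no separate CPU variant. Inference then looks the same as usual. A minimal fastai v1 sketch, with the model location assumed:

```python
from pathlib import Path

from fastai.basic_train import load_learner
from fastai.vision import open_image

# Load the Learner exported earlier with learn.export(); it runs on CPU
# automatically when no GPU is available.
learn = load_learner(Path("models"), "export.pkl")

img = open_image("some_image.jpg")
pred_class, pred_idx, probs = learn.predict(img)
print(pred_class, probs)
```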

Hi, thanks for sharing!

On the OpenVINO and Neural Compute Stick websites, I found that the supported frameworks are ONNX, TensorFlow, Caffe, and MXNet, but PyTorch is not mentioned. To convert PyTorch -> ONNX -> OpenVINO, I found this workflow:

My question is: is this the same approach you followed to convert the fastai model, or is there another approach?

Many thanks!
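
For the first step of that workflow (PyTorch -> ONNX), here is a minimal sketch. It assumes a fastai v1 learner whose underlying PyTorch module is `learn.model`, and a 3x224x224 input; both the export location and the input shape are assumptions to adapt:

```python
from pathlib import Path

import torch
from fastai.basic_train import load_learner

learn = load_learner(Path("models"), "export.pkl")  # assumed location
model = learn.model.eval().cpu()

# A dummy input with the expected shape drives the tracing-based export.
dummy_input = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=11,
)
```

The resulting model.onnx is then what you feed to OpenVINO’s Model Optimizer to produce the IR files.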

Hi Anurag, do you have any Flask examples? I have a Flask app running successfully on my local host and am trying to understand how to upload it to Render. Thanks.

A really simple Flask quickstart: https://github.com/render-examples/flask-hello-world

Thanks Anurag, I appreciate the help and look forward to leveraging Render!

I am trying to change the model_dir of cnn_learner, because if I use the default path I get a warning about a read-only file system on Kaggle, and I want to export to tmp/models/. How can I change the path, or am I doing something wrong?

You can pass model_dir=Path(bla) in the arguments of cnn_learner, with bla pointing to a directory you can write to.

Thanks for the help. I did it like that and it worked. But now how can I change the path of learn.export, with fname or Path()? Should I read the documentation?

In learn.export, you specify the location you want with learn.export(file = Path(bla)), where bla is a file name (with its full path) ending in .pkl.
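
Putting the two answers together, a short sketch as it might look in a Kaggle kernel (the writable directory /kaggle/working, the dataset path, and the architecture are assumptions):

```python
from pathlib import Path

from fastai.vision import (ImageDataBunch, cnn_learner, error_rate,
                           get_transforms, models)

out_dir = Path("/kaggle/working")

# Hypothetical dataset location; the input folders on Kaggle are read-only.
data = ImageDataBunch.from_folder(Path("../input/some-dataset"),
                                  ds_tfms=get_transforms(), size=224)

# model_dir controls where learn.save()/learn.load() read and write weights,
# so point it at a writable location instead of the read-only dataset folder.
learn = cnn_learner(data, models.resnet34, metrics=error_rate,
                    model_dir=out_dir / "models")

learn.fit_one_cycle(1)
learn.save("stage-1")                        # -> /kaggle/working/models/stage-1.pth
learn.export(file=out_dir / "export.pkl")    # inference bundle at a full path
```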
