Productionizing models thread

Hi Dave,

As I mentioned in my post above, I am not sure Zeit is a good option anymore for deploying DL models with Docker images, because of its 100 MB limit.

However, AWS Elastic Beanstalk and Google App Engine should not have any such issues, as they don’t impose 100 MB limits. They are enterprise-grade production services, so a size limit that small is not a concern.

There are other issues with AWS Elastic Beanstalk, though. For example, using a smaller EC2 instance can sometimes create problems, as can uploading the starter-pack zip without compressing it at the root level, or missing any of the environment-creation details mentioned in my blog post or guide.
None of this documentation is perfect, and deployment is never a matter of easy steps, as packages keep updating and things keep breaking.
So try following the guides again patiently at a high level and keep at it; let me know if you face further errors.

Thanks Pankaj! I appreciate your HUGE help in putting together the production guides as alternatives to Zeit. As someone who had never created a web app prior to this course, I love being able to show friends and family the “fruits of my labor”, so to speak, by deploying an app, so thank you.

I imagine each service has its pros and cons, and given how frequently libraries change, keeping documentation current is especially difficult. I’ll let you know if I face further errors with any of these services, and thanks again.

Hi everyone,

Thanks a lot for this thread. I was wondering if anyone has had a chance to test PyTorch’s JIT in order to convert a final ULMFiT model to TorchScript via tracing, as in this tutorial? Some weight sharing forbids it:
TracedModules don't support parameter sharing between modules
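For anyone who wants to see where this bites, here is a minimal tracing sketch on a toy model (not ULMFiT itself; the model and shapes are illustrative). With ULMFiT, the tied encoder/decoder weights are what trigger the error above:

```python
import torch
import torch.nn as nn

# Toy stand-in model: tracing records the ops executed on the example input.
model = nn.Sequential(nn.Embedding(1000, 64), nn.LSTM(64, 64)).eval()
example = torch.zeros(1, 16, dtype=torch.long)  # (batch, seq_len) token ids

# This succeeds here; on ULMFiT's tied weights it raises the error above.
traced = torch.jit.trace(model, example)
traced.save("traced_model.pt")
```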

If any of you are struggling with cloud providers for deployment, I’d love for you to try Render. The official guide for fastai-v3 is here: https://course-v3.fast.ai/deployment_render.html

We don’t have any size restrictions on Docker images, and I’m around to answer questions and help with debugging. (I’m the founder/CEO of Render and previously built Crestle).

Thanks for sharing, Anurag. Render looks promising at first glance. Let me try to deploy a few quick apps on it, and then I will let you know my feedback.

A bit of a late comment, but this is how we use ULMFiT in production: https://github.com/inspirehep/inspire-classifier
It’s still based on fastai v0.7 (which we might change to v1 soon). We deploy the whole thing to OpenShift and use a REST API for sending text data and getting the classification scores back. It’s slow, as the OpenShift instance is CPU-based, but we are trying to work around that.
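For anyone curious, querying such a service from a client is just an HTTP POST; a minimal sketch (the endpoint URL and payload shape here are made up, the real API is in the repo):

```python
import requests

# Hypothetical endpoint and payload; see the inspire-classifier repo for the real API.
resp = requests.post(
    "https://classifier.example.com/api/predict",
    json={"text": "Title and abstract of the record to classify"},
)
print(resp.json())  # e.g. {"scores": ...}
```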

Of course, I would appreciate any feedback and comments, especially on whether we can do better.

Hello there.

I’d also like to know how to convert the ULMFiT model with TorchScript or ONNX.

What are the benefits of using something like Starlette with async and await rather than e.g. Flask? My understanding of what async and await do is pretty fuzzy, and my team is more comfortable with Flask, so I’m wondering under what circumstances it makes sense to use Starlette.

I think it’s just more lightweight. If you are comfortable with Flask, go with Flask. I am currently experimenting with Responder, which is built on top of Starlette, just because I love the APIs Kenneth Reitz designs.
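For comparison, a complete Starlette endpoint is only a few lines; a sketch (the route and payload are illustrative):

```python
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route

# 'async def' lets the server handle other requests while this handler
# awaits I/O (reading the request body, calling another service, etc.).
async def predict(request):
    payload = await request.json()
    return JSONResponse({"received": payload})

app = Starlette(routes=[Route("/predict", predict, methods=["POST"])])
# run with an ASGI server, e.g.: uvicorn app_module:app
```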

There’s also some potential performance advantage in using async and await, no? I’d like to know how that performance advantage works. (We will need fairly high throughput for a small number of consumers.)

I have a nice little Docker image and web app that are ready to do inference with fastai vision models.

Deployment is as easy as mounting your export.pkl and overriding some environment variables.
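Under the hood, the inference side boils down to loading the exported Learner and calling predict; a minimal sketch, assuming fastai v1 (the mount path and image file are illustrative):

```python
from fastai.vision import load_learner, open_image

learn = load_learner('/models')  # loads /models/export.pkl (illustrative mount path)

img = open_image('cat.jpg')  # hypothetical input image
pred_class, pred_idx, probs = learn.predict(img)
print(pred_class, probs[pred_idx])
```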

I currently have it hosting two mini projects (on a single $10 VPS from DigitalOcean):

Note, if you try this: fastai models trained on <=1.0.44 won’t work on fastai >=1.0.46, so you need to be sure to use the right version of the Docker image. Additionally, Docker Hub’s automated build system is broken at the moment; I have been building and pushing manually, but you may want to build your own.

I have an existing .pth file trained for image captioning. I want to deploy it in Flask just to caption images. How can I do this?

I have just published an AWS Lambda deployment guide here. Feel free to check it out and give feedback.

Hello @nok.
Can you please tell me how you pushed this app into production?
I know where the .ml domain comes from; I have a .tk one myself.
But how are you doing the predictions and everything with a UI on a .ml domain?
Are you forwarding it to another website?
I have been stuck on productionizing my idea for weeks. A little help would be appreciated.
Thanks 🙂

Hello everyone. I have to productionize a PyTorch BERT question-answering model. CPU inference is very slow for me, as the model needs to evaluate 30 samples for every query; out of those 30 results, I pick the answer with the maximum score. A GPU would be too costly for me to use for inference.

Can I leverage multi-core CPU inference for this? (One possible approach is sketched below.)
If yes, what is the best practice for doing so?
If no, is there a cloud option that bills me only for the GPU queries I make, and not for continuously running a GPU instance?
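One possibility, as a hedged sketch rather than a definitive answer: raise PyTorch’s intra-op thread count and batch the 30 candidates into a single forward pass (nn.Linear below is a toy stand-in for the BERT scorer, and the shapes are made up):

```python
import torch
import torch.nn as nn

torch.set_num_threads(8)  # let PyTorch use more CPU cores for intra-op parallelism

# Toy stand-in for the QA scorer: batching all 30 candidates into one
# forward pass is usually much faster on CPU than 30 separate calls.
model = nn.Linear(768, 1).eval()
batch = torch.randn(30, 768)  # hypothetical: 30 pre-encoded candidate samples

with torch.no_grad():
    scores = model(batch).squeeze(1)
best_idx = scores.argmax().item()
print(best_idx, scores[best_idx].item())
```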

Ditto. Relevant discussion of the error in PyTorch: https://github.com/pytorch/pytorch/issues/7904

Hi, is there a special fastai installation for inference only (possibly a CPU-specific fastai), where I would just use the learn.predict(img) method or similar?

I guess the best I can get is pip3 install --no-deps fastai, but I just wanted to confirm?

Hi, thanks for sharing!

On the OpenVINO and Neural Compute Stick websites, I found that the supported frameworks are ONNX, TF, Caffe, and MXNet, but PyTorch is not mentioned. To convert PyTorch -> ONNX -> OpenVINO, I found this workflow:
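(Roughly, the PyTorch -> ONNX step of such a workflow looks like the sketch below, with a toy model standing in for the real one; names and shapes are illustrative. The resulting .onnx file is then fed to OpenVINO’s Model Optimizer.)

```python
import torch
import torch.nn as nn

# Toy stand-in; in practice this is the trained PyTorch/fastai model.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()
dummy = torch.randn(1, 3, 224, 224)  # an input of the shape the model expects

# Export to ONNX; OpenVINO's Model Optimizer can then convert model.onnx to IR.
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["output"])
```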

My question is: is this the same approach you followed to convert the fastai model, or is there another approach?

Many thanks!

Hi Anurag, do you have any Flask examples? I have a Flask app successfully running on my local host, and I'm trying to understand how to upload it to Render. Thanks.

A really simple Flask quickstart: https://github.com/render-examples/flask-hello-world
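The whole app in that style of quickstart is only a few lines; a minimal sketch (module and entry-point names are illustrative):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello, World!"

# On Render this is typically served with gunicorn, e.g. "gunicorn app:app".
```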
