Deploy PyTorch Models into Kubernetes Production in a Few Minutes

Hey guys, I always had a tough time deploying my models using Flask + Gunicorn + Nginx. It requires a lot of setup time and configuration. Furthermore, inference through Flask is slow and requires custom code for caching and batching, and scaling across multiple machines with Flask causes many complications. To address these issues, I’m working on Panini.

https://panini.ai/ can deploy PyTorch models into Kubernetes production within a few clicks and make your model production-ready for real-world traffic with very low latency. Once your model is deployed on Panini’s servers, you get an API key for inference. Panini’s query engine is written in C++, which keeps latency very low during model inference, and the model is served from a Kubernetes cluster, so it scales across multiple nodes. Panini also takes care of caching and batching inputs during inference.
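For example, querying a deployed model from Python could look roughly like this (a sketch only: substitute the endpoint URL and API key you receive after deployment, and note that the Authorization header name here is illustrative, not documented behavior):

```python
import requests

# Placeholders: use the endpoint and key Panini gives you after deployment.
URL = "https://api.panini.ai/<your-app-id>/predict"
API_KEY = "<your-api-key>"

# Send the raw image bytes, matching the "bytes" input type.
with open("cat.jpg", "rb") as f:
    response = requests.post(URL, data=f.read(), headers={"Authorization": API_KEY})

print(response.json())
```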

This is the internal architecture once your model is deployed:

Here is a Medium post to get started: https://towardsdatascience.com/deploy-ml-dl-models-to-production-via-panini-3e0a6e9ef14

Let me know if you guys find this helpful. If you’re having a hard time deploying your model, email me.


Hi @avinregmi, I tried your service, but somehow it doesn’t work. I used your cat-and-dog example; after I deployed it, the process hangs and never shows an API key. “Pushing your model image to Kubernetes Cluster!” is the last line of the log:

---> ddda23a43034
Step 10/10 : ENTRYPOINT ["/container/container_entry.sh", "py-closure-container", "/container/python_closure_container.py"]
---> e8341d2dfefe
Successfully built e8341d2dfefe
Successfully tagged isvzqfgqgkfcmp1ioc6f6xq6rdo2-catdog4:latest
Your application isvzqfgqgkfcmp1ioc6f6xq6rdo2-catdog4 endpoint was sucessfully registered. Building your model. Please wait few minutes!
Pushing your model image to Kubernetes Cluster!

Hey @cahya, thanks for trying Panini. Did you try recently or a few days ago? We were upgrading our backend, so the service might have been down a few days ago. I just tried it right now and it works for me. Did you upload main.py, requirements.txt, and last_layers.pth? Try uploading those three files and select bytes as the input type. It should work. Here is the link I got when I uploaded those three files: https://api.panini.ai/gu0wrjnpy4cfesmbq4r6fiqxpjo2-cahya/predict
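For reference, a minimal main.py for the cat-and-dog example could look roughly like this (a sketch only; the predict name, its bytes-in signature, and the use of strict=False are assumptions for illustration, so check the tutorial repo for the actual files):

```python
import io

import torch
from PIL import Image
from torchvision import models, transforms

# Rebuild the network and load the fine-tuned weights shipped as last_layers.pth.
# strict=False because, judging by the file name, only the retrained layers
# were saved (an assumption, not confirmed in this thread).
model = models.resnet18(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # two classes: cat, dog
model.load_state_dict(torch.load("last_layers.pth", map_location="cpu"), strict=False)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def predict(raw_bytes):
    # The input arrives as raw bytes because "bytes" was selected as the input type.
    image = Image.open(io.BytesIO(raw_bytes)).convert("RGB")
    batch = preprocess(image).unsqueeze(0)
    with torch.no_grad():
        logits = model(batch)
    return ["cat", "dog"][logits.argmax(dim=1).item()]
```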

Let me know if you have further questions!

Yes, I tried it 15 minutes ago and uploaded those 3 files.

Hey @cahya, I think I found the issue. I’m assuming you were following my Medium guide, which needs to be updated to the latest documentation. I noticed you uploaded a file called predict.py, which was for the older version. Please go to my GitHub docs, https://github.com/avinregmi/Panini_Tutorial/tree/master/PyTorch_CNN_Production, download the latest three files (main.py, requirements.txt, and last_layers.pth), and upload those instead.

Let me know how it goes. I need to update the Medium posts. Sorry about that.

Ok thanks, I will try it again.

Ok, I got the API URL. I will test it later. Thanks for your prompt help.

I’m glad it worked for you. Let me know how it goes and I would love your feedback. If you can think of anything to make the service better, please let me know.

Hi, I tested the API, and it works properly. Now I want to upload my application, which has a directory structure; how can I do that? Would it be possible to package all the needed files, including any directories, into a zip/gzip archive that the Panini server then uncompresses? That would be similar to deploying a Java web application as a WAR file.
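Something like this on the client side is what I have in mind (just a sketch, with a hypothetical project directory name):

```python
import shutil

# Bundle the whole project directory, subdirectories included, into app.zip;
# the server would then need to unpack it before running main.py.
shutil.make_archive("app", "zip", root_dir="my_project/")
```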