Productionizing models thread

I am glad you also made it work. Great work there.
If there is interest, I can try to explain the deployment a little better and add more documentation to the README.

Indeed, there is no automatic tree-shaking. When I did it, I had to choose which libraries I would include; that is why I didn’t even include fastai itself, only the PyTorch-related dependencies. I apologize if that caused any trouble. I am not sure if I left any unnecessary dependencies in there; I can double-check.

It is really not possible to fit both the model and the code into the Lambda package size limit. We have to read the model from S3 every time, which can make loading a little slower.
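In case it helps to see the pattern: a minimal sketch of cache-on-first-use loading for a Lambda handler. In the real function the fetch would be boto3’s `download_fileobj` from the S3 bucket and the load would be `torch.load`; here both are passed in as callables (the names `fetch_bytes` and `load_fn` are hypothetical) so the caching logic is self-contained.

```python
import io

# Module-level cache: survives across warm invocations of the same
# Lambda container, so the S3 download only happens on cold start.
_model_cache = None

def get_model(fetch_bytes, load_fn):
    """fetch_bytes() -> raw model bytes (e.g. an S3 download);
    load_fn(file_obj) -> deserialized model (e.g. torch.load)."""
    global _model_cache
    if _model_cache is None:
        _model_cache = load_fn(io.BytesIO(fetch_bytes()))
    return _model_cache
```

On a warm invocation `fetch_bytes` is never called again, which is what keeps per-request latency down even though the weights live in S3.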

Did you remove code by actually deleting the libraries rather than just excluding them from the payload? That may be why I missed it.

I would encourage you to expand on what you’ve written already. I found it very useful, but still struggled (a lot) with actually getting it deployed, partly due to having a slow uplink at the time I was working on it (250MB uploads over 1Mb/s make for slow iterations).

In particular, if you could expand on my list of excludes, I think that would be helpful. And in general, just clean up your repo a bit to make it ‘accurate’ for deploying on Lambda; your latest commit, for example, is attempting to use scratch space rather than loading into memory directly.

I actually spent several hours trying to make that work because I thought you were able to :slight_smile: Glad to know you ended up loading from s3 directly as well.

Thank you for the feedback!
I did not actually remove the libraries manually.
I will update the repository.

Has anyone successfully run fast.ai on an embedded system like a Raspberry Pi? Does it have enough power for deep learning, or can you suggest alternatives?

Thank you in advance.

Does anyone know of a way to uninstall the Zeit Now npm CLI tools? I’d never heard of this company before this class, and I’m hesitant to install something on my machine when I don’t know if, what, or how it’ll talk back to Zeit while I’m not looking.

Apparently a bunch of other people have been having trouble uninstalling both the desktop app and the CLI tools:

Looks like for the CLI you’ll need to hunt for its files manually:

And the desktop app will leave a lot of files as cache behind:


I’m trying this out on a VM, and maybe a cloud VM if that doesn’t work. It’s a hassle, but I don’t want to risk adding junk or security threats to my computer. How safe are these tools to use?


edit: I think one way to uninstall the Zeit Now CLI tools is to remove /usr/local/lib/node_modules/now/, though that’s on Ubuntu Linux; I’m unsure about other OSes, and not sure whether that results in a clean uninstall or leaves other files and config changes behind.


edit2:

Has Zeit deployment changed? I’m running Now from the course deployment docs, and I’m getting failed builds after about a 30-minute wait after entering now.

Sometimes this fails at Step 3/9 (RUN apt install -y python3-dev gcc, using cache); other times it fails at Step 7/9, saying it couldn’t find some numpy umath library.

I’ll update this if I find a solution.


edit3:

I tried again with a fresh start and created a new VM on GCP. Once online, it couldn’t find npm via sudo apt install npm, so I set up the machine with the following (see: stackoverflow link):

curl -sL https://deb.nodesource.com/setup_10.x | sudo -E bash -
sudo apt-get install -y nodejs
sudo apt-get install -y build-essential
sudo apt install npm
sudo npm install -g now
sudo npm i -g --unsafe-perm now

Then following the instructions from the fast.ai deployment tutorial:

wget https://github.com/fastai/course-v3/raw/master/docs/production/zeit.tgz
tar xf zeit.tgz
cd zeit
now

After confirming email and running now again, this was the result:

jupyter@instance-1:~/zeit$ now
> WARN! You are using an old version of the Now Platform. More: https://zeit.co/docs/v1-upgrade
> Deploying ~/zeit under blah@gmail.com
> https://zeit-vybcijlirb.now.sh [v1] [in clipboard] (sfo1) [2s]
> Building…
> Sending build context to Docker daemon  31.74kB
> Status: Image is up to date for python:3.6-slim-stretch
>  ---> Using cache
>  ---> 370bd47378c2
> Step 7/9 : RUN python app/server.py
>  ---> Using cache
> Step 3/9 : RUN apt install -y python3-dev gcc
>  ---> 9799e0e87f00
>  ---> Using cache
>  ---> 68038eb9c796

> Error! Build failed

The deployment site itself looked like this. It hung on Storing image for most of the time until returning the Build failed error in terminal.

@arunoda any thoughts? My guess is the tutorial code/docker was configured for an earlier version of fastai.

Hey @Borz, I have the exact same issue and can no longer deploy apps on Zeit using the command line (mine also ends with ‘Error! Build failed’ each time). I previously deployed the exact same app on Zeit about a week ago with no issues.

When I try to deploy from Zeit’s Now desktop app (instead of typing ‘now’ in the command line), the error says:

"The built image size (1.8G) exceeds the 100MiB limit" (see screenshot below)

Does anyone know how to resolve this? I imagine a change to a library or requirement is driving this size increase, but I don’t have the technical expertise to identify the root cause. The free tier on Zeit requires individual files to be < 100MB.

@pankymathur - thanks for the new guide on deploying apps on AWS Beanstalk and Google App Engine that you mentioned in this post with the new guide here. Unfortunately, I ran into the same issue on AWS (as I did with Zeit) using the starter pack. Any thoughts on a fix for AWS or Zeit?

I think the v1 Zeit Now didn’t have the same 100MB restriction? I may be wrong. FWIW, I think I got a web app deployed via Google App Engine from the course tutorial. Just gotta check on turning it off, since I don’t think it’s free.

I’d be interested if someone gets zeit working; my guess is it requires some fiddling with the app/server.py or some other file, but I’m not diving into that right now.

Thanks. I can confirm that v1 of Zeit Now also had the 100MB (per individual file) restriction.
Originally, my model (.pth file) was ~110MB, and I had to reduce the size of the training data several times until it was finally ~98MB; then Zeit deployed it OK.
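For anyone hitting the same wall, a quick stdlib-only sketch to check whether exported artifacts fit under the per-file limit before uploading. The 100 MB figure is the Zeit limit discussed above; the function name and paths are placeholders of my own, not part of any Zeit tooling.

```python
import os

LIMIT_BYTES = 100 * 1024 * 1024  # Zeit's per-file limit (100 MB)

def oversized_files(paths, limit=LIMIT_BYTES):
    """Return (path, size_in_bytes) pairs for files exceeding the limit."""
    return [(p, os.path.getsize(p)) for p in paths
            if os.path.getsize(p) > limit]
```

Running something like this over your .pth/export files before typing `now` catches the failure locally instead of after a 30-minute build.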

I’d also be interested in zeit if someone can get it working. If there’s no solution, it seems like Google’s App Engine may be the way to go.

I’ve also experienced some issues following the deployment tutorial:


Any ideas how to fix it?

The toy example is great! Thanks for sharing. Any tips on how to modify Jeremy’s tutorial to serve a regression model (e.g. Rossmann) through Now?

When installing fastai using pip, I was getting the ModuleNotFoundError: No module named 'numpy.core._multiarray_umath' message. Downgrading the Bottleneck package to v1.2.0 solved the problem. Try pip install Bottleneck==1.2.0.

Apparently, it is also possible to deploy fastai models using AWS SageMaker. There was a presentation about it recently at the AWS re:Invent conference (see slides here), and two repos connected to the talk:

Has anyone explored this option yet? If so, what was your experience with it?

Thank you for the reply! First of all, for all the people who are as new to this as I am: to change the version of Bottleneck, you need to open requirements.txt in your zeit folder and append the line Bottleneck==1.2.0.
Unfortunately, that didn’t solve the problem. Please find the screenshot below:

After a bit of trial-and-error testing, I was finally able to deploy on Zeit again by pinning the fastai version to 1.0.34 in the requirements.txt file (see below). At the time of this posting, the latest fastai library is 1.0.38.

Does anyone know if there’s a better way to approach the Zeit deployment issue (apps aren’t deploying on recent fastai versions; it may be related to file size limitations on Zeit)?

requirements.txt file:

-f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
torch_nightly
fastai<=1.0.34
starlette
uvicorn
python-multipart
aiofiles
aiohttp

Regarding the ModuleNotFoundError: No module named 'numpy.core._multiarray_umath' issue: it has nothing to do with deployment on Zeit, AWS Beanstalk, or Google App Engine. If you try a local Docker image build, it will throw the same error.

It’s a numpy-related issue tied to fast.ai’s internal numpy usage. So, in order to make the latest version of fast.ai (v1.0.39) work with numpy, the requirements.txt file needs to be updated. I have updated it in my own starter packs and will be submitting a starter pack to the course-v3 repo too.

I have tested my starter packs on AWS Beanstalk with a t3.medium instance and they work fine; I will try to test Azure’s container services and Google App Engine later too.

For Zeit deployment, I am not sure I want to continue using their services, as Zeit has recently made a lot of changes to enforce the 100 MB limit. I was not able to resolve the 100 MB limit issue even though I have an unlimited paid plan, and even when I tried to explicitly use version 1 of the platform via the CLI. They are definitely missing a huge business opportunity here.

In the meantime, try changing these lines in your requirements.txt file, in exactly this order, and let me know if you still face issues.

requirements.txt file:

numpy==1.16.0rc1
fastai
starlette
uvicorn
python-multipart
aiofiles
aiohttp

Hi Dave,

As I mentioned in my post above, I am not sure Zeit is a good option anymore for deploying DL models with Docker images, because of the 100 MB limit.

However, AWS Beanstalk and Google App Engine should not have any such issues, as they don’t impose a 100 MB limit; they are enterprise-grade production services.

There are other issues with AWS Beanstalk, though. For example, sometimes using a smaller EC2 instance creates problems, as does uploading the starter pack zip without compressing it at the root level, or missing any of the details for creating the Beanstalk environment mentioned in my blog post or guide.
None of this documentation is perfect, and deployment is never easy, as packages keep updating and things keep breaking.
So try following the guides again patiently and keep at it, and let me know if you face further errors.

Thanks Pankaj! I appreciate your HUGE help in putting together the production guides as alternatives to Zeit. As someone who’d never created a web app prior to this course, I love being able to show friends and family the “fruits of my labor,” so to speak, by deploying an app, so thank you.

I imagine each service has its pros and cons, and given how frequently libraries change, documentation is especially difficult to keep current. I’ll let you know if I face further errors with any of these services, and thanks again.

Hi everyone,

Thanks a lot for this thread. I was wondering if anyone has had the chance to test PyTorch’s JIT to convert a final ULMFiT model to TorchScript via tracing, as in this tutorial? Some weight sharing forbids it:
TracedModules don't support parameter sharing between modules
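For reference, the basic tracing call that trips over ULMFiT’s tied weights works fine on modules without parameter sharing. A minimal sketch (the tiny `nn.Sequential` here is a stand-in of my own, not the ULMFiT model):

```python
import torch
import torch.nn as nn

# A stand-in module with no parameter sharing, so tracing succeeds.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

example_input = torch.randn(1, 4)
traced = torch.jit.trace(model, example_input)  # records the forward pass

# The traced graph should reproduce the eager outputs.
assert torch.allclose(traced(example_input), model(example_input))
```

ULMFiT ties the encoder embedding and decoder weights, which is exactly the sharing the error message complains about, so tracing as-is won’t work there without untying or restructuring the model first.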

If any of you are struggling with cloud providers for deployment, I’d love for you to try Render. The official guide for fastai-v3 is here: https://course-v3.fast.ai/deployment_render.html

We don’t have any size restrictions on Docker images, and I’m around to answer questions and help with debugging. (I’m the founder/CEO of Render and previously built Crestle).

Thanks for sharing, Anurag. “Render” looks promising at first glance. Let me try to deploy a few quick apps on the service, and then I’ll let you know my feedback.