I’m working with a friend trying to follow the fast.ai tutorial for AWS deployment, but instead of image classification we are implementing object detection. Unfortunately, we are stuck at the last step:
In this tutorial, fast.ai is only used to build the model. For the prediction inside AWS, the code calls some PyTorch functions to load the model and then obtain the image class. In this case it is sufficient to load all the libraries from a public Lambda Layer that contains PyTorch, but does NOT contain fast.ai.
Now, the object detection functions that Jeremy wrote need to call some fast.ai functions. This means that we have to load the library into the Lambda instance. Our first approach was to upload the fast.ai library as part of the execution code, but we couldn’t isolate it from the PyTorch Layer when building the project. We uploaded the whole package (which contained both fast.ai and PyTorch), but it was too heavy for the Lambda instance to load.
Then we tried to create a Layer ourselves that contained fast.ai and PyTorch. When deployed, it weighs 640 MB, more than the 500 MB allowed (see “Q: What if I need scratch space on disk for my AWS Lambda function?”). When running a test, the logs show this error: “module initialization error: [Errno 28] No space left on device”.
The next step is to try to manually separate the fast.ai dependencies and upload some of them in the Lambda Layer and some as part of the execution code, so that the load of the libraries is divided across different folders. We’re afraid that this will take too long to achieve, and we’re not even sure that it will actually work.
We have managed to run our code locally using Docker, so we are sure that the program runs. This issue is the only thing holding us back. Any question and/or suggestion is very welcome!
Did you just install all the fastai dependencies? I’d first look to just remove some of the things you are unlikely to need. In conda you can, e.g., run
conda remove --force spacy (spacy is 362 MB in my environment, and it’s unlikely you need it for object detection, so that’d be my first try). There doesn’t seem to be an equivalent option in pip, and I’m not sure whether a plain uninstall will just work without complaining. Or, probably better for reliable distribution as well, would be to explicitly install the direct fastai dependencies you need (pulling in sub-dependencies) and then install fastai with
pip install --no-deps fastai.
du -d 3 -t 20M .my-env/lib/python3.7/site-packages | sort -k 1rh quickly showed things to look at. Or https://medium.com/@mojodna/slimming-down-lambda-deployment-zips-b3f6083a1dff has a nice, if more extreme, method assuming you can easily trigger all the things you’ll do in deployment (and that you don’t have a filesystem mounted with noatime).
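If you’d rather do the size hunt in Python than with du, here is a rough equivalent sketch (the site-packages path is just the one from the du example above; adjust it to your environment):

```python
import os
from pathlib import Path

def dir_size_mb(path):
    """Total size in MB of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total / (1024 * 1024)

def biggest_packages(site_packages, min_mb=20):
    """Top-level entries in a site-packages dir above min_mb, largest first."""
    sizes = []
    for entry in Path(site_packages).iterdir():
        if entry.is_dir():
            mb = dir_size_mb(entry)
            if mb >= min_mb:
                sizes.append((entry.name, mb))
    return sorted(sizes, key=lambda t: t[1], reverse=True)

# Path from the du example; change to your own environment.
sp = ".my-env/lib/python3.7/site-packages"
if os.path.isdir(sp):
    for name, mb in biggest_packages(sp):
        print(f"{mb:8.1f} MB  {name}")
```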
Did you just install all the fastai dependencies?
I think I did… the problem is that I install fastai inside the Lambda instance through a “requirements.txt” file, in which I specify the name of the library, and it creates the correct deployment environment for the function. I don’t know how to install only some of the dependencies using this method.
But still, I will build from scratch following the advice you gave me and try to imitate this environment. Hopefully it will work. I’ll report the outcome later!
Haven’t used Lambda so not much help there. I saw information on creating a layer from a virtual environment (it looked like you basically just zip up site-packages). Or, basically equivalently, you
pip install --target=<some-dir> (I think it was; check the AWS/pip docs) and then zip that up (or use a tool that zips it and adds metadata; again, check the docs). The
--target option gives pip an alternate site-packages folder to use.
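As a sketch of that zip-up step in Python (the directory and file names here are made up; and note that, per the AWS docs, a Python Lambda layer expects its packages under a top-level python/ folder inside the zip — double-check that against the current documentation):

```python
import os
import zipfile

def zip_layer(target_dir, zip_path, prefix="python"):
    """Zip a pip --target directory into a Lambda-layer-style archive.
    Every file is stored under prefix/, since Python layers are
    unpacked from a top-level python/ folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(target_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, target_dir)
                zf.write(full, os.path.join(prefix, rel))

# Usage, after something like: pip install --target=layer-packages --no-deps fastai
# zip_layer("layer-packages", "fastai-layer.zip")
```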
You should be able to just append the
--target directory to your Python path: stick it in the
PYTHONPATH environment variable (or
PYTHON_PATH, but I think it’s the first; a quick search will confirm). Or, alternately, in ipython/jupyter you can just
import sys; sys.path.append('some-dir'). Then Python will also load packages from there. That way you can install PyTorch and the stuff covered by existing layers in one place, and the custom layer stuff in another, and fairly easily test and add things as needed.
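A quick way to convince yourself the sys.path route works (the directory and module names here are invented for illustration; in a real layer the directory would be where the layer is mounted):

```python
import importlib
import os
import sys
import tempfile

# Throwaway directory standing in for the pip --target folder.
extra_dir = tempfile.mkdtemp()

# Pretend a package was installed there (pip --target would do this).
with open(os.path.join(extra_dir, "mylayerpkg.py"), "w") as f:
    f.write("VALUE = 42\n")

# Append the directory and the module becomes importable.
sys.path.append(extra_dir)
mod = importlib.import_module("mylayerpkg")
print(mod.VALUE)  # prints 42
```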
You could just
pip install --no-deps fastai and then add things one at a time until it works. Though you probably want to install many of the small things together, or that could take you a while.
This is super helpful, I will definitely try it!
Also, I found a way to install only the needed dependencies for certain functionality in the installation tutorial.
Hi Jorge, please let us know if you achieved it!