Deployment Platform: AWS Lambda

A while ago (maybe a year back) I know it didn’t work for text models. Haven’t looked at it since.

Thanks for the reply, Matt.

I suspected it didn’t, as I saw some discussions on the PyTorch forums, but I didn’t really understand the reasons they were giving, so I wasn’t sure if it was an issue for me or not. I ended up starting with the Render starter repo, then scrapping the UI part and writing a small API with FastAPI.

For anyone else who primarily needs a backend web service rather than a fancy UI, I can recommend this approach, as FastAPI gives you a nice Swagger interface with little effort, which makes testing the API endpoints easy.
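To illustrate, a minimal sketch of that kind of FastAPI backend for a fastai v1 image model (the file names, endpoint name, and model path are just placeholders, not taken from the Render repo; file uploads also need the python-multipart package):

# minimal FastAPI inference service; names and paths are illustrative
from io import BytesIO
from fastai.vision import *
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
learn = load_learner('.', 'export.pkl')  # placeholder path; loaded once at startup

@app.post('/predict')
async def predict(file: UploadFile = File(...)):
    img = open_image(BytesIO(await file.read()))
    pred_class, pred_idx, probs = learn.predict(img)
    return {'prediction': str(pred_class), 'confidence': float(probs[pred_idx])}

# run locally with: uvicorn main:app --reload
# the auto-generated Swagger UI is then available at /docs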

Interesting to hear you’re using FastAPI - that’s what we (@aychang) opted for when we created AdaptNLP, which is our fastai-inspired wrapper for Hugging Face and Flair.


Hi everybody,

Many thanks to @matt.mcclean for the guide; it helped a lot.
However, I’m stuck: when testing locally I get an import error for “libtorch.so”.

I’m using a custom Lambda layer, as the links provided do not work for me.
In “create_layer_zipfile.sh” I saw that “libtorch.so” is removed:

rm ./torch/lib/libtorch.so 

Not removing it creates a zip file that exceeds 250 MB (the AWS Lambda limit); removing it gives the error mentioned above.

The only difference from Matt’s code is that I changed the download link in the requirements to torch 1.2.0.

What am I missing?

Hi, I’ve been trying to get this to work but I’m having a problem when Lambda tries to open the model.
Here is the log grabbed from AWS CloudWatch, and it looks like PyTorch can’t read the model properly. I trained this model using PyTorch 1.4 and Python 3.7.
Would it make sense to downgrade to PyTorch 1.1 and rebuild the model from there?

[INFO] 2020-05-07T17:04:57.431Z Loading model from S3

17:04:59
Model file is : res50_stage1_v4.pth

17:04:59
Loading PyTorch model

17:05:00
module initialization error: version_number <= kMaxSupportedFileFormatVersion ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:131, please report a bug to PyTorch. Attempted to read a PyTorch file with version 2, but the maximum supported version for reading is 1. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:131) frame #0: std::func

17:05:00
END RequestId: 8cafea90-765c-4a19-8ce0-8525956ad0ce

17:05:00
REPORT RequestId: 8cafea90-765c-4a19-8ce0-8525956ad0ce Duration: 149.70 ms Billed Duration: 200 ms Memory Size: 3008 MB Max Memory Used: 446 MB

17:05:00
module initialization error
version_number <= kMaxSupportedFileFormatVersion ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:131, please report a bug to PyTorch. Attempted to read a PyTorch file with version 2, but the maximum supported version for reading is 1. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:131)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f6d223d9441 in /tmp/sls-py-req/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f6d223d8d7a in /tmp/sls-py-req/torch/lib/libc10.so)
frame #2: caffe2::serialize::PyTorchStreamReader::init() + 0xed1 (0x7f6d23fc8431 in /tmp/sls-py-req/torch/lib/libcaffe2.so)
frame #3: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::istream*) + 0x48 (0x7f6d23fc90f8 in /tmp/sls-py-req/torch/lib/libcaffe2.so)
frame #4: torch::jit::import_ir_module(std::function<std::shared_ptr<torch::jit::script::Module> (std::vector<std::string, std::allocator<std

17:05:00
/tmp/sls-py-req/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.25.8) or chardet (3.0.4) doesn't match a supported version!

17:05:00
RequestsDependencyWarning)

I had the exact same problem and downgrading to torch 1.1.0 solved it. I think the format of the JIT trace file changed between PyTorch versions.

To downgrade for this export without breaking my fastai2 install, I made a new conda environment as follows:

conda create -n pytorch11 python=3.6 pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0
conda activate pytorch11
conda install -c pytorch -c fastai fastai pytorch=1.1.0 torchvision=0.3.0 cuda100 jupyter jupyterlab
conda install boto3
jupyter notebook password
jupyter notebook --ip=10.0.1.100
python -c "import torch; print(torch.__version__)" # this command should print 1.1.0

I then imported the old model in jupyter notebook with:
from fastai.vision import *
classes = ['Anger', 'Disgust', 'Surprise', 'Sadness', 'Happiness', 'Neutral', 'Contempt', 'Fear']
data = ImageDataBunch.single_from_classes('', classes, ds_tfms=None)
learner = cnn_learner(data, models.resnet34)
learner.load('gokul-sentiment-stage-5n')

… and from here on I followed the description on https://course.fast.ai/deployment_aws_lambda.html from “Export your trained model and upload to S3”.
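For anyone following along, the export step itself looks roughly like this (a sketch based on the course guide, not its exact code; the input size, file name, and bucket/key are placeholders):

# trace the fastai model with TorchScript and upload the result to S3 (names are placeholders)
import torch
import boto3

learner.model.eval()                               # inference mode so BatchNorm/Dropout behave
trace_input = torch.ones(1, 3, 224, 224).cuda()    # match the input size you trained with
jit_model = torch.jit.trace(learner.model.float(), trace_input)
jit_model.save('resnet34_jit.pth')

# upload so the Lambda function can download it at cold start
s3 = boto3.client('s3')
s3.upload_file('resnet34_jit.pth', 'my-model-bucket', 'fastai-models/resnet34_jit.pth')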

Thanks so much!
I’ll give that a shot. To overcome this problem for now, I just went with Google Cloud Functions. It seems slow (800 ms per inference on a resnet50), so I want to give AWS Lambda another try.

Hi, I went through the guide and have already deployed the model to AWS. When I run the log command this problem occurs:

I tried streicher’s solution. It seems there is a problem when exporting the model:

trace_input = torch.ones(1,3,299,299).cuda()
jit_model = torch.jit.trace(learner.model.float(), trace_input)

The error is: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])

Hi @matus66. It may help to check whether your model is running as you expect by attempting an inference before starting the JIT trace. The model needs to be loaded in memory for the JIT trace to map it correctly. The example at https://github.com/fastai/course-v3/blob/master/docs/production/lesson-1-export-jit.ipynb shows how to export a model that has just been trained. I wrote a small Jupyter notebook to import a model that was previously trained by someone else and saved as a .pth file. I attach a PDF showing the notebook run. ModelExport - Jupyter Notebook.pdf (101.9 KB)
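That particular error usually comes from BatchNorm layers still being in training mode while tracing with a batch of one, so putting the model into eval mode and running a quick test inference first tends to resolve it. A rough sketch (the output file name is just an example):

# put the model into eval mode before tracing; BatchNorm fails on batch size 1 in training mode
import torch

learner.model.eval()
trace_input = torch.ones(1, 3, 299, 299).cuda()
with torch.no_grad():
    _ = learner.model(trace_input)      # sanity-check inference before tracing

jit_model = torch.jit.trace(learner.model.float(), trace_input)
jit_model.save('unet_jit.pth')          # placeholder file name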

I have a segmentation model I’m trying to deploy to Lambda; however, it times out when calling the traced TorchScript model(input) function.

I tried retraining the model in a PyTorch 1.1 conda environment as suggested recently; however, that version of PyTorch is unable to trace learners with hooks, which, as I understand it from the lessons, are pretty integral to implementing the cross-connections of a segmentation network.

Has anyone successfully made a PyTorch 1.4 layer for AWS Lambda? Or does anyone have alternative suggestions for things to try?

@matt.mcclean Is it possible to share the build script or source you used to create the original lambda layers? I’ve trained a new model that is incompatible with the old version as well. Thanks for all your great work on this deployment method.

I saw that warning too, but the actual error was a timeout right afterwards. I increased the Timeout to 120 in template.yaml and it worked locally. I’m guessing that’s only the cold startup taking so long, and it seems to be extra slow when testing locally.

I believe the urllib3/chardet warning gets resolved when you train and export your model using PyTorch 1.1. I also resolved it by adding another layer to my Lambda with current versions of those two libraries, but in that case the model fails to load (due to changes between 1.4 and 1.1?).

I’ve tried installing PyTorch 1.4 without CUDA and get a 124 MB zip (which can be uploaded to S3; however, it can’t be used as a layer because after extracting it is over 250 MB). Lambda seems to have some poorly documented behaviour here: they say the zip cannot be over 50 MB and will block direct uploads over that limit, but if you give it an object already in S3 it can be over 50 MB, as long as it is under 250 MB once extracted.
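If it helps anyone, the S3 route for layers can be scripted with boto3 along these lines (layer name, bucket, and key are placeholders; the 250 MB unzipped limit still applies):

# publish a Lambda layer from a zip that already sits in S3 (direct uploads are capped around 50 MB)
import boto3

lam = boto3.client('lambda')
resp = lam.publish_layer_version(
    LayerName='pytorch-1-4-cpu',                     # placeholder layer name
    Content={'S3Bucket': 'my-layer-bucket',          # placeholder bucket
             'S3Key': 'layers/pytorch-1.4-cpu.zip'}, # placeholder key
    CompatibleRuntimes=['python3.7'],
)
print(resp['LayerVersionArn'])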

To get around all this mess, right now I’m looking at training/exporting my models in PyTorch 1.4 or 1.5 and using ONNX to export and run inference in Lambda.
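In case it’s useful to others, the ONNX route looks roughly like this (a sketch; the input size and file names are placeholders, and depending on the layers in your model head you may need to adjust the export options):

# export the trained model to ONNX, then run inference with onnxruntime inside Lambda
import numpy as np
import onnxruntime as ort
import torch

model = learner.model.cpu().eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, 'model.onnx', input_names=['input'], output_names=['output'])

# inside the Lambda handler: onnxruntime is a much smaller dependency than full PyTorch
session = ort.InferenceSession('model.onnx')
img = np.random.rand(1, 3, 224, 224).astype(np.float32)   # stand-in for a real preprocessed image
outputs = session.run(None, {'input': img})
print(outputs[0].shape)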

@mihow @watanabe I made a script for packaging specified versions of PyTorch + Python into a Lambda Layer, along with examples for deploying it with Serverless Framework and Terraform: https://github.com/JTunis/create-pytorch-lambda-layer.

I haven’t gotten to test a bunch of different versions yet, but I’m currently using it with Python3.8 and torch 1.5.1. Let me know if you have issues or suggestions!


@matt.mcclean I’ve been reading through some of the work on building an AWS Lambda layer for fastai. Where exactly did you find the publicly available PyTorch Lambda layer ARNs? I’m trying to track down the source so I know where to look when it’s updated. Thanks in advance!

Edit: I just realized that you may be the owner/publisher of the Lambda Layer. Is this something you created or something you found? Thanks again!

AWS Lambda has just announced support for container images to package your code and dependencies. I have set up an example project using the SAM CLI here: https://github.com/mattmcclean/fastai-container-sam-app.

Container images can be up to 10 GB in size, which gets around the previous issues with the PyTorch package being too large for the Lambda zip file.

Would love to hear your feedback!


Thanks @matt.mcclean for creating the container approach using fastai. I want to make sure I understand the whole process based on what you have written in the GitHub repo, so I am summarizing my understanding here so that I can do it myself. Apologies if some of my questions seem basic.

Before that, I am sharing what I have done, followed by the steps to create the Docker container.

  1. We have created a recommendation model using Collaborative Filtering. We will export the model using learn.export()
  2. Build and deploy your application - I am assuming that this deploys the Docker container that was created using a CloudFormation template and can be accessed by an API call in the Lambda function.

Q1: I need some clarity here: with this API, can we directly call predict on the model, or do we first need to load the model and then call predict?
Q2: For the recommendation model, we need to train the model daily, i.e. once per day before the start of the business day, as we get new data for the previous day which may affect the recommendation output. Can this be handled automatically using the model export?
Q2.1: If the automation is not possible using the Docker container/Lambda, is there an alternative approach to doing the deployment?

Thanks in advance for your help.

Regards
Ganesh Bhat

Hi there, I would recommend using the new approach of bundling the fastai libraries in a Docker container as per the example project here. Lambda layers still have a max limit of 250 MB, which is too small for the fastai + PyTorch libs.


Q1. In the example project on GitHub the model is loaded when the AWS Lambda function execution environment is started, meaning that it is loaded once and can then be called multiple times. See the code snippet here showing the load_learner() function called before the Lambda handler function:

learn = load_learner('export.pkl')

def lambda_handler(event, context):
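For context, a handler body along these lines would complete the snippet above. This is just a sketch, not the exact code from the example repo; the event shape and the 'image' field name are assumptions (an API Gateway proxy event carrying a base64-encoded image):

# hypothetical handler body: decode a base64 image from the request and run a prediction
import base64
import json

def lambda_handler(event, context):
    body = json.loads(event['body'])             # assumes an API Gateway proxy event
    img_bytes = base64.b64decode(body['image'])  # 'image' field name is an assumption
    pred, pred_idx, probs = learn.predict(img_bytes)   # fastai accepts raw image bytes here
    return {
        'statusCode': 200,
        'body': json.dumps({'prediction': str(pred), 'confidence': float(probs[pred_idx])}),
    }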

Q2. If you need to automatically rebuild your model, then I would consider using something like the SageMaker Training service, which you can automate to build your model on a daily basis and save it somewhere like S3. In the example, the fastai code and model are bundled into a Docker container and pushed to the AWS Elastic Container Registry.
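As a rough illustration of that idea (not from the example repo; the entry point, role ARN, instance type, framework versions, and S3 paths are all placeholders and may need adjusting), launching a training job with the SageMaker Python SDK looks something like this, and the same call can be triggered on a daily schedule:

# sketch: kick off a SageMaker training job that writes the trained model back to S3
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point='train.py',                                   # placeholder training script
    role='arn:aws:iam::123456789012:role/SageMakerRole',      # placeholder execution role
    instance_count=1,
    instance_type='ml.m5.xlarge',                             # placeholder instance type
    framework_version='1.6.0',                                # adjust to your PyTorch version
    py_version='py3',
)
estimator.fit({'training': 's3://my-bucket/daily-data/'})     # placeholder S3 input path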

Q3. You could use a service like SageMaker Pipelines or AWS Step Functions to automate the entire sequence of steps to build your model, publish it, and then deploy it.
