Following lesson 7, I trained a U-Net that I want to deploy on Google Cloud Functions so that users can upload their photos and run the model on them.
I want to minimise compute and memory and need some advice.
Currently every time a user calls the function, a data loader is created. The architecture is downloaded and then the unet_learner is created using the data and architecture.
learn = unet_learner(data, arch, loss_func=F.l1_loss, blur=True, norm_type=NormType.Weight)
Only after all that are my saved model weights loaded, and only then can I run inference.
I’m using a serverless product, so I’m unable to save anything between user calls. Obviously this is expensive in storage and slow to run. My saved model weights alone are around 500 MB. Is this normal?
Ideally I’d want to be able to save the entire model with architecture and weights so that I can run the model as a single call on single image input. Are there any guides for doing this? Or suggestions for how it could be done?
I’m also trying to minimise the script’s dependencies but haven’t found a good way to do that since the environment has to be created every call as well.
Perhaps a server would be beneficial for my project, but I would still like to know how to make a more compact production build.
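On the cost of rebuilding everything per call: a common serverless pattern is to keep the loaded model in a module-level variable, so that calls served by an already-warm instance skip the setup. This is only a sketch. It assumes the platform reuses an instance between invocations, which may happen but is not guaranteed, and _load_model and handle_request are hypothetical names standing in for the real loading code and entry point.

```python
import time

_MODEL = None  # module-level cache; lives only as long as the instance stays warm


def _load_model():
    """Hypothetical stand-in for the expensive setup (building the learner, loading weights)."""
    time.sleep(0.01)  # simulate slow setup
    return {"loaded_at": time.time()}


def get_model():
    """Load the model on first use and reuse it for later calls in the same instance."""
    global _MODEL
    if _MODEL is None:
        _MODEL = _load_model()
    return _MODEL


def handle_request(image_bytes):
    """Hypothetical entry point: every call in this instance shares the cached model."""
    model = get_model()
    return {"model_id": id(model), "n_bytes": len(image_bytes)}
```

A cold start still pays the full loading cost, so this reduces the average latency rather than the worst case.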
If you want to deploy a fastai model with minimal dependencies, I would suggest moving everything completely to PyTorch.
Also, saving the model with learn.save stores additional optimizer information, which increases the file size, so either pass in the argument with_opt=False or save using PyTorch directly.
The main issue is that you would have to reverse engineer the model definition that you are using for unet_learner. You might have to replicate some of the DynamicUnet fastai code in your codebase.
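As a rough sketch of the plain PyTorch workflow: the architecture lives in your own code, only the weights (no optimizer state) are saved, and at inference time the weights are loaded into a freshly built model. TinyNet here is a hypothetical placeholder, not the DynamicUnet definition; the real class would have to reproduce the layers that unet_learner builds so that the state dict keys match.

```python
import io

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a re-implemented model definition."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)


trained = TinyNet()  # pretend this one has been trained

# Save only the weights; a file path works the same way as this in-memory buffer.
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# At inference time: rebuild the architecture in code, then load the weights on CPU.
model = TinyNet()
model.load_state_dict(torch.load(buffer, map_location="cpu"))
model.eval()

with torch.no_grad():
    x = torch.rand(1, 3, 8, 8)
    out = model(x)
    same = torch.allclose(out, trained.eval()(x))
```

Passing map_location="cpu" matters when the weights were saved from a GPU training run but the deployment machine has no CUDA.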
With a plain PyTorch model, the remaining question is how the image is loaded and passed to the model. Make sure to check the fastai transforms and reimplement normalization and any other steps that convert the image into the tensor the model expects.
The last thing to note is that in your environment you should install only the CPU version of PyTorch. That also saves a lot of time and space during deployment.
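For illustration, the preprocessing could look roughly like the following NumPy sketch. It assumes the model was trained with the standard ImageNet normalization statistics and that the output is normalized the same way; both are assumptions to check against your own training code. The function names preprocess and denormalize are made up for this example.

```python
import numpy as np

# Assumption: training used the standard ImageNet channel statistics.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(rgb_uint8):
    """Convert an HxWx3 uint8 RGB array into a 1x3xHxW float32 batch."""
    x = rgb_uint8.astype(np.float32) / 255.0  # scale to [0, 1]
    x = (x - MEAN) / STD                      # per-channel normalization
    x = np.transpose(x, (2, 0, 1))            # HWC -> CHW
    return x[np.newaxis, ...]                 # add the batch dimension


def denormalize(batch):
    """Invert the normalization for a 1x3xHxW array, returning HxWx3 uint8."""
    x = np.transpose(batch[0], (1, 2, 0)) * STD + MEAN
    return (np.clip(x, 0.0, 1.0) * 255.0).round().astype(np.uint8)
```

The array from preprocess can be handed to the model with torch.from_numpy, and denormalize is only needed if the model's output is itself in normalized space.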
Of course, there is a lot of hassle involved here, but if you really need the most minimal dependencies this may be the easiest approach.
Thanks for the reply. The advice about installing a CPU version is great, since Cloud Functions only provides CPUs and throws errors if you install CUDA versions.
I did a lot more reading and found a comment from Jeremy saying that the fastai dependency shouldn’t be more than a few MB once you remove spaCy (~1 GB), which is only used by fastai’s text module. So I think I can get away with sticking to fastai, although I might follow your advice and implement a minimal PyTorch-only version if my application needs to scale.
For anyone reading, the fastai installation page https://fastai1.fast.ai/install.html details how to install only the core and vision modules required for my application (saving on the text dependencies). I couldn’t find a way to do the custom install via pip, so I will probably zip the package and upload it directly to Cloud Functions.
Also, to save on loading the model and weights, I am now using learn.export(), which exports the model and weights together so they can be loaded directly with load_learner(). When you export, make sure the learner has no transforms you don’t want at inference time, as I believe the transforms applied to your validation set are also applied to the images you run inference on after load_learner.
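For reference, the export and reload steps could be wrapped roughly as below. This is a sketch against the fastai v1 API (learn.export, load_learner, open_image, learn.predict); the helper names export_learner and predict_image, and the default file name passed through them, are just illustrative choices.

```python
from pathlib import Path


def export_learner(learn, file="export.pkl"):
    """Export architecture, weights and inference transforms to a single file."""
    learn.export(file)  # fastai v1 writes the file under learn.path
    return Path(learn.path) / file


def predict_image(model_dir, image_path, file="export.pkl"):
    """Load the exported learner and run it on a single image."""
    # Imported lazily so the export helper can be used without loading fastai.vision here.
    from fastai.vision import load_learner, open_image  # fastai v1 API

    learn = load_learner(model_dir, file)  # no data loader or architecture download needed
    img = open_image(image_path)
    prediction, _, _ = learn.predict(img)
    return prediction
```

With this, the function body on the server reduces to one load_learner call and one predict call per image.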