Running, Deploying Fastai successes: Links, Tutorials, Stories

I’m interested in learning ways fastai models can be deployed and run on various (edge) systems. Maybe this will be useful for many and turn into a wiki. I have properly done NONE of these myself, but I’m starting by collating, under headings, forum posts (which may link to repos and blog posts) and external links.

Why at the Edge?
Your smart speaker sits there using relatively little power, running a ‘cheap’ microcontroller, locally inferring if the wake word is spoken (Alexa, Siri, Hey Google, Jeremy… if you’ve customized it). Your phone can help with text entry as you type, not a few seconds later. Low latency, lower data transfer (lower power) and increased privacy are just some of the reasons why neural nets at the edge are of interest.


  1. Fastai -> Pytorch -> ONNX
  2. Fastai -> Browser
  3. Fastai -> iOS
  4. Fastai -> Android
  5. Fastai -> Jetsons
  6. Fastai -> Microcontrollers
  7. Fastai -> ONNX elsewhere

Pytorch to ONNX

Fastai is a library built on Pytorch that provides lots of framework, tips and tricks for quickly and flexibly building and training models. The notebooks regularly run predictions or batch inference, but a notebook is not the environment where many models are ultimately intended to be deployed.

What is ONNX? Open Neural Network Exchange
Website | GitHub | Wikipedia

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers

In other words, ONNX is a shareable format for neural network models usable by

  1. Frameworks e.g. Pytorch and Tensorflow can export models to ONNX
  2. Hardware / developer environments e.g. mobile phones (Android, iPhone), TPUs, Jetsons etc.
  3. Languages e.g. Python, Java, C, C++, C# - these can call an ONNX runtime’s API even if the original training framework is not supported directly

fastai tips/tricks/issues?

  • It’s important to realise ONNX only shares the model. If your pipeline relies on input transformations, these transformations will need to be ported to the deployment environment. e.g. this forum post
  • Is flatten() still an issue? Developer chat
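
As a sketch of what “porting the transformations” can look like: the channel-wise normalization fastai applies with ImageNet stats can be reproduced in plain numpy in the deployed environment. The function name and array layout here are my assumptions, not fastai API:

```python
import numpy as np

# Standard ImageNet stats -- what fastai's Normalize uses for pretrained models
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_hwc_uint8):
    """Turn an HWC uint8 image into the normalized NCHW float32 batch the model expects."""
    x = img_hwc_uint8.astype(np.float32) / 255.0  # scale to [0, 1]
    x = (x - MEAN) / STD                          # channel-wise normalization
    return x.transpose(2, 0, 1)[None]             # HWC -> NCHW, add batch dim

batch = preprocess(np.zeros((224, 224, 3), dtype=np.uint8))
```

The resize step needs the same care (see the discussion of PIL vs torchvision resize further down this thread).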

Pytorch to browser

Is tensorflow.js worth mentioning?
Forum Posts

Pytorch to iOS

Forum Posts

Pytorch to Android

Forum Posts

Pytorch to Jetsons

One thing to note with the Jetson: the TX2 and Nano (and presumably the others) run on an AArch64 architecture, which is not supported by Conda. So you want to be familiar with virtualenv and Linux to use them
nano getting started | course
nvidia deep learning frameworks
Forum Posts
I haven’t found a recent forum post - don’t follow old ones!
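
A minimal sketch of the virtualenv route on a Jetson. The path is illustrative, and the right Pytorch wheel depends on your JetPack version, so check NVIDIA's instructions rather than pip's default:

```shell
# Conda has no AArch64 build, so use a plain virtual environment instead
python3 -m venv "$HOME/fastai-env"
# Activate it so subsequent pip installs go into the environment
source "$HOME/fastai-env/bin/activate"
# Note: Pytorch wheels for the Jetson come from NVIDIA rather than PyPI --
# see NVIDIA's "PyTorch for Jetson" forum thread for your JetPack version
```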

Pytorch to Microcontrollers

It feels naughty… but TF Lite (TensorFlow Lite) is the only framework I know of.
But there’s a podcast of this being done on an integer-only ARM M0
Forum Posts

Pytorch to ONNX elsewhere | Blog

Forum Posts


Community - please add suggestions in the comments. As indicated, I’m happy (and hoping) for this to become a wiki for you to edit

Best resource for Custom Edge Inference in iOS would be:

Tips on using Pytorch -> ONNX -> CoreML, as well as creating custom converters, plus optimisations on the CoreML / iOS side.
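
As a hedged sketch of the trace-then-convert route: `coremltools` (version 4+) can convert a TorchScript trace directly with `ct.convert`. The tiny model and shapes below are stand-ins, and the conversion is guarded since coremltools may not be installed:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 2))  # stand-in for your trained model
model.eval()

example = torch.randn(1, 4)
traced = torch.jit.trace(model, example)  # TorchScript is what coremltools consumes

try:
    import coremltools as ct
    # Fixed input shape; ct.RangeDim allows flexible shapes if needed
    mlmodel = ct.convert(traced, inputs=[ct.TensorType(shape=(1, 4))])
    mlmodel.save("model.mlmodel")
except ImportError:
    pass  # coremltools not installed here; the tracing above still works

out = traced(example)
```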


I created a Google Colab sample for converting a CycleGAN generator to CoreML via ONNX.


I needed to train a document image classifier at work, so I used the fastai library to train the model, then exported it using “jit”. (I didn’t install the fastai library in our deploy container because it’s really heavy and has too many dependencies.)
For some reason there was a gap between the results I got when I loaded the images using fastai’s methods and when I did it manually.
After some investigation I found that the gap came from the difference between the “resize” function I used (from the PIL library) and the way the fastai method resizes (using tensors).


Thanks for this. I recall now that Jeremy at some point said jit was unusable, but I did later see him say on Twitter that he’d given it another go and got speed benefits. I’d forgotten.

To include it as ‘help to deploy’ information, what would we need here?
Torch Jit
torch.jit can be used to turn Pytorch/Fastai/Python code into serialized (faster) code that can be called away from Python and its other dependencies (e.g. from a C++ program)
… ???
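
As one possible starting point, a minimal sketch of tracing and serializing a model with torch.jit so it can be loaded without Python (in C++, via `torch::jit::load`). The model and file name are illustrative:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))
model.eval()

example = torch.randn(1, 10)
traced = torch.jit.trace(model, example)  # record the ops run for this input
traced.save("model.pt")                   # serialized TorchScript archive

# The saved archive can be reloaded (from C++: torch::jit::load("model.pt"))
reloaded = torch.jit.load("model.pt")
same = torch.allclose(model(example), reloaded(example))
```

Tracing records one execution path, so models with data-dependent control flow need `torch.jit.script` instead.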

I used Intel OpenVINO for faster inference of my mobilenetv2 on Intel CPU. Reduced my inference time from 200 ms (using naive learn.predict) to 25 ms! Best part, my model was still in FP32.
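
For reference, a hedged sketch of what the OpenVINO runtime call looks like (the `openvino.runtime` Python API from OpenVINO 2022.1+; the model path and input shape are assumptions, and everything is guarded since OpenVINO is an optional install):

```python
import numpy as np

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # NCHW input batch

try:
    from openvino.runtime import Core  # OpenVINO >= 2022.1 Python API

    core = Core()
    model = core.read_model("model.onnx")        # an ONNX model exported from Pytorch
    compiled = core.compile_model(model, "CPU")  # compile for the Intel CPU
    result = compiled([x])                       # run inference
except Exception:
    result = None  # OpenVINO absent, or the model file is missing
```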


For a hint at an example, look at how the fastai2 Mish activation function is implemented. (It uses jit)

For running inference on your GPU from C++ applications I would suggest using OpenCV once you have exported your ONNX model from Pytorch (OpenCV 4.2.0 has CUDA DNN backend).

Turns out the fastbook draft chapter 2 talks about basic deployment using Voila from Jupyter, and Binder for free hosting


I’m pretty sure we can do this fastai-esque in less than 3 years without all having PhDs :slight_smile:

I had the same issue. Use torchvision.transforms.functional.resize; that gave me identical predictions to using fastai’s resize


I just discovered that torchvision's resize is the same as PIL.resize((size), resample=2).

resample=2 resizes using “bilinear” interpolation, whereas by default resample is set to 3, which is “bicubic” interpolation (not “nearest”; nearest is 0).
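
These numeric codes are easy to mix up; they can be checked directly against PIL’s own constants (a quick sanity check, not fastai-specific):

```python
from PIL import Image

# PIL's integer resample codes: 0 NEAREST, 1 LANCZOS, 2 BILINEAR,
# 3 BICUBIC, 4 BOX, 5 HAMMING
codes = {
    "nearest": int(Image.NEAREST),
    "bilinear": int(Image.BILINEAR),
    "bicubic": int(Image.BICUBIC),
}
```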