I’m interested in learning how fastai models can be deployed and run on various (edge) systems. Maybe this will be useful to many and turn into a wiki. I have properly done NONE of these, but I’m starting by collating, under headings, forum posts (which may link to repos and blog posts) and external links.
Why at the Edge?
Your smart speaker sits there using relatively little power, running a ‘cheap’ microcontroller and locally inferring whether the wake word has been spoken (Alexa, Siri, Hey Google, Jeremy… if you’ve customized it). Your phone can help with text entry as you type, not a few seconds later. Low latency, lower data transfer (and so lower power) and increased privacy are just some of the reasons why running neural nets at the edge is of interest.
- Fastai -> PyTorch -> ONNX
- Fastai -> Browser
- Fastai -> iOS
- Fastai -> Android
- Fastai -> Jetsons
- Fastai -> Microcontrollers
- Fastai -> ONNX elsewhere
PyTorch to ONNX
Fastai is a library built on PyTorch that contains a lot of framework code, tips and tricks for quickly and flexibly building and training models. The notebooks regularly run predictions or batch inference, but a notebook is not the end environment where many models are intended to be deployed.
ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
In other words, ONNX is a shareable format for neural network models that can be used across:
- Frameworks, e.g. PyTorch and TensorFlow can export models to ONNX
- Hardware / developer environments, e.g. mobile phones (Android, iPhone), TPUs, Jetsons etc.
- Languages, e.g. Python, Java, C, C++, C#. Code in these languages can call an ONNX runtime through its API, even if the framework the model was trained in is not directly supported
- It’s important to realise that ONNX only shares the model. If your pipeline relies on input transformations, these transformations will need to be ported to the deployed environment, e.g. this forum post
- Is flatten() still an issue? Developer chat
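To make the point about input transformations concrete, here is a minimal sketch of what porting one of them might look like. It assumes the model was trained on RGB images scaled to 0-1 and normalised with the ImageNet statistics (the values fastai exposes as `imagenet_stats`); your own pipeline may differ. The function name `preprocess` is made up for illustration, and resizing or cropping is not covered. It is written in plain Python with no dependencies so that it can be transliterated into whatever language the target device uses.

```python
# ImageNet per-channel statistics in RGB order.
# Assumption: the model was trained with this normalisation.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess(pixels_hwc, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Hypothetical helper: turn an H x W x 3 image of 0-255 ints
    into a normalised 3 x H x W nested list of floats."""
    height = len(pixels_hwc)
    width = len(pixels_hwc[0])
    chw = []
    for c in range(3):
        plane = []
        for y in range(height):
            # Scale to 0-1, then subtract the mean and divide by the std.
            row = [
                (pixels_hwc[y][x][c] / 255.0 - mean[c]) / std[c]
                for x in range(width)
            ]
            plane.append(row)
        chw.append(plane)
    return chw


# A tiny 2 x 2 "image" standing in for real camera or file input.
image = [
    [(255, 0, 128), (0, 0, 0)],
    [(10, 20, 30), (255, 255, 255)],
]
batch = [preprocess(image)]  # add a leading batch dimension
# Sizes in the order: batch, channels, height, width.
print(len(batch), len(batch[0]), len(batch[0][0]), len(batch[0][0][0]))
```

The nested list would still need converting to whatever tensor type the chosen runtime expects. The thing to check is that the deployed preprocessing produces the same numbers as the training pipeline for the same image.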
PyTorch to browser
Is TensorFlow.js worth mentioning?
PyTorch to iOS
PyTorch to Android
PyTorch to Jetsons
One thing to note with the Jetson is that the TX2 and Nano (and presumably the others) run on an AArch64 architecture, which Conda did not support at the time of writing. So you will want to be familiar with virtualenv and Linux to use them.
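As a rough sketch of the virtualenv route, the snippet below checks the machine architecture and creates an isolated environment. It uses Python's built-in `venv` module rather than the separate virtualenv tool, which is a substitution on my part. The helper names `is_aarch64` and `create_env` are hypothetical, and enabling system site packages is an assumption, made so that Python packages already installed system-wide on the board stay importable.

```python
import platform
import venv
from pathlib import Path


def is_aarch64(machine=None):
    """Return True if the (given or detected) machine string is 64-bit ARM."""
    machine = (machine or platform.machine()).lower()
    return machine in ("aarch64", "arm64")


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return the path to its config file.

    system_site_packages=True is an assumption: it keeps packages that are
    installed system-wide visible inside the environment.
    """
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=with_pip)
    builder.create(str(env_dir))
    return Path(env_dir) / "pyvenv.cfg"
```

Calling `create_env` with a directory of your choice and then activating that environment from the shell gives you somewhere to install packages with pip without touching the system Python.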
nano getting started | course
nvidia deep learning frameworks
I haven’t found a recent forum post. Don’t follow old ones!
PyTorch to Microcontrollers
PyTorch to ONNX elsewhere