Resnet34 to onnx for the MNIST dataset

draz · May 22, 2018, 11:16pm

Based on the Lesson 1 code, I want to train the pretrained resnet34 on the MNIST dataset and convert the result to ONNX. I tried the following without success. My code is here:

from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *

PATH = "data/mydata/"   # supposed to be the MNIST
sz=224

torch.cuda.is_available()
torch.backends.cudnn.enabled

arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 2) 

from torch.autograd import Variable
import torch.onnx
import torchvision

Model = learn.model
dummy_input = Variable(torch.randn(1, 3, sz, sz))

torch_out = torch.onnx._export(
                  Model
                  , dummy_input
                  , "Model__v00.onnx"
                  , verbose=True
                  , export_params=True
        )

The bottom error I get is:

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py in _check_input_dim(self, input)
    121         if input.dim() != 2 and input.dim() != 3:
    122             raise ValueError('expected 2D or 3D input (got {}D input)'
--> 123                              .format(input.dim()))
    124
    125

ValueError: expected 2D or 3D input (got 4D input)

What do I miss?
Thank you!

draz · May 23, 2018, 7:36pm

Could anyone help?
Thanks!

draz · May 23, 2018, 10:57pm

My main questions are:
a. Is it correct to export learn.model, or is the trained model stored under some other attribute?
b. If the size of dummy_input is not correct, what is the right size for it?
Thank you!

ramesh · May 24, 2018, 1:12am

@draz - I posted a gist for a toy example to convert to ONNX - https://gist.github.com/sampathweb/7b6fbb35095835e9f5a12136f7494197

Please note that, as specified in this post (Using a fast.ai model in production), we can’t export Fast.AI models to ONNX right now because of the AdaptiveMaxPool and AdaptiveAvgPool layers. This was an issue with PyTorch 0.3; I have not tested with PyTorch 0.4.

draz · May 24, 2018, 3:39am

@ramesh Thank you for your example!
Regarding the model that I train above, if I am not wrong it has no Pool layer:

learn.summary

<bound method ConvLearner.summary of Sequential(
(0): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): Dropout(p=0.25)
(2): Linear(in_features=1024, out_features=512, bias=True)
(3): ReLU()
(4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): Dropout(p=0.5)
(6): Linear(in_features=512, out_features=10, bias=True)
(7): LogSoftmax()
)>

ramesh · May 24, 2018, 4:08am

It has an adaptive pooling layer. Try learn.models.model and you will see it. learn.summary only gives the top layers of the model.
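This likely also explains the ValueError in the first post: the head printed above starts with BatchNorm1d(1024), so it expects a flat batch of 1024 features, not a 4D image tensor. A minimal sketch in plain PyTorch, where `head` is a reduced stand-in for that head (an assumption for illustration, not the fastai object itself):

```python
import torch
import torch.nn as nn

# Reduced stand-in for the head shown by learn.summary (hypothetical, not the fastai object).
head = nn.Sequential(
    nn.BatchNorm1d(1024),
    nn.Linear(1024, 10),
).eval()  # eval mode so BatchNorm accepts a batch containing a single sample

image_batch = torch.randn(1, 3, 224, 224)  # shape of the dummy input passed to the export call
feature_batch = torch.randn(1, 1024)       # shape the head actually expects

# Feeding the 4D image batch to the head fails inside BatchNorm1d.
try:
    head(image_batch)
    error_message = None
except ValueError as err:
    error_message = str(err)

# Feeding flat features works.
with torch.no_grad():
    logits = head(feature_batch)

print(error_message)
print(tuple(logits.shape))
```

If this reading is right, a (1, 3, sz, sz) dummy input only makes sense for the full network (convolutional body plus head), not for the head alone.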

draz · May 24, 2018, 6:38am

This is correct. Thanks!

draz · May 24, 2018, 5:00pm

In this example, there is a MaxPool layer. How could this ONNX conversion work? Is the problem strictly related to the AdaptiveMaxPool and AdaptiveAvgPool layers?

Also, it seems that the current FastAI runs with PyTorch 0.4?

import torch
print(torch.__version__)
0.4.0

ramesh · May 25, 2018, 5:56am

Yes, the problem is strictly related to AdaptiveMaxPool and AdaptiveAvgPool, because they are dynamic in size and can reduce any h x w input to any output size you specify (mostly 1x1).
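To make "dynamic in size" concrete, here is a rough pure-Python sketch of how an adaptive pool picks its windows along one axis (the 2D layers do the same per axis). The function names are made up for illustration, and the index formula is my understanding of how PyTorch computes the windows, so treat it as an assumption:

```python
def adaptive_windows(input_size, output_size):
    """Return the (start, end) index range pooled into each output cell."""
    windows = []
    for i in range(output_size):
        start = (i * input_size) // output_size                          # floor
        end = ((i + 1) * input_size + output_size - 1) // output_size    # ceil
        windows.append((start, end))
    return windows


def adaptive_avg_pool_1d(values, output_size):
    """Average each window; the window size depends on len(values)."""
    return [
        sum(values[start:end]) / (end - start)
        for start, end in adaptive_windows(len(values), output_size)
    ]


# The same layer setting (output size 1) covers whatever input length it gets.
print(adaptive_windows(7, 1))
print(adaptive_windows(10, 1))
# With a larger output size the windows are derived from the input length and may overlap.
print(adaptive_windows(7, 3))
print(adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6, 7], 1))
```

A regular MaxPool or AvgPool has its kernel size fixed when the layer is created, which is why it exports without this problem.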

There are a few elements in Fast.AI (particularly the NLP parts) that might have issues. Otherwise, the majority might work with PyTorch 0.4.

draz · May 25, 2018, 2:38pm

@ramesh
Then is it correct to say that exporting Fast.AI to ONNX does not work with PyTorch 0.4 either? The main question of this topic is the error I get at the very end when I use torch.onnx._export, and all of this is running on PyTorch 0.4.

My overall goal is to create a TensorRT representation of my Fast.AI models. Apart from this puzzling ONNX conversion, is there any other solution that you would suggest? To my understanding, the solutions you suggested earlier are:

a. Don’t use a pre-trained Fast.AI model; train and export your own, like the LeNet example you posted.
b. And/or wait until the issue with these adaptive layers gets resolved.

I would add these:

c. Find another way to convert a Fast.AI model to TensorRT. Any good suggestions on this?
d. Or replace AdaptiveMax/AvgPool with GlobalMaxPool. If this is a solution, how can I do it? Btw, there is no globalpool-like class in torch.nn.modules.pooling.

Thank you!

PrinceP · September 27, 2018, 12:58pm

Check out the below repository by Nvidia.

Once the weights are prepared, the C++ code uses the nvinfer1 library of TensorRT to quantize the model.
I benchmarked the model on a Jetson TX2.

The numbers for YOLOv2 on Jetson TX2 with the plugin

Network Type: yolov2, Precision: kFLOAT, Batch Size: 1, Inference time per image: 61.8683 ms
Network Type: yolov2, Precision: kFLOAT, Batch Size: 4, Inference time per image: 62.3329 ms
Network Type: yolov2, Precision: kFLOAT, Batch Size: 8, Inference time per image: 61.9196 ms

~16 fps on batch_size=1

Network Type: yolov3, Precision: kFLOAT, Batch Size: 1, Inference time per image: 151.085 ms
Network Type: yolov3, Precision: kFLOAT, Batch Size: 4, Inference time per image: 144.405 ms
Network Type: yolov3, Precision: kFLOAT, Batch Size: 8, Inference time per image: 146.234 ms
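For reference, the fps figure is just 1000 divided by the per-image time in milliseconds. A small sketch using the batch-size-1 numbers above (the variable and function names are only illustrative):

```python
# Per-image latencies in milliseconds, copied from the batch-size-1 rows above.
latency_ms = {"yolov2": 61.8683, "yolov3": 151.085}


def frames_per_second(ms_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


fps = {name: frames_per_second(ms) for name, ms in latency_ms.items()}
for name, value in fps.items():
    print(f"{name}: {value:.1f} fps")
```

This reproduces the ~16 fps quoted for yolov2; by the same arithmetic yolov3 comes out at roughly 6.6 fps at batch size 1.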

chho6822 · December 30, 2018, 2:02am

@draz

I found this post that probably can help you to solve your problem:

“Pytorch-onnx currently doesn’t support AdaptivePooling but fast.ai is using that for training on different input image sizes (a way to prevent overfitting). But if we only care about one size, let’s say 299, we have to replace the AdaptivePooling by supported Pooling layer with fixed size…”

I don’t think he is using the latest version of fastai (v1 / PyTorch 1.0), judging by the date of his blog.
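The replacement described in the quote could look roughly like the sketch below. It uses a toy model rather than a real fastai network, `replace_adaptive_pools` is a hypothetical helper name, and it assumes a PyTorch version that has nn.Flatten. It only handles adaptive pools with a 1x1 output, and fastai's own concat-pooling layer would need its inner pools swapped in the same way.

```python
import torch
import torch.nn as nn


def replace_adaptive_pools(module, feature_size):
    """Swap 1x1 adaptive pools for fixed-size pools, in place (hypothetical helper)."""
    for name, child in module.named_children():
        if isinstance(child, nn.AdaptiveAvgPool2d) and child.output_size in (1, (1, 1)):
            setattr(module, name, nn.AvgPool2d(feature_size))
        elif isinstance(child, nn.AdaptiveMaxPool2d) and child.output_size in (1, (1, 1)):
            setattr(module, name, nn.MaxPool2d(feature_size))
        else:
            replace_adaptive_pools(child, feature_size)
    return module


# Toy stand-in for a convolutional model that ends in an adaptive pool.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),  # 32x32 -> 16x16
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
).eval()

x = torch.randn(1, 3, 32, 32)  # the single input size we commit to
with torch.no_grad():
    before = model(x)

# For this toy model the feature map entering the pool is 16x16.
replace_adaptive_pools(model, feature_size=16)
with torch.no_grad():
    after = model(x)

print(torch.allclose(before, after, atol=1e-6))
```

The trade-off is that the model is now tied to one input size; the fixed kernel has to match the feature-map size that this input size produces.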

Hope this helps!

annoyingnerd · March 7, 2019, 9:38pm

I’m looking for a good solution to convert a few computer vision models to CoreML. What’s the best solution currently?
@draz - did you eventually get your model converted over to CoreML?

rsomani95 · January 20, 2020, 3:25am

This is the best one by far. I tried using this code with ResNets and MobileNets, and it works great! The only change it makes to fastai is replacing fastai.layers.Flatten with torch.nn.Flatten. Credit to @davidpfahler

github.com

davidpfahler/react-native-ml-app/blob/e4abc813f2c3e7e147454afbcbb4edd14c9ffe16/train_dog_classifier_with_fastai_export_to_CoreML.ipynb


jbfm · January 22, 2020, 2:49pm

I’ve also gotten @davidpfahler’s method described in @rsomani95’s post to work. If you see the cannot resolve operator 'Shape' with opsets: ai.onnx v9 error, that means you didn’t correctly re-write the head of the model to replace Fast.ai’s custom Flatten layer with the PyTorch one. (See @davidpfahler’s notebook for details.)
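For anyone who wants the general idea without opening the notebook, here is a rough sketch of that kind of swap. It is not the notebook's code: `CustomFlatten` is a hypothetical stand-in for a library-specific flatten layer, and `swap_flatten` is a made-up helper name.

```python
import torch
import torch.nn as nn


class CustomFlatten(nn.Module):
    """Hypothetical stand-in for a library-specific flatten layer."""

    def forward(self, x):
        return x.view(x.size(0), -1)


def swap_flatten(module, old_type):
    """Replace every layer of type old_type with torch.nn.Flatten, in place."""
    for name, child in module.named_children():
        if isinstance(child, old_type):
            setattr(module, name, nn.Flatten())
        else:
            swap_flatten(child, old_type)
    return module


# Toy head: flatten a (batch, 8, 2, 2) feature map, then classify.
head = nn.Sequential(CustomFlatten(), nn.Linear(8 * 2 * 2, 10)).eval()

x = torch.randn(4, 8, 2, 2)
with torch.no_grad():
    before = head(x)

swap_flatten(head, CustomFlatten)
with torch.no_grad():
    after = head(x)

print(type(head[0]).__name__, torch.allclose(before, after))
```

The outputs should be unchanged by the swap; only the layer type that the exporter sees is different.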