[Solved] Using a fastai-trained model with plain Pytorch

I’m trying to use one of my fastai-trained models on a machine which has PyTorch (1.0.0) installed.

I cannot install anything on that machine (and in particular, I cannot install fastai).
Now, if I save the model by:

modelname = learnername.model
modelname.cpu()
torch.save(modelname, 'picklename.pkl')

I am able to load it with plain PyTorch via torch.load() on my development machine, which has fastai installed.

But when I try to use torch.load() on the inference machine, which does NOT have fastai installed, it fails with no module named fastai.
This is strange, since once you export the model with torch.save() it should have become a plain PyTorch model…

Any idea about how to get it working?

Thanks!

No, exporting with torch.save only saves the weights, not the actual model definition. You have to redefine your model in plain PyTorch to be able to load those weights.
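
In other words, something like this (a minimal sketch with a toy stand-in architecture; in practice the model-building function has to recreate your real fastai model, body and custom head):

import torch
from torch import nn

def make_model():
    # Toy stand-in; in practice this must rebuild the exact architecture
    # that fastai trained (torchvision body + custom head)
    return nn.Sequential(nn.Linear(10, 2))

# Training side: save only the weights (the state_dict), not the model object
model = make_model()
torch.save(model.state_dict(), 'weights.pth')

# Inference side, plain PyTorch: redefine the same architecture, then load the weights
model2 = make_model()
model2.load_state_dict(torch.load('weights.pth', map_location='cpu'))
model2.eval()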

Thanks.

Ok, just tell me if I understood it right: I have to load a pretrained model from torchvision.models and then add a custom head identical to fastai’s using nn.Sequential(). Am I right?
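
Something like this sketch is what I have in mind (resnet34 and the layer sizes are just placeholders):

from torch import nn
import torchvision.models as models

class Flatten(nn.Module):
    # Simple flatten layer defined at module top level
    def forward(self, x): return x.view(x.size(0), -1)

backbone = models.resnet34(pretrained=True)
body = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool and fc

head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    Flatten(),
    nn.Linear(512, 2),  # 512 = resnet34 output channels, 2 = number of classes
)
model = nn.Sequential(body, head)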

If that is your model, then yes.

Another way is to pickle the model and ensure any bespoke layers can be imported when you load it. I have a model trained using fastai. However, production is in a Docker container, and fastai takes a GB or so, which is not needed at predict time. I copied any fastai-specific layers, such as Flatten, into a project module so they are found by the pickle load.

Thanks.
Could you elaborate a bit further? I’m not sure I’m understanding it right. Maybe a brief example could help!

Actually, I just rechecked and my load is:

torch.load(modelpath, map_location=lambda storage, loc: storage, pickle_module=dill)

Can’t remember why, but dill was needed.

fastai shouldn’t take more than a few MB - I’m guessing the issue is that you’re getting spacy and lots of its models too. You can remove spacy and its deps if you don’t use fastai.text.

Sometimes pickle is not able to serialize/deserialize some model creation functions, so one uses dill.
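
For instance (a toy illustration, just to show why dill was needed):

import pickle
import dill

f = lambda x: x * 2  # a function with no importable name

try:
    pickle.dumps(f)  # pickle stores functions by reference, so this fails
except pickle.PicklingError as e:
    print('pickle failed:', e)

blob = dill.dumps(f)  # dill serializes the function body itself
print(dill.loads(blob)(3))  # -> 6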

However, I still don’t quite grasp your advice conceptually. In particular:

How did you do it? Is it the map_location parameter?

This is valuable advice in cases like @simoneva’s, but in my case it is not possible for me to install anything on the inference machine. It would be great to have a specific method that copies the fastai model structure into a corresponding PyTorch model. I’m navigating through the PyTorch docs right now, but doing this turns out to be a bit more difficult than I thought…

OK, as Jeremy suggests, the simplest may be to remove spacy. If it is a Dockerfile, probably just install fastai and then uninstall spacy in a single step to minimise space.

I had a model from torchvision but a few bespoke layers. I replaced any references to fastai with references to my own model.py and copied the source code over:

from torch import Tensor
from torch import nn
import logging as log

class Lambda(nn.Module):
    "An easy way to create a pytorch layer for a simple `func`."
    def __init__(self, func):
        "create a layer that simply calls `func` with `x`"
        super().__init__()
        self.func=func

    def forward(self, x): return self.func(x)

def Flatten(full:bool=False)->Tensor:
    "Flatten `x` to a single dimension, often used at the end of a model. `full` for rank-1 tensor"
    func = (lambda x: x.view(-1)) if full else (lambda x: x.view(x.size(0), -1))
    return Lambda(func)

def head(nf):
    return \
        nn.Sequential(
            Flatten(),
            nn.Linear(nf, 128),
            nn.ReLU(True),
            nn.BatchNorm1d(128),
            nn.Linear(128, 256),
            nn.ReLU(True),
            nn.BatchNorm1d(256),
            nn.Linear(256, 2)
        )

The map_location is needed if you save a CUDA version of the model after training on GPU and then try to run it on CPU; without it, the load fails because no GPU is available.
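
For reference, recent PyTorch versions also accept a simpler form that does the same remapping (assuming the file from the first post):

import torch

# Remap all saved CUDA storages to CPU at load time
model = torch.load('picklename.pkl', map_location='cpu')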

Ok @simoneva, thanks for your reply.

Let’s see if we can go through the entire process.

The code you quoted above is a small subset of layers.py. If I’m not mistaken, that should be the bare minimum needed to add the custom head à la fastai.

So, you instantiated a standard resnet from torchvision.models and then added the head by calling head(). Did you append the head at the end of the children list and then unpack all of it (like I did in the example below)?

Then, you just loaded the weights in .pkl with torch.load(), correct?

I tried something much more trivial, that is:

mymodel=learn.model
modules=list(mymodel.children())
my_r50=nn.Sequential(*modules)

I thought that just recreating the fastai model via nn.Sequential would give me a pure PyTorch model.
I was wrong.

OK, I created the learner as a standard torchvision model with a bespoke head which lives in the file model.py, so there are no references to fastai:

    learn = create_cnn(db,
                       arch=torchvision.models.vgg16_bn,
                       metrics=accuracy,
                       custom_head=model.head(nf),
                       callback_fns=[ShowGraph, partial(GradientClipping, clip=0.1), BnFreeze])

Did all the training using fastai. Then:

torch.save(learn.model, "local_vgg16_bn", pickle_module=dill)

When the model is loaded, it looks for the head in model.py, not fastai. Of course, I could have copied the whole model from torchvision; in some ways that would be better in case torchvision changes. However, I had various models under test, so it was easier to do it this way.
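
At predict time the load is then along these lines (a sketch pieced together from the snippets above; model.py must be importable on the inference machine under the same name it had at save time):

import dill
import torch

# The unpickler will import model.py to resolve head(), Lambda, Flatten, etc.
net = torch.load('local_vgg16_bn',
                 map_location=lambda storage, loc: storage,
                 pickle_module=dill)
net.eval()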

The reason for not using the standard fastai head was that I had very long, thin images which were too small for the pooling layer. With hindsight, I probably could have just resized the images and used the standard head. However, once I got it working, it was extra work to change.

Your children list includes the head, which probably has a Flatten or adaptive pooling layer defined by fastai that is not present. The unpickle will look for fastai.layers, which is not installed. Best not to add a fastai.layers module to your project, as this would hide the real fastai; hence the solution is to copy the offending layers into a separate module and use a bespoke head.

Ah ok! So the code listing above is your own model.py. I thought you were making reference to fastai’s model.py.

Well then, I’ll do some experiments and let you know! Thanks! 🙂

Yes, you have to find any fastai references and replace them with something that is available at prediction time.
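
A quick way to list them (a small sketch, assuming learn.model is the trained fastai model):

# Print every submodule whose class is defined inside fastai --
# each of these needs a plain-PyTorch replacement before pickling
for name, module in learn.model.named_modules():
    if type(module).__module__.startswith('fastai'):
        print(name, '->', type(module).__name__)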

One other thing to watch is whether you are using other aspects of fastai, such as data transformations or databunches that read from folders. That did not apply in my case, as my data was already formatted for prediction, and when I first wrote this fastai did not have a lot of those things.

I think going forward the easiest is to just use fastai at the prediction stage and remove spacy. If you can’t install it for some reason, then include it in your package.

I can’t. It is not a Docker container that I can concoct as I like; it is an inference machine on which I can only use the stuff already installed.

Besides, you are teaching me things which are interesting per se. 🙂

Look at what happens. I slightly changed your code to adjust it to my needs.


import torch
from torch import Tensor
from torch import nn
import logging as log

class Lambda(nn.Module):
    "An easy way to create a pytorch layer for a simple `func`."
    def __init__(self, func):
        "create a layer that simply calls `func` with `x`"
        super().__init__()
        self.func=func

    def forward(self, x): return self.func(x)

def Flatten(full:bool=False)->Tensor:
    "Flatten `x` to a single dimension, often used at the end of a model. `full` for rank-1 tensor"
    func = (lambda x: x.view(-1)) if full else (lambda x: x.view(x.size(0), -1))
    return Lambda(func)

class AdaptiveConcatPool2d(nn.Module):
    "Concatenate avg and max pooling outputs, as fastai's head does"
    def __init__(self, sz=1):
        super().__init__()
        self.ap = nn.AdaptiveAvgPool2d(sz)
        self.mp = nn.AdaptiveMaxPool2d(sz)

    def forward(self, x): return torch.cat([self.mp(x), self.ap(x)], 1)

def myhead(nf, nc):
    return \
        nn.Sequential(
            AdaptiveConcatPool2d(1),  # doubles the channel count, hence nf = 2 * 2048
            Flatten(),
            nn.BatchNorm1d(nf),
            nn.Linear(nf, 512),
            nn.ReLU(True),
            nn.BatchNorm1d(512),
            nn.Linear(512, nc),
        )

Note that I didn’t make it a Python module; I just wrote it in a notebook cell, for experimenting. nc is the number of classes.

Then, I did:

import torchvision.models
mylearn=create_cnn(data,arch=torchvision.models.resnet50,
                   metrics=accuracy,
                   custom_head=myhead(4096, 3))

That created a resnet50 with a head identical to fastai’s (nf is 4096 because the concat pooling doubles resnet50’s 2048 output channels).

Then:

modeltosave=mylearn.model
modeltosave.cpu()
torch.save(modeltosave, '/path/mymodel.pkl')

As you warned, it didn’t work: AttributeError: Can't pickle local object 'Flatten.<locals>.<lambda>'.
But it serializes fastai’s Flatten, which is identical to ours, so I cannot figure out why it doesn’t work for ours (maybe @sgugger could answer this).

However, I installed dill, and then:

import dill
modeltosave=mylearn.model
modeltosave.cpu()
torch.save(modeltosave, '/path/mymodel.pkl', pickle_module=dill)

I received a warning: serialization.py:251: UserWarning: Couldn't retrieve source code for container of type Lambda. It won't be checked for correctness upon loading.

And indeed, at inference time, it says:

path/site-packages/dill/_dill.py", line 474, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'Lambda' on <module '__main__' from 'predictor.py'>

Mmhh… It seems it cannot deserialize it.
Any suggestions? 🤔

It doesn’t look the same at all - ours doesn’t use lambda!
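
i.e. a module-level class rather than a closure, roughly like this (a sketch; fastai’s actual definition may differ):

from torch import nn

class Flatten(nn.Module):
    "Lambda-free Flatten: a plain class that pickle can find by name"
    def __init__(self, full=False):
        super().__init__()
        self.full = full

    def forward(self, x):
        return x.view(-1) if self.full else x.view(x.size(0), -1)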

It has to be in a module. When you unpickle, it has to have the same name and be importable.
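
Concretely (a sketch; mylayers is a hypothetical module name): define the custom classes in a module before training, import them from there when building the model, and ship the same module with the inference code:

# mylayers.py -- Lambda, Flatten, etc. defined at module top level,
# and imported from here at training time, so the pickle records
# 'mylayers.Flatten' rather than '__main__.Flatten'

# predictor.py, on the inference machine
import dill
import torch
import mylayers  # must be importable under the same name as at save time

model = torch.load('mymodel.pkl', map_location='cpu', pickle_module=dill)
model.eval()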
