Fastai is too decoupled from PyTorch

@muellerzr, I have been playing a little bit with the fastai library over the last few months. One thing I have realized is that the design of fastai does not make it easy to export things to plain PyTorch. So if I want to use fastai to train a model, it seems I need to stick with fastai for inference as well, which might not be so convenient.

One clear example is the Learner class. It is a useful class for training, but in fastai it is also used for prediction. What if I want to send my trained model to a friend who only knows PyTorch and not fastai? This seems quite hard to do at the moment.

Besides, all the preprocessing transformations on the input data are stored in the datasets/dataloaders and not in the model itself. So if I save the model with torch.save(learn.model, 'model.pth'), I still cannot use it on its own, because part of the preprocessing lives in the dataset instances.
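
For example, this is roughly the hand-off I would like to be possible (a rough sketch; the file names and the ImageNet-style normalization stats are just assumptions):

import torch
from PIL import Image
from torchvision import transforms

# My side, after training with fastai:
torch.save(learn.model, 'model.pth')   # saves only the nn.Module

# My friend's side, plain PyTorch only (unpickling still needs any custom
# layer classes the model happens to contain):
model = torch.load('model.pth')
model.eval()

# None of this is stored in model.pth; it has to be rebuilt by hand from
# whatever the fastai DataLoaders did (resize, normalization statistics, ...):
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
x = preprocess(Image.open('some_image.jpg')).unsqueeze(0)
with torch.no_grad():
    preds = model(x)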

In my opinion, Learner is a class that should be used only for training, and the model should be saved separately from the Learner. That way I can send my model to other people who use PyTorch and not fastai. Besides, the Learner holds so many objects that are useless at inference time:

  • optimizer,
  • dataloaders,
  • etc.

We do not need all this for inference.

Do you plan on changing this? Has anyone else encountered the same issues?

2 Likes

No, that is intentional. When comparing fastai to PyTorch, you can think of fastai as a training framework. What you get out of it is just the trained model. fastai has its own transforms, its own training loop, etc. So if someone wants to move on from fastai, they take a trained model with them, and it is up to them to make sure they have their transforms in order.

fastai has inference capabilities, as it just wouldn't make sense for it not to. That gives people the option to stick with fastai for deployment if they so choose; or, if they are more advanced, they can go utilize raw PyTorch and move away from the fastai library.

The optimizer is there in case it's needed; you can do learn.save(with_opt=False) to save the model without it.

You have test_dl to preprocess your data within the fastai framework.
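
For instance (a quick sketch; test_items stands in for whatever raw items you have, e.g. a list of image paths):

learn.save('my_model', with_opt=False)     # weights only, no optimizer state

test_dl = learn.dls.test_dl(test_items)    # applies the same preprocessing used in training
preds, _ = learn.get_preds(dl=test_dl)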

What you're asking for is to get rid of the separation between fastai and PyTorch, but that strongly limits the usefulness of fastai. It should be up to the user to determine which direction they want to go, not the library.

Also, in general, please do not call me out on the forums like this. It puts me in a very awkward position because, A: I do not work for fastai. I am not officially affiliated with the company or the project aside from my PRs and community involvement, and I don't control the overarching design. That is Jeremy, not me. And B: while I did make comments, you can see they are general. It is not my place to put assumptions on library design or to discuss it from an ideological point of view. It's Jeremy's project and Jeremy's vision.

1 Like

OK, sorry, it was not my intention to be disrespectful. For some reason I thought you were one of the people running the project.

Again, apologies for the inconvenience.

Hi Arnau,

An easy misunderstanding to have, given how proactive and passionate @muellerzr is about DL and fastai ; )

Re: what you mentioned, my two cents: if you can already see a valuable path forward for making it easier and more seamless to export a fastai Learner for use and inference in vanilla PyTorch, I would suggest trying to code it up as an add-on/extension library for fastai? Examples: https://github.com/nestordemeure/fastai-extensions-repository

That way, the core fastai library still maintains the philosophy and architecture envisioned by Jeremy and Sylvain, while users who want a different way of exporting fastai-trained models can just import and use your extension lib?

Just a thought. Thanks.

Yijin

5 Likes

Ignoring the Learner topic and any personal tags and just focusing on the thread title: I, too, think that fastai is too decoupled from raw PyTorch, and that many practitioners do not use it because of this.

There are so many gems in the library that the dev team can be proud of, but it seems difficult, if not impossible, to use them within a raw PyTorch workflow.

It would be great if the fastai library could be used like other high-level libraries such as sklearn or numpy, where you pick the thing you want, import it, and use it.

Example:

I tried to use the warp augmentation. So first I located it (fastai.vision.augment.warp) and then tried to apply it to an image in my training loop (a regular torch.Tensor). This does not work, and even begrudgingly pulling in extra classes (fastai.torch_core.TensorImage, fastai.vision.core.PILImage) to make it work fails. fastai might be modular in itself, but it's not modular in a way that lets you pick out parts and really use them outside of an all-fastai training environment. :cry:
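
For reference, this is roughly what I tried (a sketch from memory, not a working recipe):

import torch
from fastai.vision.augment import Warp
from fastai.torch_core import TensorImage

img = torch.rand(3, 224, 224)       # a plain tensor from my own training loop
warp = Warp(p=1.)

# warp(img)                         # returns the input unchanged: the dispatch doesn't match plain Tensors

timg = TensorImage(img[None])       # wrap it and add a batch dim, as fastai's batch tfms expect
out = warp(timg, split_idx=0)       # split_idx=0 mimics "training time" so the transform actually runs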

P.S. If there is an easy way to cherry-pick a fastai.vision.augment augmentation, I would love to hear it and repent. :slight_smile:

2 Likes

I do agree, and it's starting to get there. The tabular module is a good example of this: it can be used with random forests and XGBoost really easily.
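
For example (a rough sketch using the ADULT_SAMPLE dataset; the column choices are just for illustration), fastai's tabular preprocessing can be handed straight to XGBoost:

import pandas as pd
import xgboost as xgb
from fastai.tabular.all import (TabularPandas, Categorify, FillMissing, Normalize,
                                RandomSplitter, untar_data, URLs)

path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
to = TabularPandas(df, procs=[Categorify, FillMissing, Normalize],
                   cat_names=['workclass', 'education'], cont_names=['age'],
                   y_names='salary', splits=RandomSplitter(seed=42)(range(len(df))))

# .xs / .ys are plain pandas objects after fastai's preprocessing:
model = xgb.XGBClassifier()
model.fit(to.train.xs, to.train.ys.values.ravel())
print(model.score(to.valid.xs, to.valid.ys.values.ravel()))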

The main reason for this is that fastai uses a type-dispatch system to apply everything, hence the need for the wrappers (I don't deny that perhaps that's too much hand-holding and too many limitations). I've faced this headache myself when I tried giving fastinference a PyTorch-only option. It turned out to be overkill trying to recreate the transforms I wanted to use, because writing them in a non-type-dispatch way required me to essentially rewrite everything.
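
To make the type-dispatch point concrete, here's a toy sketch of the kind of dispatch fastcore provides (illustrative names only, not fastai's real transforms):

import torch
from fastcore.dispatch import typedispatch

class TensorImage(torch.Tensor): pass         # stand-in for fastai's semantic tensor types

@typedispatch
def scale(x: torch.Tensor): return x          # fallback: plain tensors pass through untouched

@typedispatch
def scale(x: TensorImage): return x / 255.    # only "image" tensors get rescaled

plain = torch.full((2,), 255.)
img = plain.as_subclass(TensorImage)
scale(plain)   # unchanged
scale(img)     # divided by 255: dispatched on the subclass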

I really don't know what the solution would be here, outside of monkey-patching the specific transforms you want so they can accept raw tensors.

Also, apologies for my flamboyant response. This is actually a good discussion and topic because it's absolutely important. Right now fastai has examples of going from PyTorch -> fastai, but we don't have the other way around.

Let's take your warp example, @howtodowtle. warp relies on type dispatch, so you'd need to turn your x's into TensorPoints, which isn't very fun or exciting. One potential fix would be the following adjustment in your own code (if this were in the library, it would break all the separate dispatching capabilities):

@patch
@delegates(_WarpCoord.__init__)
def warp(x:(torch.Tensor), size=None, mode='bilinear',
         pad_mode=PadMode.Reflection, align_corners=True, **kwargs):
    x0,mode,pad_mode = _get_default(x, mode, pad_mode)
    coord_tfm = _WarpCoord(**kwargs)
    coord_tfm.before_call(x0)
    return x.affine_coord(coord_tfm=coord_tfm, sz=size, mode=mode, pad_mode=pad_mode, align_corners=align_corners)

Which, again: cool, that's all fine and dandy, but it still requires knowing where that code lives and modifying it…

So in short, I'm not too sure what the solution would be here. As I mentioned in my previous post, fastai from a PyTorch perspective is all about getting a trained model, and then you can move forward from there with basic transforms (resizing, normalizing, etc.). It would be good to hear from folks who mainly use PyTorch where the adaptations could be made easier and what exactly they are hoping to get out of fastai.

2 Likes

Thanks for the quick reply, @muellerzr! :slight_smile:

It's hard to cherry-pick stuff from fastai because it uses soooooo many custom tools. I am sure these are super nice, and I love learning about them, but it basically makes it impossible to use any fastai functionality in isolation. :cry: When you do dig deep enough, you learn a lot but basically have to rewrite everything anyway. But I am sure you know that best from your fastinference experience. :wink:

Which, again: cool, that's all fine and dandy, but it still requires knowing where that code lives and modifying it…

Yup.

And even with your snippet (thanks! :pray:), I still need to (at least) find and import patch, delegates, _WarpCoord, _get_default, … and these will probably have even more dependencies. So I don't think that's worth it over using another library or writing the transform myself.

patch and delegates come from fastcore (a very, very lightweight library with zero dependencies). If you really wanted to, you could get rid of delegates and just use patch.
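
As a tiny illustration of patch (a made-up method name, just to show what it does):

import torch
from fastcore.basics import patch

@patch
def num_pixels(self: torch.Tensor):
    "Attach a new method to an existing class; here, the total element count."
    return self.numel()

torch.zeros(2, 3).num_pixels()   # 6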

I don’t disagree here

To me, the workflow would be: if you want to use it with PyTorch, use fastai for the DataLoaders and then use raw torch afterwards. Jeremy is working on segmenting everything to reduce the footprint needed to actually have fastai, so we have vision, tabular, text, etc., so perhaps that may help?
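
Something along these lines (a rough sketch using MNIST_TINY purely for illustration):

import torch
from torch import nn
from fastai.vision.all import untar_data, URLs, ImageDataLoaders, Resize

path = untar_data(URLs.MNIST_TINY)
dls = ImageDataLoaders.from_folder(path, item_tfms=Resize(28))    # fastai handles the data

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 28 * 28, 2)).to(dls.device)
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

for xb, yb in dls.train:                        # batches come out as (image, label) tensors
    xb = xb.as_subclass(torch.Tensor)           # drop the fastai tensor subclass for raw torch
    yb = yb.as_subclass(torch.Tensor)
    loss = loss_fn(model(xb), yb)
    loss.backward(); opt.step(); opt.zero_grad()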

Yeah, which I also think comes from what people think fastai should be rather than what it is. It's not an "add-on" to PyTorch (as perhaps it once was). It's a fully integrated system, just like TF and PyTorch are in and of themselves. It takes some pieces of PyTorch (essentially Tensors, barely anything about the DataLoaders, and of course everything related to training your model), but it's its own library at this point.

That being said, this was not the case in the previous version. You absolutely could just copy/paste the transforms over and everything would work fine.

So what happened?

Jeremy decided to hack on and improve/utilize Python the way it "should" (argued by some) be utilized. This means type dispatching, etc. As a result, rather than thinking of fastai as being built on top of PyTorch (how it was in version 1), it's now fastai built on fastcore, Jeremy's Python library, and then we use PyTorch for everything outside of our data, i.e. training, models, optimizers. Does this make a bit more sense as to why this disconnect happened and where the radical shift came from, @howtodowtle and @arnau?

I'll also add that the tensor types are actually a part of PyTorch natively now too :slight_smile: (or at least the capability to create them is).
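
For example (a toy sketch, not fastai's actual classes):

import torch

class TensorImage(torch.Tensor): pass             # toy stand-in for a semantic tensor type

t = torch.rand(3, 8, 8).as_subclass(TensorImage)
isinstance(t, TensorImage)        # True
isinstance(t + 1, TensorImage)    # True: recent PyTorch keeps the subclass through ops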

1 Like

Absolutely, it makes sense, and I was aware of that (though I couldn't have explained it as well as you). I just believe more people would use fastai in production if it weren't that disconnected.

But that's of course up to the devs to decide, and I still love following the course and picking up ideas that I can implement myself.

Two side notes:

  1. I am sure writing the library the (what one thinks is the) right way from the ground up must have been very satisfying and might inspire future usage of some fastcore principles.

  2. I think you're selling PyTorch massively short here. Isn't the whole modeling side basically still nn.Modules? Without those, autograd, PyTorch tensors, and all those nice things, no learning would happen. :wink:

1 Like

Yes, I am :slight_smile: I think it would have been better stated as: fastai has its own way to process the data and its own data pipeline, while using PyTorch just for training. How's that sound? :sweat_smile: (edited it to be a bit more forgiving…)

1 Like

And honestly, it depends, I guess. There are a few places that do use it in production, but you'd need to know how to optimize fastai to really get to that point (which is why I built fastinference). So I hear your point.

1 Like

fastai has its own way to process the data and its own data pipeline, while using PyTorch just for training.

That’s a good description. :slight_smile:

1 Like