Difference between cnn_learner and Learner?

linminhtoo · June 29, 2020, 9:55pm

Hello, I am a beginner in deep learning and just learnt the fast.ai library last week for an ongoing image classification competition. The competition features a large (100k+), relatively dirty dataset with many categories (~40). I have been experimenting with different pretrained models of varying complexity, as well as with augmentations. I managed to get it working (though only after several hours of debugging!) with a ResNext101 model from the pytorchcv model zoo, using cnn_learner and a custom_head. While I currently have a decent score on the public leaderboard, I am still far from the threshold score needed to earn full points from the organiser (it is a series of challenges). I want to do my best to get there.

Currently, even after several hours of sweating and head-scratching, I failed to add in extra augmentations that other fast.ai users have developed, found in this amazing github repo: https://github.com/oguiza/fastai_extensions
Specifically, I wanted to make use of the various blend augmentations, especially progressive sprinkles, having read posts by LessW2020 (who, by the way, is truly amazing). However, whenever I try to add the .blend(**kwargs) method to my cnn_learner, it throws AttributeError: 'Learner' object has no attribute 'blend', and I cannot continue. I also get AttributeError: 'Learner' object has no attribute 'batch_loss_filter' when I try to use the Batch Loss Filter callback to accelerate my training.

This has been extremely frustrating, as googling did not give me the answers. I suspect it is because I am using cnn_learner and not Learner, and that there are some inherent differences between them, although I cannot find documentation that specifies exactly what the differences are. I only found a comment or two in the forums saying something about how cnn_learner is better optimised for learning than Learner.

Could anyone explain what the differences between these two are? And if the difference would cost me accuracy, how can I change the settings of Learner() to make it essentially a replica of cnn_learner, so that I can access all these cool extensions?

Thanks and cheers.

muellerzr · June 29, 2020, 10:21pm

I’d go the other way around. Take a peek inside the cnn_learner function, and you’ll see that it eventually calls Learner with some setup, so just do that yourself. (If you need help finding the source code, type ?? after cnn_learner, i.e. cnn_learner??)
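
To make that relationship concrete, here is a minimal sketch using stand-in classes. It is not fastai's real code: `Learner`, `cnn_learner`, and `build_model` below are simplified, hypothetical versions that only illustrate the pattern, in which the factory function turns the architecture arguments into a model and then hands that model to the ordinary Learner constructor.

```python
# Stand-in sketch of the "factory function" pattern; NOT fastai's real code.

class Learner:
    """Simplified stand-in: takes data and a ready-built model."""
    def __init__(self, data, model, metrics=None):
        self.data = data
        self.model = model
        self.metrics = metrics or []


def build_model(base_arch, n_classes, custom_head=None):
    """Hypothetical helper: pair a body with a head (real code builds nn.Modules)."""
    head = custom_head if custom_head is not None else f"default_head({n_classes})"
    return (f"body({base_arch})", head)


def cnn_learner(data, base_arch, n_classes, custom_head=None, **learn_kwargs):
    """Factory: consume the architecture arguments, then return a plain Learner."""
    model = build_model(base_arch, n_classes, custom_head)
    return Learner(data, model, **learn_kwargs)


learn = cnn_learner("my_data", "resnext101", n_classes=40, metrics=["accuracy"])
print(type(learn).__name__)  # the factory returns an ordinary Learner instance
print(learn.model)
```

In this sketch the object that comes back is the same class either way, so any method missing on one is missing on the other.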

linminhtoo · June 29, 2020, 11:41pm

Hey Zachary, thanks for your answer. I did read the documentation before for cnn_learner, and have just read it again. If I am using my own custom_head, I don’t see how cnn_learner is that much different from Learner, maybe with the exception of being able to set pretrained=True if needed. Is there really no ‘hidden’ difference? Which means, I can just run something like: learn = Learner(data, base_arch=sys.modules['torchvision.models'].__dict__['resnext101_32x8d'], pretrained=False, cut=-2, split_on=lambda m: (m[0][6], m[1]), metrics=[accuracy, precision], custom_head=custom_head, loss_func=loss_func).mixup().to_fp16().blend(**kwargs) ?

linminhtoo · July 1, 2020, 9:57am

Hi, I just tried this and it doesn’t work. It gives me: __init__() got an unexpected keyword argument 'base_arch'. Please help me out here. Thanks.

EDIT:
I managed to get it working by copying the source code shown by cnn_learner??. However, I still face the exact same AttributeError: 'Learner' object has no attribute 'batch_loss_filter'. I am lost for words now. This is exactly what the author of BatchLossFilter does in his GitHub repo, and I can’t replicate it.
(repo is https://github.com/oguiza/fastai_extensions/blob/master/03_BatchLossFilter.ipynb)

What I am doing:
precision = Precision()
from fastai.vision.learner import cnn_config
meta = cnn_config(base_arch)
model = create_cnn_model(sys.modules['torchvision.models'].__dict__['resnext101_32x8d'], data.c, cut=-2, lin_ftrs=[512,512], ps=0.3, pretrained=True, concat_pool=True)
learn = Learner(data, model, loss_func=loss_func, metrics=[accuracy, precision]).batch_loss_filter(min_loss_perc=.9).to_fp16()
learn.split(split_on=lambda m: (m[0][6], m[1]))
learn.freeze()
apply_init(model[1], nn.init.kaiming_normal_)

Am I missing something to do with the version of fastai? I am using Google Colab, and it says my fastai version is 1.0.61. I also ran !pip install "torch==1.4.0" "torchvision==0.5.0", but this was based on my teammate’s notebook, so I’m not sure whether that affects this. My Learner also seems to have no .show_tfms() attribute, which I really do not understand.

muellerzr · July 1, 2020, 10:41am

You are never importing from his library. Batch loss filtering is not part of the fastai library. It’s part of his.
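
The error is the same whichever way the Learner was created, and a small sketch shows why. It assumes the extension attaches its method to the Learner class when its module is imported (the pattern fastai v1 itself uses for methods such as mixup and to_fp16). The class and function below are stand-ins, not the real fastai or fastai_extensions code.

```python
# Stand-in sketch; NOT the real fastai or fastai_extensions code.

class Learner:
    """Simplified stand-in for the library's Learner."""
    def __init__(self):
        self.callbacks = []


def batch_loss_filter(learn, min_loss_perc=0.9):
    """Hypothetical extension method: register a callback and return the learner."""
    learn.callbacks.append(("BatchLossFilter", min_loss_perc))
    return learn


learn = Learner()

# Before the extension module has run, the attribute simply does not exist.
try:
    learn.batch_loss_filter(min_loss_perc=0.9)
except AttributeError as err:
    error_before = str(err)
    print(error_before)

# This assignment stands in for what importing the extension would do (assumed).
Learner.batch_loss_filter = batch_loss_filter

# Existing and new instances now both have the method, and it chains.
learn = learn.batch_loss_filter(min_loss_perc=0.9)
print(learn.callbacks)
```

If only part of a notebook is copied over, the line that performs this kind of assignment can easily be left out, which would produce exactly this AttributeError.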

linminhtoo · July 1, 2020, 10:49am

I copied his source code from the github repo. But you have a point, I might have missed out something when I did so. I will double check now. Thank you!

linminhtoo · July 1, 2020, 8:35pm

Hey Zachary, just wanted to update that I finally managed to get it all working. (The following may be helpful for those who stumble onto this in the future.) I realised that the errors came from copying the code directly into my notebook (strange, because this was one of the options suggested in one of the forum posts where I saw the repo being discussed). Somehow, this was causing bugs in the code that I had to patiently solve, which I mostly did in the end.

However, my friend and teammate found a much easier way, which is just to do !git clone https://github.com/oguiza/fastai_extensions.git and then from fastai_extensions.fastai_extensions import *. Stupid me! I did not have to use Learner either: the plain old cnn_learner and my original code work fine. Thankfully, I can finally try those awesome augmentations…

There seems to be a minor bug, though, when I am using .blend(**kwargs). When I run learn.lr_find(), the loss sometimes gets stuck at ~0.0014 for a few seconds and lr_find() exits unexpectedly, giving me a blank graph. I know that there is nothing wrong with my code because I had already run 2 epochs and saved my weights. Just running learn.lr_find() again seems to fix the ‘bug’ (sometimes a third try is required). Note that I am doing all this on Google Colab.