Adding EfficientNet to fastai vision

You need to install wwf as well (pip install wwf). That should probably be made clearer in the article :slight_smile:


Yes sir, you can just include wwf in the requirements. Thanks a lot :smile:

One more update: wwf and timm don't have pre-trained weights for efficientnet_b4 through efficientnet_b8, efficientnet_l2, efficientnet_el, and many more. So you might get a warning like:

Pretrained model URL is invalid, using random initialization.

So these models should only be chosen when you plan to train from scratch, not when just fine-tuning the last FC layers.


First time efficientnet user here :raised_hand:

Looking at timm, the models list different resolutions and crop percentages?

I presume this means I need to adjust my tfms to match?

Could someone provide/explain an example with efficientnet_b3, and one with crop != 1.0?

@avenio timm covers all of the EfficientNet variants that have been released; there isn't another PyTorch impl that does. There are weights for all of them too, but for many variants only the weights ported from Tensorflow, as those were trained by Google with extensive TPU resources. Those are all prefixed by tf_; the models without the tf_ prefix were trained by myself in PyTorch with the example training scripts.

There is a difference between the two kinds of models: the tf_ ones use padding code that requires access to the input size to reproduce Tensorflow's SAME padding and maintain the accuracy of the models. This adds a slight overhead and can make exporting to ONNX a bit more complicated (still doable). Other PyTorch versions of EfficientNet that use the TF weights only have this mode. The non-tf_ models in my impl use PyTorch-style padding.
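For anyone curious, the gist of the dynamic padding is computing the pad amount from the actual input size on every forward. A simplified sketch of the idea (not timm's exact code):

    import math
    import torch.nn.functional as F

    def pad_same(x, kernel_size, stride, dilation=1):
        # TF 'SAME' padding: choose padding so output size == ceil(input / stride);
        # the amount depends on the runtime input size, unlike static PyTorch padding
        ih, iw = x.shape[-2:]
        pad_h = max((math.ceil(ih / stride) - 1) * stride + (kernel_size - 1) * dilation + 1 - ih, 0)
        pad_w = max((math.ceil(iw / stride) - 1) * stride + (kernel_size - 1) * dilation + 1 - iw, 0)
        # TF splits uneven padding with the extra pixel on the bottom/right
        return F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2])

This per-forward computation is the slight overhead mentioned above, and the input-dependent logic is part of what complicates ONNX export.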

Re @ModdingLeo, all of the models use different crops, as per the Tensorflow impl. I threw in some extra crop options for a few of my models that were trained with heavy augs, as they improved performance at larger crops, which doesn't usually happen with normal aug.
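For the crop question, timm can build evaluation transforms that match a model's native resolution and crop_pct. A quick sketch (assuming a timm version that includes the data helpers):

    import timm
    from timm.data import resolve_data_config, create_transform

    # efficientnet_b3 ships with a native input size and a crop pct below 1.0
    model = timm.create_model('efficientnet_b3', pretrained=True)
    config = resolve_data_config({}, model=model)
    print(config['input_size'], config['crop_pct'])

    # eval transform: resize by 1/crop_pct, then center-crop to input_size
    transform = create_transform(**config)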


Please correct me if I’m wrong, as I’m just learning :sweat_smile:

So according to you, this piece of code

    import timm
    from pprint import pprint

    model_names = timm.list_models('tf_*')
    pprint(model_names)

will print all the models which use Tensorflow padding and are a bit more complicated to export to ONNX. Only the early EfficientNet models (like efficientnet_b3a) have been trained by you in PyTorch, so they are available with pre-trained weights and are easily exported to ONNX, as they use PyTorch-style padding.

Also, thanks a lot for the timm library that has made our lives so easy, and sorry for not being fully aware of the models present in the library.
I hope you can clear up my doubt.

For listing, you can set a flag to only show the models with pretrained weights, i.e. anything with efficientnet in the name that has pretrained weights:

timm.list_models('*efficientnet*', pretrained=True)


Hi, in a recent digital pathology article a ShuffleNet was used.

I was wondering whether there is a way to use such a ShuffleNet, e.g. ShuffleNet V2, with fastai as well?

I also use timm for integrating architectures which are not directly available in fastai, but unfortunately there is no shufflenet v2 integrated yet.

I loaded the pretrained PyTorch model, but I don't know how to make it usable like other models, e.g. the ones from the timm library.

shufflenet_v2 = torch.hub.load('pytorch/vision:v0.6.0', 'shufflenet_v2_x1_0', pretrained=True)

Any ideas or recommendations on how to integrate ShuffleNet V2, or a pretrained PyTorch model in general, into fastai?
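Would something like this be the right direction? A minimal sketch, assuming an existing fastai DataLoaders named dls, that fastai's Learner accepts any plain nn.Module, and that torchvision's ShuffleNetV2 exposes its head as fc:

    import torch
    from torch import nn
    from fastai.vision.all import Learner, accuracy

    shufflenet_v2 = torch.hub.load('pytorch/vision:v0.6.0', 'shufflenet_v2_x1_0', pretrained=True)
    # swap the ImageNet head for one matching the number of classes in dls
    shufflenet_v2.fc = nn.Linear(shufflenet_v2.fc.in_features, dls.c)
    learn = Learner(dls, shufflenet_v2, metrics=accuracy)
    learn.fine_tune(3)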

Hi, thanks for your great work!

Is it ok if I ask regarding a model from the timm package?

I was wondering what the difference is between efficientnet_b3 and efficientnet_b3a?
Is the input size the only/main difference, and what does that mean in practice? Does that mean that if I wanted to use slightly larger images, I should use efficientnet_b3a instead of b3? (sorry, I am still quite a beginner)

@register_model
def efficientnet_b3(pretrained=False, **kwargs):
    """ EfficientNet-B3 """
    # NOTE for train, drop_rate should be 0.3, drop_path_rate should be 0.2
    model = _gen_efficientnet(
        'efficientnet_b3', channel_multiplier=1.2, depth_multiplier=1.4, pretrained=pretrained, **kwargs)
    return model


@register_model
def efficientnet_b3a(pretrained=False, **kwargs):
    """ EfficientNet-B3 @ 320x320 w/ 1.0 test crop-pct """
    # NOTE for train, drop_rate should be 0.3, drop_path_rate should be 0.2
    model = _gen_efficientnet(
        'efficientnet_b3a', channel_multiplier=1.2, depth_multiplier=1.4, pretrained=pretrained, **kwargs)
    return model

For reference: I want to use patches from pathology images with sizes of 128x128, 256x256, and up to 512x512 pixels.

And both models use the same updated weights from version 0.3.2 from August, I guess?

Aug 12, 2020

  • New/updated weights from training experiments
    • EfficientNet-B3 - 82.1 top-1 (vs 81.6 for official with AA and 81.9 for AdvProp)

Kind regards!

TLDR: for your use case either will be fine, since you are fine-tuning at a different resolution.

Same URL for the weights, so they're exactly the same; the only difference is the input size and validation crop factor specified in the cfg. Each model entry-point function is tied to one pretrained weight, with an associated default_cfg that's used for setting up the data pipeline right now. I created a variation of the name since the B3 I trained showed better scaling than most and I wanted to illustrate that without changing the standard cfg. It's more common to see a model's validation performance tail off quickly if you move away from the 0.875-0.9 crop on ImageNet and away from the model's native training resolution.
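You can compare the two cfgs directly; a quick sketch:

    import timm

    # same 'url', different 'input_size' / 'crop_pct' in the default_cfg
    for name in ('efficientnet_b3', 'efficientnet_b3a'):
        cfg = timm.create_model(name, pretrained=False).default_cfg
        print(name, cfg['url'], cfg['input_size'], cfg['crop_pct'])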


thank you!

The performance of the efficientnet_b2a I am using is great.
However, I am a little confused:
ResNet18 has about 11 million params, efficientnet_b2a about 9 million.

Nonetheless, the max batch size I can use with an image size of 512x512 pixels is 130 for the ResNet18, but only 55 for the efficientnet_b2a.

Am I doing something wrong?

I used the implementation from muellerzr.

Kind regards

Nope, that’s the way it is. EfficientNet and other models in the InvertedResidual/MBConv family make heavy use of depthwise separable convs. EfficientNet also uses SiLU (Swish) activations. You get an increase in parameter efficiency (performance per parameter) but on current GPU hardware + software stacks you won’t really see a corresponding increase in throughput or decrease in memory usage for the performance level of your network.
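To see the parameter side of this concretely, a quick sketch (assuming both model names exist in the installed timm):

    import timm

    # parameter counts alone don't predict activation memory,
    # which is what limits batch size at 512x512
    for name in ('resnet18', 'efficientnet_b2'):
        model = timm.create_model(name, pretrained=False)
        n_params = sum(p.numel() for p in model.parameters())
        print(f'{name}: {n_params / 1e6:.1f}M params')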


I was wondering, @rwightman, if you might expand on this regarding padding? I want to use tf_efficientnet_l2_ns but am hesitating, since I'm not sure that I fully understand the issues.

Also, you mention using different crops and options in your training. Would you recommend we use what you did? I have been experimenting with various augmentations using resnets, but this post made me wonder.

I'm using Colab and trying to pip install wwf. However, I'm getting this error:
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: ‘/tmp/pip-install-80wjy09w/wwf/wwf-0.0.10.dist-info/WHEEL’
How should I resolve this? Thanks

Edit: looks like it's related to the newest version, updated an hour ago. I downgraded and it works.
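In case it helps anyone hitting this before a fix lands, pinning below the broken release is one workaround (sketch only):

    pip install "wwf<0.0.10"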

Thanks for making me aware of this, will look into what happened

For whatever reason it failed on the last release, should be working now @nlhnwng


AttributeError: 'SqueezeExcite' object has no attribute 'gate'

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-24-50428ae238c4> in <module>()
----> 1 learner10JUNE_73Cls_effb7_ns.predict('/content/drive/MyDrive/ColabWork/ClientsWork/Julian/Datasets/Gradio-OnlineTest-Images/Cyprinus_carpio.jpeg')

20 frames
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in predict(self, item, rm_type_tfms, with_input)
    264     def predict(self, item, rm_type_tfms=None, with_input=False):
    265         dl = self.dls.test_dl([item], rm_type_tfms=rm_type_tfms, num_workers=0)
--> 266         inp,preds,_,dec_preds = self.get_preds(dl=dl, with_input=True, with_decoded=True)
    267         i = getattr(self.dls, 'n_inp', -1)
    268         inp = (inp,) if i==1 else tuplify(inp)

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, cbs, **kwargs)
    251         if with_loss: ctx_mgrs.append(self.loss_not_reduced())
    252         with ContextManagers(ctx_mgrs):
--> 253             self._do_epoch_validate(dl=dl)
    254             if act is None: act = getattr(self.loss_func, 'activation', noop)
    255             res = cb.all_tensors()

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _do_epoch_validate(self, ds_idx, dl)
    201         if dl is None: dl = self.dls[ds_idx]
    202         self.dl = dl
--> 203         with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
    204 
    205     def _do_epoch(self):

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    161 
    162     def _with_events(self, f, event_type, ex, final=noop):
--> 163         try: self(f'before_{event_type}');  f()
    164         except ex: self(f'after_cancel_{event_type}')
    165         self(f'after_{event_type}');  final()

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in all_batches(self)
    167     def all_batches(self):
    168         self.n_iter = len(self.dl)
--> 169         for o in enumerate(self.dl): self.one_batch(*o)
    170 
    171     def _do_one_batch(self):

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in one_batch(self, i, b)
    192         b = self._set_device(b)
    193         self._split(b)
--> 194         self._with_events(self._do_one_batch, 'batch', CancelBatchException)
    195 
    196     def _do_epoch_train(self):

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    161 
    162     def _with_events(self, f, event_type, ex, final=noop):
--> 163         try: self(f'before_{event_type}');  f()
    164         except ex: self(f'after_cancel_{event_type}')
    165         self(f'after_{event_type}');  final()

/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _do_one_batch(self)
    170 
    171     def _do_one_batch(self):
--> 172         self.pred = self.model(*self.xb)
    173         self('after_pred')
    174         if len(self.yb):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py in forward(self, input)
    137     def forward(self, input):
    138         for module in self:
--> 139             input = module(input)
    140         return input
    141 

... (repeated identical torch/nn/modules/module.py and container.py frames omitted) ...

/usr/local/lib/python3.7/dist-packages/timm/models/efficientnet_blocks.py in forward(self, x)
    120         x = self.act1(x)
    121 
--> 122         x = self.se(x)
    123 
    124         x = self.conv_pw(x)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/timm/models/efficientnet_blocks.py in forward(self, x)
     45         x_se = self.act1(x_se)
     46         x_se = self.conv_expand(x_se)
---> 47         return x * self.gate(x_se)
     48 
     49 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1129                 return modules[name]
   1130         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1131             type(self).__name__, name))
   1132 
   1133     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'SqueezeExcite' object has no attribute 'gate'

Does anybody know why I am getting this weird error while making predictions with a learner trained with fastai (with the help of wwf and timm)?
Thanks

I am getting the same bug as you. :sob: If you have solved this problem, please tell me how to do it. Thanks :pray:
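This AttributeError usually points to a timm version mismatch: the pickled learner contains a SqueezeExcite module created under an older timm release, while the timm installed at prediction time references self.gate in efficientnet_blocks.py. One workaround is to reinstall the timm version that was active when the learner was trained/exported and reload it; the version number below is a placeholder, not a confirmed fix:

    # use whatever timm version the learner was exported with (placeholder version)
    pip install timm==0.4.12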