You need to install wwf as well (pip install wwf). I should probably make that clearer in the article.
Yes sir, you can just include wwf in the requirements. Thanks a lot
One more update: wwf and timm don’t have pre-trained weights for efficientnet_b4 through efficientnet_b8, efficientnet_l2, efficientnet_el and many more. So you might get a warning like:
Pretrained model URL is invalid, using random initialization.
So these models should be chosen only when the model is to be trained from scratch, not when you just want to fine_tune the last fc layers.
First-time efficientnet user here.
Looking at timm, there are different resolutions and crops? I presume this means I need to adjust tfms to match this?
Could someone provide/explain an example with efficientnet_b3 and one with crop != 1.0?
@avenio timm covers all of the efficientnet variants that have been released; there isn’t another PyTorch impl that does. There are weights for all of them too, but only as weights ported from Tensorflow, since those were trained by Google with extensive TPU resources. They are all prefixed by tf_. The models without the tf_ prefix were trained by myself in PyTorch with the example training scripts.
There is a difference between the two sets of models: the tf_ ones use padding code that requires access to the input size to reproduce Tensorflow’s SAME padding and maintain the accuracy of the models. This adds a slight overhead and can make exporting to ONNX a bit more complicated (still doable). Other PyTorch versions of EfficientNet that use the TF weights only have this mode. The non-tf_ models in my impl use PyTorch-style padding.
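To make the padding point concrete, here is a minimal sketch of the arithmetic for one spatial dimension. It is not timm's code; the function names are made up for illustration, and the symmetric formula is just one common fixed-padding choice. It shows that the SAME-style padding changes with the input size, while the fixed padding depends only on kernel, stride and dilation.

```python
import math

def tf_same_padding(in_size, kernel, stride, dilation=1):
    """Return (pad_before, pad_after) for one spatial dim under TF 'SAME' rules."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + (kernel - 1) * dilation + 1 - in_size, 0)
    # TF puts the extra pixel (if any) at the end
    return total // 2, total - total // 2

def symmetric_padding(kernel, stride=1, dilation=1):
    """Fixed per-side padding that depends only on the layer hyperparameters."""
    return ((stride - 1) + dilation * (kernel - 1)) // 2

# 3x3 conv with stride 2: SAME padding differs between even and odd input sizes
for size in (224, 225):
    print(size, tf_same_padding(size, kernel=3, stride=2), symmetric_padding(3, 2))
```

Because the SAME result depends on in_size, it has to be computed from the actual input at run time, which is where the small overhead and the extra care during export come from.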
Re @ModdingLeo, all of the models use different crops as per Tensorflow impl. I threw in some extra crop options for a few of my models that were trained with heavy augs as they improved performance at larger crops, which doesn’t usually happen with normal aug.
Please correct me if I’m wrong, as I’m just learning
So according to you, this piece of code
import timm
from pprint import pprint
model_names = timm.list_models('tf_*')
pprint(model_names)
will print all the models that use Tensorflow padding and are a bit more complicated to export to ONNX. Only the earlier efficientnet models (like efficientnet_b3a) have been trained by you in PyTorch, so they are available with pre-trained weights and are easily exported to ONNX, as they use PyTorch-style padding.
Also, thanks a lot for the timm library that has made our lives so easy, and sorry for not being fully aware of the models present in the library.
I hope you clear my doubt.
For listing, you can set a flag to only show the models with pretrained weights… ie for anything with efficientnet in the name that has pretrained weights…
timm.list_models('*efficientnet*', pretrained=True)
Hi, in a recent digital pathology article there was a shufflenet used.
I was wondering whether there is a way to use such a shufflenet e.g. shufflenet v2 with fastai as well?
I also use timm for integrating architectures which are not directly available in fastai, but unfortunately there is no shufflenet v2 integrated yet.
I loaded the pretrained pytorch model, but I don’t know how to make it usable as other models with the timm library for example.
shufflenet_v2 = torch.hub.load('pytorch/vision:v0.6.0', 'shufflenet_v2_x1_0', pretrained=True)
Any ideas or recommendations on how to integrate shufflenet v2, or a pretrained PyTorch model in general, into fastai?
Hi, thanks for your great work!
Is it ok if I ask regarding a model from the timm package?
I was wondering what the difference is between efficientnet_b3 and efficientnet_b3a?
Is the input size the only/main difference? What does that mean in practice? - Does that mean if I wanted to use slightly taller images, then I should use efficientnet-b3a instead of b3? (sorry I am still quite a beginner)
@register_model
def efficientnet_b3(pretrained=False, **kwargs):
    """ EfficientNet-B3 """
    # NOTE for train, drop_rate should be 0.3, drop_path_rate should be 0.2
    model = _gen_efficientnet(
        'efficientnet_b3', channel_multiplier=1.2, depth_multiplier=1.4, pretrained=pretrained, **kwargs)
    return model

@register_model
def efficientnet_b3a(pretrained=False, **kwargs):
    """ EfficientNet-B3 @ 320x320 w/ 1.0 test crop-pct """
    # NOTE for train, drop_rate should be 0.3, drop_path_rate should be 0.2
    model = _gen_efficientnet(
        'efficientnet_b3a', channel_multiplier=1.2, depth_multiplier=1.4, pretrained=pretrained, **kwargs)
    return model
For reference: I want to use patches from pathology images with sizes of 128x128, 256x256 and up to 512x512 pixels.
And both models use the same updated weights in version 0.3.2 from August I guess?
Aug 12, 2020
- New/updated weights from training experiments
- EfficientNet-B3 - 82.1 top-1 (vs 81.6 for official with AA and 81.9 for AdvProp)
Kind regards!
TL;DR: for your use either will be fine, since you are fine-tuning at a different resolution.
The url for the weights is the same, so the two are exactly the same model; the only difference is the input size and validation crop factor specified in the cfg. Each model entry point function is tied to one pretrained weight with an associated default_cfg that’s used for setting up the data pipeline right now. I created a variation in the name since the B3 I trained showed better scaling than most and I wanted to illustrate that without changing the standard cfg. It’s more common to see a model’s validation performance tail off quickly if you move away from the 0.875-0.9 crop on ImageNet and away from the model’s native training resolution.
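For anyone wondering what the crop factor means for the validation transforms, here is a small sketch of the usual arithmetic: resize the shorter side to floor(img_size / crop_pct), then center-crop img_size. I believe this matches how timm builds its eval pipeline from the cfg, but treat that as an assumption; the function name is made up. The 320 / 1.0 pair comes from the efficientnet_b3a docstring above, and 224 / 0.875 is the standard ImageNet setting.

```python
import math

def eval_resize_and_crop(img_size, crop_pct):
    """Return (resize_to, crop_to) for a validation pipeline with a center crop."""
    resize_to = int(math.floor(img_size / crop_pct))
    return resize_to, img_size

# standard ImageNet setting: resize to 256, center-crop 224
print(eval_resize_and_crop(224, 0.875))
# crop-pct 1.0 at 320x320: the resize target equals the crop size
print(eval_resize_and_crop(320, 1.0))
```

With crop_pct = 1.0 the center crop takes the full shorter side rather than a central region, so nothing is trimmed along that dimension. When fine-tuning on your own data at a different resolution, you set the sizes in your own tfms and these defaults mostly matter for reproducing the ImageNet validation numbers.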
thank you!
The performance of the efficientnet-b2a I am using is great.
However, I am a little confused:
Resnet18 has about 11 million params, efficientnet-b2a about 9 million params.
Nonetheless, the max batch size I can use with an image size of 512x512 pixels is 130 for the resnet18 and only 55 for the efficientnet-b2a.
Am I doing something wrong?
I used the implementation from muellerzr.
Kind regards
Nope, that’s the way it is. EfficientNet and other models in the InvertedResidual/MBConv family make heavy use of depthwise separable convs. EfficientNet also uses SiLU (Swish) activations. You get an increase in parameter efficiency (performance per parameter) but on current GPU hardware + software stacks you won’t really see a corresponding increase in throughput or decrease in memory usage for the performance level of your network.
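A rough back-of-the-envelope sketch of why parameter count and memory don't track each other: it compares one standard conv with a depthwise separable replacement, counting weights and the output activations that must be kept for the backward pass. The layer sizes are arbitrary illustration values, and biases, norm layers, stride and expansion ratios are ignored.

```python
def standard_conv(c_in, c_out, k, h, w):
    """Weights and stored output activations for a plain k x k conv (stride 1)."""
    params = c_in * c_out * k * k
    activations = c_out * h * w
    return params, activations

def depthwise_separable_conv(c_in, c_out, k, h, w):
    """Same for a k x k depthwise conv followed by a 1x1 pointwise conv."""
    params = c_in * k * k + c_in * c_out
    # both the depthwise output and the pointwise output are kept for backward
    activations = c_in * h * w + c_out * h * w
    return params, activations

print(standard_conv(64, 64, 3, 56, 56))             # (36864, 200704)
print(depthwise_separable_conv(64, 64, 3, 56, 56))  # (4672, 401408)
```

In this toy case the separable version has roughly 8x fewer weights but stores twice as many activations, and at training time activation memory scales with batch size and image size while weight memory does not.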
I was wondering, @rwightman, if you might expand on this regarding padding? I want to use tf_efficientnet_l2_ns but am hesitating since I’m not sure that I fully understand the issues.
Also, you mention using different crops and options in your training. Would you recommend we utilize what you do? I have been experimenting with various different augmentations using resnets but this post made me wonder.
I’m using colab and trying to pip install wwf. However I’m getting this error:
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: ‘/tmp/pip-install-80wjy09w/wwf/wwf-0.0.10.dist-info/WHEEL’
How should I resolve this? Thanks
Edit: looks like it’s related to the newest version, updated an hour ago. I downgraded and it works.
Thanks for making me aware of this, will look into what happened
For whatever reason it failed on the last release, should be working now @nlhnwng
AttributeError: ‘SqueezeExcite’ object has no attribute ‘gate’
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-24-50428ae238c4> in <module>()
----> 1 learner10JUNE_73Cls_effb7_ns.predict('/content/drive/MyDrive/ColabWork/ClientsWork/Julian/Datasets/Gradio-OnlineTest-Images/Cyprinus_carpio.jpeg')
20 frames
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in predict(self, item, rm_type_tfms, with_input)
264 def predict(self, item, rm_type_tfms=None, with_input=False):
265 dl = self.dls.test_dl([item], rm_type_tfms=rm_type_tfms, num_workers=0)
--> 266 inp,preds,_,dec_preds = self.get_preds(dl=dl, with_input=True, with_decoded=True)
267 i = getattr(self.dls, 'n_inp', -1)
268 inp = (inp,) if i==1 else tuplify(inp)
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, cbs, **kwargs)
251 if with_loss: ctx_mgrs.append(self.loss_not_reduced())
252 with ContextManagers(ctx_mgrs):
--> 253 self._do_epoch_validate(dl=dl)
254 if act is None: act = getattr(self.loss_func, 'activation', noop)
255 res = cb.all_tensors()
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _do_epoch_validate(self, ds_idx, dl)
201 if dl is None: dl = self.dls[ds_idx]
202 self.dl = dl
--> 203 with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
204
205 def _do_epoch(self):
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
161
162 def _with_events(self, f, event_type, ex, final=noop):
--> 163 try: self(f'before_{event_type}'); f()
164 except ex: self(f'after_cancel_{event_type}')
165 self(f'after_{event_type}'); final()
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in all_batches(self)
167 def all_batches(self):
168 self.n_iter = len(self.dl)
--> 169 for o in enumerate(self.dl): self.one_batch(*o)
170
171 def _do_one_batch(self):
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in one_batch(self, i, b)
192 b = self._set_device(b)
193 self._split(b)
--> 194 self._with_events(self._do_one_batch, 'batch', CancelBatchException)
195
196 def _do_epoch_train(self):
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
161
162 def _with_events(self, f, event_type, ex, final=noop):
--> 163 try: self(f'before_{event_type}'); f()
164 except ex: self(f'after_cancel_{event_type}')
165 self(f'after_{event_type}'); final()
/usr/local/lib/python3.7/dist-packages/fastai/learner.py in _do_one_batch(self)
170
171 def _do_one_batch(self):
--> 172 self.pred = self.model(*self.xb)
173 self('after_pred')
174 if len(self.yb):
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py in forward(self, input)
137 def forward(self, input):
138 for module in self:
--> 139 input = module(input)
140 return input
141
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py in forward(self, input)
137 def forward(self, input):
138 for module in self:
--> 139 input = module(input)
140 return input
141
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py in forward(self, input)
137 def forward(self, input):
138 for module in self:
--> 139 input = module(input)
140 return input
141
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py in forward(self, input)
137 def forward(self, input):
138 for module in self:
--> 139 input = module(input)
140 return input
141
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/timm/models/efficientnet_blocks.py in forward(self, x)
120 x = self.act1(x)
121
--> 122 x = self.se(x)
123
124 x = self.conv_pw(x)
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/timm/models/efficientnet_blocks.py in forward(self, x)
45 x_se = self.act1(x_se)
46 x_se = self.conv_expand(x_se)
---> 47 return x * self.gate(x_se)
48
49
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
1129 return modules[name]
1130 raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1131 type(self).__name__, name))
1132
1133 def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:
AttributeError: 'SqueezeExcite' object has no attribute 'gate'
Does anybody know why I am getting this weird error while making predictions with a fastai-trained learner (trained with the help of wwf and timm)?
Thanks
Thanks…
I am getting the same bug as you. If you have solved this problem, please tell me how you did it. Thanks
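The thread does not establish the cause, so take this as a guess rather than a fix. A missing attribute on a module at predict time can happen when a learner pickled under one version of a library is loaded under another, because the unpickled object keeps the attributes it was created with. A cheap first check is to compare installed versions in the training and inference environments. The helper below uses only the standard library; installed_versions is a made-up name and the package list is simply taken from this thread.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# run this in both environments and compare the output
print(installed_versions(["timm", "fastai", "wwf", "torch"]))
```

If the versions differ, installing the version used at training time in the inference environment, or re-exporting the learner under the newer version, is worth trying before digging further.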