In this (https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html) blog post from Google, the authors introduce a new neural network architecture they call “EfficientNet”. EfficientNets are a family of image classification models that achieve state-of-the-art accuracy while being an order of magnitude smaller and faster than previous models.
From the PyTorch implementation of EfficientNet:
“EfficientNet PyTorch is a PyTorch re-implementation of EfficientNet. It is consistent with the original TensorFlow implementation(https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet), such that it is easy to load weights from a TensorFlow checkpoint. At the same time, we aim to make our PyTorch implementation as simple, flexible, and extensible as possible.”
I will make a pull request to https://github.com/fastai/fastai/tree/master/fastai/vision/models. Is this the right repo for the pull request? In it, I’ll include the util and model scripts of the PyTorch implementation. Is there anything else I have to do?
Glad to see that!
I’ve installed the fastai dev version with pip install git+https://github.com/fastai/fastai.git
and EfficientNet with pip install efficientnet-pytorch.
How can I use EfficientNet like learn = cnn_learner(data, models.resnet34, metrics=error_rate)?
I tried learn = Learner(data, models.efficientnet.EfficientNetB1(), metrics=error_rate), and it raised “NameError: name ‘data’ is not defined”.
With learn = Learner(data, models.efficientnet.EfficientNetB1, metrics=error_rate) instead, it raised "AttributeError: ‘function’ object has no attribute ‘to’ ".
These may be silly questions… Could you tell me how to use it? Thanks again.
I have the same problem: it raised “NameError: name ‘data’ is not defined”. data is defined, and it makes no difference whether I use:
learn = Learner( data=data,arch=models.efficientnet.EfficientNetB5(),
metrics=accuracy)
or if I use
learn = Learner( data,arch=models.efficientnet.EfficientNetB5(),
metrics=accuracy)
Here are the steps I followed to get EfficientNet working @gy0373 @agentili
!pip install efficientnet-pytorch
from fastai import *
from fastai.vision import *
from efficientnet_pytorch import EfficientNet
path = untar_data(URLs.PETS)
path_anno = path/'annotations'
path_img = path/'images'
fnames = get_image_files(path_img)
pat = r'/([^/]+)_\d+\.jpg$'
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=32
).normalize(imagenet_stats)
model = EfficientNet.from_name('efficientnet-b0')
model._fc = nn.Linear(1280, data.c)
learn = Learner(data, model)
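A small aside on the `model._fc = nn.Linear(1280, data.c)` step: the 1280 is hardcoded, and as far as I know the larger variants have a wider final layer, so it may be safer to read the width from the existing head. A minimal sketch, assuming the classifier lives on an attribute named `_fc` as in the steps above; `replace_classifier` and `TinyNet` are hypothetical names made up for illustration, and `TinyNet` is only a stand-in so the snippet runs without downloading a model.

```python
import torch
from torch import nn


def replace_classifier(model: nn.Module, n_classes: int, attr: str = "_fc") -> nn.Module:
    """Swap the final linear layer for one with n_classes outputs.

    Hypothetical helper, not part of fastai or efficientnet_pytorch.
    """
    old_head = getattr(model, attr)
    # Reuse the existing input width instead of hardcoding 1280.
    setattr(model, attr, nn.Linear(old_head.in_features, n_classes))
    return model


class TinyNet(nn.Module):
    """Stand-in model with an `_fc` head, used only to exercise the helper."""

    def __init__(self):
        super().__init__()
        self._body = nn.Linear(8, 16)
        self._fc = nn.Linear(16, 1000)

    def forward(self, x):
        return self._fc(torch.relu(self._body(x)))


model = replace_classifier(TinyNet(), n_classes=37)
out = model(torch.randn(4, 8))
print(out.shape)  # one row per sample, one column per class
```

With the real model this would presumably be `replace_classifier(model, data.c)` right before building the `Learner`.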
I didn’t do this with MNIST because these models are set up for 3-channel inputs whereas MNIST is 1-channel (grayscale). Sorry it took so long, I did not have a chance to run this until now.
Thought it was unnecessary to open a new thread for a quick question.
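If someone does want to try a grayscale dataset, one common workaround is to copy the single channel three times so the tensor has the shape a 3-channel model expects. A minimal sketch with random data standing in for a batch of 28x28 grayscale images; this is a generic PyTorch trick, not something specific to fastai or EfficientNet.

```python
import torch

# Fake batch of 8 single-channel 28x28 images: (batch, channel, height, width).
gray = torch.rand(8, 1, 28, 28)

# Repeat the one channel three times along the channel dimension.
rgb_like = gray.repeat(1, 3, 1, 1)

print(rgb_like.shape)
```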
I expect that different EfficientNets (0-7) have been pretrained targeting different resolutions. I’ve been unable to find that piece of information, though.
Or, perhaps they’ve all been trained targeting the usual ImageNet 224/299… But then I think the whole thing about upscaling the resolution would be a lot less effective, if the bulk of the net had only been exposed to low-res features.
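For what it’s worth, the reference TensorFlow implementation linked earlier in the thread does list a different input resolution per variant. Here is a small lookup with the values as I remember them from that code; treat them as an assumption and check the repository before relying on them. `input_size` is just a hypothetical helper name.

```python
# Input resolution per variant, quoted from memory of the reference
# TensorFlow implementation (an assumption; please verify).
EFFICIENTNET_RESOLUTION = {
    "efficientnet-b0": 224,
    "efficientnet-b1": 240,
    "efficientnet-b2": 260,
    "efficientnet-b3": 300,
    "efficientnet-b4": 380,
    "efficientnet-b5": 456,
    "efficientnet-b6": 528,
    "efficientnet-b7": 600,
}


def input_size(name: str) -> int:
    """Return the square input size (in pixels) listed for a variant."""
    return EFFICIENTNET_RESOLUTION[name]


for name in sorted(EFFICIENTNET_RESOLUTION):
    print(name, input_size(name))
```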
It’s strange, the whole point about EfficientNet is that it’s supposed to be, well… Efficient. It has a ludicrously low number of params w.r.t. the network capacity.
Now, I’m training with three Tesla V100 32 GB cards in parallel. All the variants train very slowly (a lot slower than an unfrozen resnet). VRAM usage is also monstrous. With b7, training with 600px images, I cannot raise the batch size over 24, since VRAM usage is around 90 GB.
Andrea,
Which PyTorch implementation are you using?
Just for testing, have you checked the memory consumption if you swap the swish activation for relu?
I’ll swap swish and relu, and let you know, thanks. But from your comment, I understand that you are not experiencing such things as you use efficientnet, right?
I meant whether you were using rwightman’s or Luke Melas’ version.
I tried Luke’s: b0 was a bit slower than resnet34 but reached a much better loss. Trying b4 and b5 was more complicated, because they were very slow on a T4 and with 512px the batch size was very small, so I gave up.
If the problem is the swish activation memory consumption you have 3 choices:
Use the manual autograd version that was shown on Kaggle (sorry, I am on a mobile phone; you can find it by searching for lower EfficientNet memory consumption)
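For readers who cannot find it, the idea behind the manual autograd version can be sketched roughly like this. It is my own minimal reconstruction of the general technique, not necessarily the exact Kaggle code: only the input is saved for the backward pass and the sigmoid is recomputed there, so autograd does not keep the intermediate tensors alive. The class names are chosen for illustration.

```python
import torch
from torch import nn


class SwishFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Save only the input; the sigmoid is recomputed in backward.
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        sig = torch.sigmoid(x)
        # d/dx [x * sigmoid(x)] = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
        return grad_output * sig * (1 + x * (1 - sig))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishFunction.apply(x)


# Quick check that the hand-written gradient matches plain autograd.
x = torch.randn(5, dtype=torch.double, requires_grad=True)
MemoryEfficientSwish()(x).sum().backward()

x_ref = x.detach().clone().requires_grad_(True)
(x_ref * torch.sigmoid(x_ref)).sum().backward()
print(torch.allclose(x.grad, x_ref.grad))
```

The trade-off is a little extra compute in the backward pass in exchange for fewer stored activations.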
Yes, I’ve also found EfficientNet is quite slow in PyTorch. I think that rwightman’s version is a bit faster, but that’s not based on particularly extensive testing. The code looks like it’s written more with performance in mind, especially the padding stuff, which is a bit weird in Luke’s. On that, you might want to ensure you’re using the fixed image size versions there, as they looked better (I think you just had to provide your image size).
I think it might be related to some issues with the depthwise convolutions in PyTorch; I’ve seen various reports of performance issues there on the forums and in the code. You might also want to try PyTorch 1.2 if possible, as there might be some improvements there (or 1.3, but I figured that if you’re on 1.1 for some reason then 1.2 might be an easier jump).
And yeah, a pretty sizeable memory drop using either the autograd or cuda versions of Swish/Mish (time is one epoch, b0, bs48, 256x256, rwightman’s, Swish).
Version    Alloc (MB)   Time
Original   6879         01:11
Autograd   5421         01:14
CUDA       5400         01:02
This is from this notebook, which has the autograd version of swish and the little wrapper you need to use swish cuda with rwightman’s (check my fork for the little change that allows specifying an activation function).
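To put the numbers from the table in relative terms, here is a quick calculation. The figures are copied from the table; reading the time column as mm:ss is my assumption.

```python
# (allocated MB, time for one epoch) copied from the table above.
results = {
    "Original": (6879, "01:11"),
    "Autograd": (5421, "01:14"),
    "CUDA": (5400, "01:02"),
}


def to_seconds(mmss: str) -> int:
    """Convert an mm:ss string to seconds (assumes that format)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


base_mb, base_time = results["Original"]
savings = {}
for name, (mb, epoch_time) in results.items():
    saved_pct = 100 * (base_mb - mb) / base_mb
    delta_s = to_seconds(epoch_time) - to_seconds(base_time)
    savings[name] = (round(saved_pct, 1), delta_s)
    print(f"{name:<9} {saved_pct:5.1f}% less memory, {delta_s:+d} s per epoch")
```

So both variants save roughly a fifth of the allocated memory, and only the CUDA one is also faster in this single run.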
Luke Melas’. Should I try rwightman’s? From what you say below, though, I don’t think it’s worth a try.
And this seems to confirm the Efnets are quite resource-demanding. I naively supposed the contrary.
I am trying to use the b7 anyhow. The maximum batch size that fits into a V100 32 GB is 6-7, with 600px images. To reach 24, I had to allocate the whole bunch of 4 cards.
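The arithmetic behind that is simple enough to write down. A tiny sketch, assuming plain data parallelism where the batch is split evenly across cards; `gpus_needed` is a hypothetical helper name.

```python
import math


def gpus_needed(total_batch_size: int, per_gpu_batch_size: int) -> int:
    """Number of cards needed if each card holds at most per_gpu_batch_size samples."""
    return math.ceil(total_batch_size / per_gpu_batch_size)


# With 6 or 7 images per card, a batch of 24 needs 4 cards either way.
print(gpus_needed(24, 6))
print(gpus_needed(24, 7))
```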
Thanks for your tips, I’ll certainly try and look into that stuff. It will be instructive, if nothing else.