Imagine my surprise when I tried to pass in VGG16 for the model, and couldn’t find it supported out of the box! I thought VGG was a staple for any kind of conv-net experiment, and often the first go-to pretrained model.
I wanted to explore adding VGG16 to the fastai lib - maybe not for all possible scenarios, but at least for the basic fitting part. I also thought it’d give me a chance to work with the code base and get a little familiar with it.
Would you have any objections? Also, what are the major impediments you can think of right away in trying to do that?
No objections! It should work fine - it may need some tweaks, but it will be a good opportunity for you to dig into the fastai lib. Check out how I define things like dn121 to see what kind of tweaks may be needed.
(The library always adds global pooling after the conv layers BTW, so you won’t exactly end up with VGG, but a fully convolutional equivalent).
I am happy to help if you have questions.
My guess is that it has something to do with this image:
Accuracy is on the vertical axis, number of operations is on the horizontal, and the size of the model is represented by the size of the point. As you can see, VGG is larger than ResNet and Inception by a pretty large margin in both model size and number of operations, and isn’t as accurate.
Which isn’t to say you shouldn’t build it and add it. It’s an interesting architecture and relatively straightforward to build yourself, and you’ll learn a lot by doing so.
I’ve been working on it for eight hours straight, and still can’t get it to work… It’s been an interesting challenge though! It’s definitely forcing me to carefully step through every line of the code and, more importantly, to come up with a good debugging workflow for Python for the first time… and so on and so forth…
The main reason VGG16 is genuinely so important is that many architectures for problems like image segmentation and multi-label classification seem to prefer VGG16 as the base feature extractor. Hence, VGG is definitely not obsolete by any standard. Besides, in other problems I’ve occasionally found VGG16 to give me that extra edge on loss and accuracy.
But despite all that… the challenge of this integration has been one helluva ride… let’s see. Feels so close, yet so far. Thanks for offering the help though.
First trial on pretrained VGG16 gives an accuracy of 98.5%… not bad, if you ask me.
Thanks for the commit @apil.tamang! I ended up doing it a little differently - you can see my version here: https://github.com/fastai/fastai/commit/1c08462982445b963bbb4f2f4f77fd7f20b23d9d . As you see, I decided to use the batchnorm version, since it’s quite a bit better.
Much to my surprise, I’m getting 99.4% accuracy without TTA! Because I’m treating it as fully convolutional, and doing some tricks in the fastai library with pooling, it’s not really VGG any more - only the conv layers are VGG. With the original VGG last year on Keras I was getting 98.3% IIRC.