Training VGG16 from scratch

msp · July 31, 2017, 8:47pm

I was wondering: has anyone tried to train VGG16 from scratch on the original ImageNet data (i.e. without pretrained weights)? If so, how many GPUs did you use and how long did it take? (I am deciding whether to train VGG-like models from scratch on my GTX 1080 Ti.)

jeremy · July 31, 2017, 10:46pm

We do something similar in part 2 (lesson 9 & 10 IIRC) and have tips for dealing with larger datasets there.

msp · August 1, 2017, 1:54pm

Thanks! Looking forward to it.

I requested access to the ImageNet data a few days ago (using an email address with an uncommon domain, as requested on the documentation page) but haven’t heard back yet.

You mentioned here that in your experience access was granted instantly – were you using an email address from an academic institution?

machinethink · August 2, 2017, 8:14am

Note that you can also download the ImageNet data (plus loads of other datasets) from http://academictorrents.com

rashudo · August 2, 2017, 2:48pm

Imagenet is now also available on Kaggle

msp · August 4, 2017, 8:40am

The ImageNet data on Kaggle is a pre-release for the 2018 challenge (if I understood correctly), whereas VGG was trained on the data from 2013.

msp · August 4, 2017, 8:43am

Cool link! Although the 2013 training data doesn’t seem to be there.

singleroc · April 12, 2018, 6:40am

@msp

Have you finished training VGG16 on ImageNet? I’m also looking into this problem. My loss decreased quickly at first, down to about 2.0 (the corresponding top-1/top-5 accuracy is about 50%/70%). However, it will not go any lower no matter what I try. I’d be grateful if someone could share their experience.
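
For anyone comparing numbers like these, here is a minimal sketch of how top-1/top-5 accuracy and the cross-entropy loss are usually computed. It is plain Python on made-up toy data (3 classes, so top-2 stands in for top-5); the names `top_k_accuracy` and `mean_cross_entropy` are illustrative assumptions, not code from this thread or from any particular library.

```python
import math


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


def mean_cross_entropy(probs, labels):
    """Average negative log-probability assigned to the true class."""
    total = sum(math.log(row[label]) for row, label in zip(probs, labels))
    return -total / len(labels)


# Toy predictions (hypothetical): each row is a probability distribution over 3 classes.
probs = [
    [0.70, 0.20, 0.10],
    [0.35, 0.40, 0.25],
    [0.20, 0.30, 0.50],
    [0.10, 0.60, 0.30],
]
labels = [0, 0, 2, 2]

top1 = top_k_accuracy(probs, labels, k=1)
top2 = top_k_accuracy(probs, labels, k=2)
loss = mean_cross_entropy(probs, labels)
print(f"top-1 = {top1:.2f}, top-2 = {top2:.2f}, loss = {loss:.3f}")

# Reference points for a 1000-class problem such as ImageNet classification:
chance_loss = math.log(1000)      # loss of a uniform guess over 1000 classes
mean_true_prob = math.exp(-2.0)   # geometric-mean true-class probability at loss 2.0
print(f"chance loss = {chance_loss:.2f}, exp(-2.0) = {mean_true_prob:.3f}")
```

As a rough reading aid: a uniform guess over 1000 classes gives a loss of ln(1000), about 6.91, while a loss of 2.0 means the model assigns the correct class a geometric-mean probability of about 0.135. This only puts the reported figures in context; it does not say why training stalls.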

msp · April 12, 2018, 7:14am

@singleroc No, eventually my GPU was doing other learning tasks almost around the clock, and I never went back to train VGG16 from scratch. Also, I never got a hold of the 2013 data.

singleroc · April 12, 2018, 7:37am

Thanks for your quick reply. The ImageNet dataset seems easy to obtain now. The official website provides a link to the dataset, and copies are also hosted on some cloud drives such as Baidu Pan and Google Drive. You might give it a try if you are still interested at some point.