There seem to be multiple releases of the ImageNet dataset. Is the “ImageNet Fall 2011 release” the right one for the object classification benchmark?
Does anyone know the ballpark training time for ImageNet on a single 1080 Ti?
Also, I have an idea I want to try. During the Kaggle Cdiscount challenge (which also had a large dataset: 15 million images at 180x180 resolution), I tried the following method to speed up training:
1. Train normally for a few epochs.
2. Freeze the first one (or two) layer groups and continue training.
- In my case, this let me double the batch size, and the training time per epoch was halved.
3. Unfreeze all layers from time to time.
(Or it might be better to accumulate gradients from the unfrozen layers and backprop them through the frozen layers from time to time?)
4. Repeat steps 2-3.
The idea is that the first/second layer groups take up quite a lot of memory/computation, but their weights change very little after some epochs (especially with pretrained weights), so they don’t need to be updated on every batch.
Since I kept changing training parameters during the competition, I don’t know whether this actually helped in terms of training time or accuracy.
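The freeze/unfreeze cycle above can be sketched in plain PyTorch. This is a minimal toy example, not my competition code: the model, layer-group split, and learning rates are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy model split into "layer groups"; in practice these would be the
# early blocks of a pretrained network like Xception or ResNet.
model = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),   # first group
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),  # later group
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10)),
)

def set_frozen(group, frozen):
    """Toggle requires_grad for every parameter in a layer group."""
    for p in group.parameters():
        p.requires_grad = not frozen

# Step 2: freeze the first group. Frozen parameters get no gradients,
# which skips their backward-pass compute and lets a larger batch fit.
set_frozen(model[0], True)
trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.SGD(trainable, lr=0.01)
# ... train for a few epochs with the larger batch size ...

# Step 3: unfreeze everything from time to time so the early layers
# can still adapt, then rebuild the optimizer over all parameters.
set_frozen(model[0], False)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
```

Rebuilding the optimizer after unfreezing is the simple route; keeping one optimizer with per-group parameter groups would preserve momentum state across the cycle.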
I would like to validate this approach while applying all the fancy fast.ai features, as time allows.
Actually, I don’t have much background in ML/DL, so let me know if this approach doesn’t make much sense!
BTW, the “Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour” paper from FB mentions a ‘gradual warmup’ of the learning rate. Interestingly, it looks like CLR was already doing something similar.
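For reference, the gradual warmup idea is just a linear ramp of the learning rate over the first chunk of training before handing off to the usual schedule. A minimal sketch (the step counts and learning rates below are illustrative, not the paper's values):

```python
def warmup_lr(step, warmup_steps, base_lr, target_lr):
    """Linearly interpolate from base_lr to target_lr over warmup_steps,
    then hold target_lr (the regular schedule would take over here)."""
    if step >= warmup_steps:
        return target_lr
    frac = step / warmup_steps
    return base_lr + frac * (target_lr - base_lr)

# e.g. warm up from 0.01 to 0.1 over the first 500 steps
lrs = [warmup_lr(s, 500, 0.01, 0.1) for s in (0, 250, 500, 1000)]
```

The rising half of a CLR triangular cycle does essentially the same interpolation, which is why the two look so similar in practice.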
(Edit: I used Xception in the competition since it took almost half the time to train compared to InceptionV4 / Inception-ResNet, though with slightly lower accuracy. It may just be because I used a bad implementation, but DPN took forever to train.)