ImageNette/Woof Leaderboards - guidelines for proving new high scores?

I finally figured out (again) what the issue with the current baseline is. I will detail it here, and I think you will see why I’ve been telling people to be cautious when comparing new ideas to the leaderboard.

The baseline runs the code in train_imagenette.py [1].

Check this part of the code:

```python
bs_rat = bs/256
if gpu is not None: bs_rat *= num_distrib()
if not gpu: print(f'lr: {lr}; eff_lr: {lr*bs_rat}; size: {size}; alpha: {alpha}; mom: {mom}; eps: {eps}')
lr *= bs_rat
```

When I run train_imagenette on 1 GPU with bs = 64, my learning rate gets divided by 4! And with 4 GPUs, the learning rate merely stays the same, whereas we would actually want to increase it.
I think this is a relic of the hardcoded bs of 256 in train_imagenet.py [2] …
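To make the effect concrete, here is a minimal standalone sketch of that scaling logic (num_gpus stands in for fastai’s num_distrib(); the function name is just for illustration):

```python
def effective_lr(lr, bs, gpu=None, num_gpus=1):
    """Reproduce the lr scaling from train_imagenette.py (simplified)."""
    bs_rat = bs / 256          # batch size relative to a hardcoded 256
    if gpu is not None:        # distributed run: scale by the GPU count
        bs_rat *= num_gpus
    return lr * bs_rat

# Single GPU, bs=64: bs_rat = 0.25, so the lr is divided by 4.
print(effective_lr(3e-3, 64))                      # 0.00075
# 4 GPUs, bs=64: bs_rat = 0.25 * 4 = 1, so the lr stays unchanged.
print(effective_lr(3e-3, 64, gpu=0, num_gpus=4))   # 0.003
```

So the scaling only cancels out when bs × num_gpus happens to equal 256.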

Let’s compare some results between using the intended lr/4 and the intended lr:

| Dataset | Epochs | Size | Accuracy | Params | GPUs |
|---|---|---|---|---|---|
| Imagenette | 5 | 128 | 85.36% [4] | `%run train_imagenette.py --epochs 5 --bs 64 --lr 12e-3 --mixup 0` | 1 |
| Imagenette | 5 | 128 | 82.9% [3] | `%run train_imagenette.py --epochs 5 --bs 64 --lr 3e-3 --mixup 0` | 1 |

The first row passes a learning rate of 12e-3 but gets an effective lr of 3e-3. The second row passes lr = 3e-3 and gets an effective lr of 0.00075.
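The arithmetic is just the bs_rat = 64/256 = 0.25 factor applied to each passed-in lr:

```python
bs_rat = 64 / 256           # 0.25 on a single-GPU run with bs=64
print(12e-3 * bs_rat)       # effective lr for the first row:  0.003
print(3e-3 * bs_rat)        # effective lr for the second row: 0.00075
```

In other words, the two rows compare effective learning rates of 3e-3 and 7.5e-4, not 12e-3 and 3e-3.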

[1] https://github.com/fastai/fastai/blob/master/examples/train_imagenette.py
[2] https://github.com/fastai/fastai/blob/master/examples/train_imagenet.py
[3] np.mean([83.8,83.8,81.8,82.4,81.8,85,85,80.4,83,82])
[4] np.mean([86,85.4,84.4,85.2,84.8,85,85.6,85.4,85.4,86.4])
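(The two accuracy figures above are plain means over 10 runs each; they can be checked with the stdlib alone:)

```python
from statistics import mean

runs_lr_3e3 = [83.8, 83.8, 81.8, 82.4, 81.8, 85, 85, 80.4, 83, 82]    # [3]
runs_lr_12e3 = [86, 85.4, 84.4, 85.2, 84.8, 85, 85.6, 85.4, 85.4, 86.4]  # [4]

print(round(mean(runs_lr_12e3), 2))  # 85.36
print(round(mean(runs_lr_3e3), 2))   # 82.9
```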
