ImageNette/Woof Leaderboards - guidelines for proving new high scores?

Sometimes I wonder if arxiv is full of results that haven’t been tested for statistical significance, because who would run a model on Imagenet more than once? But I reassure myself by assuming that variance must be much smaller when using the Imagenet test set.

I think that sounds really good - it needs to be both simple and robust.
So basic guidelines:
* 5 runs total, and the best result from each set (i.e. 20 epochs x 5) for the final average.
* I also agree that we should be looking for 1%+ type improvements… a 0.3% jump is not interesting.
* GPU count is not a factor other than reporting how it was run - the best score is the best score whether on 1 GPU, 4, 8, etc.


Agree.

Completely disagree :slight_smile: It's not an entrant's job to go back and check whether the current leader really optimized their architecture/lr. I have to assume, and will always assume, that whoever the current leaderboard holder is - Jeremy, you, or anyone else - they used proper LR selection.
If the leader didn't, then someone simply picking a better LR through whatever means is by rights the new leaderboard holder, b/c they showed a better result.
Example - if the current record gets beaten by someone showing that some crazy high rate like 1e-1 works great and trounces the old 3e-3… well, congrats, you made the discovery about what lr works better and you get the new high score. It's good info for everyone to see that crazy high rates work great on this architecture.
That's an example, but hopefully it makes my point that it's not at all the entrant's job to go back and reprove that the current holder optimized their architecture/lr.
Now, that said, personally I'm after testing new things like better activation functions and new optimizers to compare against the leaderboard, so I'm not likely to be putzing around with lr and claiming a new score, but if a better lr works better, then it's a new entry regardless, imo.

Yes for sure and thanks for all your feedback above. I think the leaderboards are a great way to help test new things and see if it’s really making progress or not.
I have tested so many things from papers and very few (two actually) have shown better scores. I think a lot of papers don’t really hold up on unseen datasets so ImageNette and the leaderboard serve as a great proving ground for testing out new ideas.

Yeah, I agree that ideally entrants wouldn't have to retest the baseline. However, at this point it seems necessary if we want to draw any conclusions about a new idea.

I’ll probably be satisfied once we rework the baselines, and then we can assume that new entrants have picked the best parameters for their own entry.

Edit to add: also, although it might make the tables too big, I’d like to see more than just the best entry in the leaderboard. Maybe there’s a good way to do that.


Right - if someone gets a clearly better result, I want to show that result on the leaderboard, along with the details of how they got it! :slight_smile:


I finally re-figured out what the issue with the current baseline is. I will detail it here, and I think you will see why I’ve been telling people to be cautious when comparing new ideas to the leaderboard.

The baseline runs the code in train_imagenette.py [1].

Check this part of the code:

bs_rat = bs/256
if gpu is not None: bs_rat *= num_distrib()
if not gpu: print(f'lr: {lr}; eff_lr: {lr*bs_rat}; size: {size}; alpha: {alpha}; mom: {mom}; eps: {eps}')
lr *= bs_rat

When I run train_imagenette on 1 GPU, with bs = 64, my learning rate gets divided by 4! My understanding is that, with 4 GPUs, the learning rate stays the same but we would want to increase it.
I think this is a relic of the hardcoded bs of 256 in train_imagenet.py [2]…

Let’s compare some results between using intended lr/4 and intended lr:

| Dataset | Epochs | Size | Accuracy | Params | GPUs |
| --- | --- | --- | --- | --- | --- |
| Imagenette | 5 | 128 | 85.36% [4] | %run train_imagenette.py --epochs 5 --bs 64 --lr 12e-3 --mixup 0 | 1 |
| Imagenette | 5 | 128 | 82.9% [3] | %run train_imagenette.py --epochs 5 --bs 64 --lr 3e-3 --mixup 0 | 1 |

The first line has a learning rate of 12e-3 but an effective lr of 3e-3; the second line has lr = 3e-3 and an effective lr of 0.00075.
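
To make the arithmetic explicit, here is a minimal sketch of what that scaling does with the settings in the table (bs = 64, single GPU); it reproduces the effective LRs above:

# the scaling from the snippet above, with bs=64 on a single GPU
bs, bs_rat = 64, 64 / 256          # bs_rat = 0.25
for lr in (12e-3, 3e-3):
    print(f"lr passed in: {lr}; effective lr: {lr * bs_rat}")
# lr passed in: 0.012; effective lr: 0.003
# lr passed in: 0.003; effective lr: 0.00075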

[1] https://github.com/fastai/fastai/blob/master/examples/train_imagenette.py
[2] https://github.com/fastai/fastai/blob/master/examples/train_imagenet.py
[3] np.mean([83.8,83.8,81.8,82.4,81.8,85,85,80.4,83,82])
[4] np.mean([86,85.4,84.4,85.2,84.8,85,85.6,85.4,85.4,86.4])


Nice work @Seb!
I couldn’t imagine how the number of GPUs would affect results, but it makes sense that the LR was not adjusted properly.


Regarding # of GPUs, IIRC, if we add more GPUs, we can increase the BS, and thus increase the LR.
It’s more of a rule of thumb, so results may vary, and we might not want to add this extra variable if we are testing an idea against the baseline. It depends on what your goal is.

Increasing the # of GPUs effectively increases the batch size as well, and the learning rate is meant to (roughly) scale with batch size. That’s why that line of code is there. It would certainly be interesting to hear of examples where it doesn’t work well.

I suggest replacing (in train_imagenette.py) the hardcoded 256 divisor with the input BS so that eff lr = lr on 1 GPU. Then it will take some work to redo the baseline with the intended lr (and ideally a sample size > 1)…
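
Something along these lines is what I mean (a sketch of the change, reusing the variables from the snippet quoted above; not an actual patch):

# divide by the batch size actually passed in, instead of the hardcoded 256,
# so that eff lr == lr on a single GPU; multi-GPU runs still scale the LR up
bs_rat = bs / bs                               # always 1.0 for a single process
if gpu is not None: bs_rat *= num_distrib()    # unchanged from the original snippet
lr *= bs_rat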

It’s fairly common to specify the LR @ a BS of 256 (or some other k) and then scale according to current runtime capabilities. In most cases it makes the results more consistent for those not fully aware of what’s going on. If you do that scaling for the user, it’s helpful to point out in the comments/help text that the LR is specified @ BS 256. I do prefer to specify and calculate the effective LR myself, though.
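
If you want to do that calculation yourself, the usual linear-scaling rule looks roughly like this (a sketch; `base_lr_at_256` is a hypothetical name for an LR quoted at BS 256):

def effective_lr(base_lr_at_256, bs, n_gpu=1, ref_bs=256):
    # linear scaling: grow the LR in proportion to the total batch size actually used
    return base_lr_at_256 * (bs * n_gpu) / ref_bs

print(effective_lr(3e-3, bs=64, n_gpu=1))   # 0.00075 -> what the current script trains with
print(effective_lr(3e-3, bs=64, n_gpu=4))   # 0.003   -> back at the quoted LR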

There are numerous other reasons why comparing results of 4 GPU vs 1 GPU training can be problematic…

Batch norm is probably the biggest one. Without synchronized batch norm, you’re using BN stats from one of the N GPUs. This isn’t necessarily a bad thing, sometimes it can be a benefit when BS is big, but it is a significant change from the single GPU case. Even if you enable synchronized BN, the synchronized stats end up a little different. I feel the performance hit of sync_bn is not worth it until you’re in the really small batch size realm.
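
For reference, in plain PyTorch (outside fastai) switching a model over to synchronized BN for DDP is typically done like this; a minimal sketch:

import torch.nn as nn

# toy model with ordinary BatchNorm layers
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())

# replace every BN layer with SyncBatchNorm, which reduces batch statistics
# across all processes; this only has an effect when running under DDP
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)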

Validation. Typically, if the validation is also being done on N GPUs with a distributed sampler, it will not be quite correct unless your dataset size % N == 0. Extra samples are inserted, and most implementations don’t bother to (or can’t easily) remove their impact from the resulting reduction at the end. I always re-run validation at the end with 1 GPU or DP instead of DDP for final comparisons.
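
A quick illustration of the padding issue with PyTorch’s DistributedSampler (a sketch with made-up sizes; no process group is needed since the replica count and rank are passed explicitly):

import torch
from torch.utils.data import TensorDataset, DistributedSampler

ds = TensorDataset(torch.arange(10))   # 10 validation samples, 4 "GPUs": 10 % 4 != 0
world = 4
for rank in range(world):
    sampler = DistributedSampler(ds, num_replicas=world, rank=rank, shuffle=False)
    print(rank, list(sampler))
# each rank gets ceil(10 / 4) = 3 indices, so 12 samples are validated in total
# and two dataset items are counted twice, slightly skewing the averaged metric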


There’s actually some evidence that batchnorm with large batch sizes decreases generalization. See https://arxiv.org/pdf/1705.08741.pdf

That paper suggests using a “ghost” batchnorm that reduces the effective batch size by applying batchnorm to “virtual” minibatches.

There’s also some empirical evidence from Myrtle.ai (who recently trained Cifar10 to 94% accuracy in 34 seconds) that large-batch batchnorm performed worse than ghost batchnorm with an effective batch size of 32.
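
For anyone who wants to try it against the baseline, here is a minimal sketch of the ghost-BN idea (my own simplification, not Myrtle’s or the paper’s code): normalize fixed-size virtual mini-batches with their own statistics.

import torch
import torch.nn as nn

class GhostBatchNorm2d(nn.Module):
    """BatchNorm applied separately to 'virtual' mini-batches of size ghost_bs."""
    def __init__(self, num_features, ghost_bs=32, **bn_kwargs):
        super().__init__()
        self.ghost_bs = ghost_bs
        self.bn = nn.BatchNorm2d(num_features, **bn_kwargs)

    def forward(self, x):
        # during training, split the (possibly large) batch into chunks and
        # normalize each chunk with its own batch statistics; note the running
        # stats are updated once per chunk, which slightly changes their momentum
        if self.training and x.size(0) > self.ghost_bs:
            return torch.cat([self.bn(c) for c in x.split(self.ghost_bs)], dim=0)
        return self.bn(x)   # eval (or already-small batch): plain BatchNorm

x = torch.randn(128, 16, 8, 8)
out = GhostBatchNorm2d(16, ghost_bs=32).train()(x)   # stats computed per chunk of 32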


I believe the correct dataset in this table is Imagenette, since woof=0 by default.


Good catch, I corrected it.


I propose we reconsider using OneCycle. RAdam and Novograd both don’t need warmup, as opposed to Adam. We can utilise this property and introduce a different learning rate policy.

I’ve used flat LR and then cosine annealing.
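
In case it helps others try the same thing, here is a minimal sketch of that schedule in plain PyTorch (not the code from my repo; the flat fraction here is arbitrary):

import math
import torch
from torch.optim.lr_scheduler import LambdaLR

def flat_then_cosine(optimizer, total_steps, flat_frac=0.7):
    """Hold the LR flat for flat_frac of training, then cosine-anneal it to zero."""
    flat_steps = int(total_steps * flat_frac)
    def lr_lambda(step):
        if step < flat_steps:
            return 1.0
        progress = (step - flat_steps) / max(1, total_steps - flat_steps)
        return 0.5 * (1 + math.cos(math.pi * progress))
    return LambdaLR(optimizer, lr_lambda)

# usage: call sched.step() once per batch
model = torch.nn.Linear(10, 2)
opt = torch.optim.Adam(model.parameters(), lr=3e-3)
sched = flat_then_cosine(opt, total_steps=5 * 100)   # e.g. 5 epochs x 100 batches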

I did a simple script to run training 20 times at 5 epochs and calculate the mean and std. So far it looks good.

updated.

https://github.com/mgrankin/over9000

Imagenette 128 scored 0.8746 over 20 runs with Over9000. That is 1.69% higher than the LB. Imagewoof did +2.89%.


I think the reason why RAdam/Ranger scored worse than Adam is the LR schedule. This new LR schedule (flat then annealing) was the first thing that came to mind; it might not be great. There should be an LR schedule for RAdam that is as good for it as OneCycle is for Adam.


Great work! Looks like we have a winner.

Can I suggest running Adam with flat and anneal?

A small detail: your xresnet [1] uses channel sizes = [c_in,32,64,64], while the one in the fastai repo [2] has sizes = [c_in,32,32,64]. (IIRC the different version comes from a fastai course notebook, which I used, and then it went to you.)
It actually might make results a tiny bit better (a rough sketch of the stem difference is below the links).

[1] https://github.com/mgrankin/over9000/blob/master/xresnet.py
[2] https://github.com/fastai/fastai/blob/master/fastai/vision/models/xresnet.py
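
To make the difference concrete, those sizes are just the widths of the three stem convs; a rough sketch (with a simplified stand-in for the actual conv_layer helper):

import torch.nn as nn

def conv_bn_relu(ci, co, stride=1):
    # simplified stand-in for the conv+BN+ReLU layer used in xresnet's stem
    return nn.Sequential(nn.Conv2d(ci, co, 3, stride, 1, bias=False),
                         nn.BatchNorm2d(co), nn.ReLU(inplace=True))

def stem(sizes):
    # first conv has stride 2, as in xresnet; sizes[0] is c_in
    return nn.Sequential(*[conv_bn_relu(sizes[i], sizes[i + 1], stride=2 if i == 0 else 1)
                           for i in range(3)])

over9000_stem = stem([3, 32, 64, 64])   # slightly wider middle conv
fastai_stem   = stem([3, 32, 32, 64])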

Oh, I’ll change that to [c_in,32,32,64]. Thank you for pointing this out.

I would leave it as [c_in,32,64,64]; you’ve done all your runs with that version so far, and it seems to do better anyway. I just wanted to make sure that detail doesn’t confuse someone later on.

To be back on the “guidelines” topic, I think @grankin’s github is a good example of what I’d like to see when someone updates the leaderboard: easy to run code, as well as a notebook with organized and detailed results. Obviously, future entrants won’t have to rerun the baseline.

One issue could be if a repo gets updated or deleted after entering the leaderboard. Maybe we need a snapshot fork of the repo and have that be the link. Or have entrants just add their stuff in a folder on the fastai repo.