In the satellite images problem in lesson 3, Jeremy starts off training on size 64 images, then size 128, and finally size 256. I don’t understand the theory behind this — is there a source or explanation for why this method works?
The VGG paper (https://arxiv.org/pdf/1409.1556v6.pdf) uses multi-scale training, randomly sampling the training scale between sz=256 and sz=512, which leads to a performance improvement. That was the closest thing I found.
Delving Deep into Rectifiers (https://arxiv.org/pdf/1502.01852.pdf) applies the same technique and also reports a performance gain.
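To make the two schedules concrete, here's a minimal pure-Python sketch (function names are my own, not from the lesson or the papers): one generator producing the progressive-resizing plan from the lesson (64 → 128 → 256), and one function doing VGG-style scale jittering, where the training scale is sampled uniformly per batch.

```python
import random

def progressive_sizes(schedule):
    """Yield (epoch, size) pairs for a progressive-resizing schedule.

    `schedule` is a list of (size, n_epochs) pairs: train n_epochs
    at each size before moving to the next, larger one.
    """
    epoch = 0
    for size, n_epochs in schedule:
        for _ in range(n_epochs):
            yield epoch, size
            epoch += 1

def vgg_multiscale_size(s_min=256, s_max=512, rng=random):
    """VGG-style multi-scale training: sample the training scale S
    uniformly from [s_min, s_max] (a fresh draw per batch)."""
    return rng.randint(s_min, s_max)

# Progressive resizing as in the lesson: two epochs per size, say.
plan = list(progressive_sizes([(64, 2), (128, 2), (256, 2)]))
print(plan)  # sizes only ever grow: 64, 64, 128, 128, 256, 256

# VGG-style jittering: every draw lands in [256, 512].
sizes = [vgg_multiscale_size() for _ in range(5)]
print(sizes)
```

The intuition in both cases is the same: small images make early epochs cheap and act as a form of augmentation, while later large-image epochs let the network refine fine-grained features; the progressive version simply orders the scales from small to large instead of sampling them randomly.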