Mixup data augmentation

Here is my fastai V1 implementation and a small demo notebook.

You can expect manifold mixup and output mixup to be slower and to consume a bit more memory (this was due to needing two forward passes, which is not the case anymore thanks to @MicPie), but they let you use a larger learning rate and might be worth it overall.
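For reference, the trick that avoids the second forward pass is to mix each batch of hidden states with a shuffled copy of itself. Here is a minimal PyTorch sketch of that idea (the function name and signature are mine for illustration, not taken from the actual implementation):

```python
import torch
import numpy as np

def mixup_hidden_batch(h, y, alpha=0.4):
    """Mix a batch of hidden states (manifold mixup) with a shuffled
    copy of itself -- one forward pass instead of two.

    Returns the mixed activations, both label sets, and the mixing
    coefficient, so the loss can be computed as:
        lam * loss(out, y) + (1 - lam) * loss(out, y_perm)
    """
    lam = float(np.random.beta(alpha, alpha))        # mixing coefficient
    perm = torch.randperm(h.size(0), device=h.device)
    h_mixed = lam * h + (1 - lam) * h[perm]          # interpolate activations
    return h_mixed, y, y[perm], lam
```

Applied at the input layer this reduces to input mixup; applied to the last hidden layer it gives output mixup.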

On the tiny benchmark, manifold mixup is not noticeably better than input mixup (though the author says the benefits appear after a large number of epochs), but I observed a nice speed-up in convergence with output mixup. Now we need to validate that on a larger dataset.

I will test it on a private dataset this week, but I would be happy to see an outside benchmark comparing no mixup, input mixup, manifold mixup, and output mixup (@LessW2020?).
