I’ve been reading about and testing mixup this evening but am a bit confused. The example in the fastai documentation trains a model with and without mixup and compares the results. Using mixup seems to make the loss larger and the accuracy lower for the same number of epochs. It does the same on my image dataset. But the paper shows otherwise. Do I have to change other regularisation, like lowering dropout and weight decay, to get the benefit?
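
For reference, this is roughly what I understand mixup to do to a batch. It is only a NumPy sketch, not the fastai implementation: the function name `mixup_batch`, the `alpha=0.4` value, and drawing a single mixing weight per batch are my own assumptions for illustration, and a library may sample the weight differently.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.4, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    Hypothetical helper for illustration; alpha=0.4 is an assumed value.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, alpha))   # mixing weight in [0, 1]
    perm = rng.permutation(len(x))        # partner index for every example
    # Inputs and one-hot targets are mixed with the same weight.
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed, lam

# Toy batch: 8 random "images" with 10 classes.
rng = np.random.default_rng(0)
x = rng.random((8, 3, 32, 32))
y = np.eye(10)[rng.integers(0, 10, size=8)]
x_mixed, y_mixed, lam = mixup_batch(x, y, alpha=0.4, rng=rng)
```

Since the targets become soft labels, the training loss is computed against mixed targets, so I am not sure it is directly comparable to the loss from a run without mixup.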