I am currently working on the problem of finding optimal hyperparams for data augmentation. More concretely, the maximum amount by which an image can be translated, rotated, sheared, etc., to squeeze the maximum accuracy out of the model. Can anyone point to some papers/blog posts that address this problem?
I haven’t seen any - it would be a great topic to write about, I think. I find that trying different augmentation amounts with TTA is a good way to find what works best.
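To make that concrete, here is a toy sketch of the idea: evaluate validation accuracy with TTA at several augmentation magnitudes and see which works best. The `model` and `rotate` functions below are dependency-free placeholders (not fastai's actual TTA API), just to show the shape of the loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(batch):
    # placeholder classifier: predicts class 1 if the mean pixel > 0.5
    return (batch.mean(axis=(1, 2)) > 0.5).astype(int)

def rotate(batch, max_deg, rng):
    # stand-in "rotation": a perturbation scaled by max_deg, so the
    # sketch stays dependency-free; a real version would warp the image
    return batch + rng.normal(0, max_deg / 90, batch.shape)

x_val = rng.random((64, 8, 8))
y_val = (x_val.mean(axis=(1, 2)) > 0.5).astype(int)

def tta_accuracy(max_deg, n_aug=4):
    # average predictions over n_aug augmented copies of the val set
    preds = np.stack([model(rotate(x_val, max_deg, rng)) for _ in range(n_aug)])
    avg = preds.mean(axis=0).round().astype(int)
    return (avg == y_val).mean()

for max_deg in (0, 10, 30, 90):
    print(max_deg, tta_accuracy(max_deg))
```

The magnitude where TTA accuracy peaks is then a reasonable candidate for the training-time augmentation range.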
Do you mean that you use TTA to find good augmentation transformations (and their hyperparameters) and then you add these to data augmentation during training?
I am sorry. I am not aware of the meaning of TTA. Or maybe it is something I forgot.
Watch Lesson 2 til the end
Ohh okay. I missed the lesson yesterday because I had my exam today
@jeremy: This is a very important problem for the industry, right? And I believe this is indeed something that can make a big impact. In medicine, for example, an increase of even 0.1% in a model's accuracy can save lives. Similarly, in fraud detection this can translate to savings worth many millions of dollars.
From what I have observed, many problems that are quite relevant to the industry are almost completely ignored by academic research. Another example is the problem of handling imbalanced datasets, which I assume is very common in the industry. I haven’t seen much published literature on that either.
Why do you think so many of the important industry problems are ignored by academia? I am asking because I am applying for PhD programs this year, and I want to keep the prime focus of my PhD on solving such important problems.
I think it’s a combination of a misalignment of incentives (academics’ incentives are citations, not industry impact) and a lack of understanding of real-world issues in industry. It’s a big problem.
Have you seen this one? “Smart Augmentation: Learning an Optimal Data Augmentation Strategy”
I haven’t studied it yet - just stumbled upon it while researching the general subject of data augmentation using GANs.
Looks interesting! I’ve bookmarked it to read later.
And this paper from cs231n, “The Effectiveness of Data Augmentation in Image Classification using Deep Learning”, seems to deal with the subject of this thread.
In fact, I see a lot of recent papers dealing with data augmentation in one way or another.
I spend my non-sleeping time thinking about and tweaking the code for generating synthetic data for training (during my, albeit short, sleeping hours the GPU does the training, with freshly baked results ready just in time for the morning coffee). The more I dig into this, the more it seems like training on synthetic data has its own rules. For example, it hates monotony: change the parameters once in a while to twist the data, and you get a spike in accuracy on the validation set. And because it’s so cheap to generate training data, you can actually tune it for your specific use case - no need to be too generic.
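The "twist the data once in a while" idea can be sketched roughly like this. The parameter names and ranges below are purely illustrative (not from any library): re-draw the generation/augmentation parameters every few epochs instead of fixing them for the whole run.

```python
import random

def sample_aug_params(rng):
    # illustrative ranges; tune for your own synthetic-data pipeline
    return {
        "max_rotate": rng.uniform(5, 30),        # degrees
        "max_translate": rng.uniform(0.0, 0.2),  # fraction of image size
        "max_shear": rng.uniform(0.0, 0.15),
    }

rng = random.Random(42)
params = sample_aug_params(rng)
for epoch in range(12):
    if epoch % 3 == 0:
        # twist the data once in a while instead of keeping it monotonous
        params = sample_aug_params(rng)
    # train_one_epoch(model, synthetic_loader(params))  # placeholder hook
    print(epoch, {k: round(v, 3) for k, v in params.items()})
```

The re-draw interval (every 3 epochs here) is itself a knob worth experimenting with.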
These are important insights @helena, and not well studied or documented in the literature. If you keep changing things around in your training, it really helps the generalization of the model.
I’ve stumbled upon another recent paper about learning data augmentation
The authors call their neural network DAGAN. They train a GAN whose generator takes as input an image from a particular class plus random noise, and generates similar images for that class. However, as I understand their results, the improvements are most significant when there are few images in the dataset.
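As I read it, the generator is roughly "encode the source image, concatenate noise, decode a new same-class image". Here is a toy numpy paraphrase of just that data flow - the real DAGAN uses convolutional encoder/decoder networks trained adversarially, whereas the linear maps here are untrained placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
img_dim, code_dim, noise_dim = 64, 16, 8

# untrained placeholder weights standing in for the encoder/decoder nets
W_enc = rng.normal(size=(img_dim, code_dim)) * 0.1
W_dec = rng.normal(size=(code_dim + noise_dim, img_dim)) * 0.1

def generate(image):
    code = np.tanh(image @ W_enc)      # encode the source image
    z = rng.normal(size=noise_dim)     # fresh noise for each sample
    return np.tanh(np.concatenate([code, z]) @ W_dec)

x = rng.random(img_dim)                # one "real" image (flattened)
augmented = [generate(x) for _ in range(3)]  # three new same-class samples
```

Because the noise is re-drawn per call, each generated sample is a different variation of the same source image, which is what makes this usable as learned augmentation.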
It seems to me that the idea has a lot of potential, and it’s almost unexplored.
+1 on implementing this!