I am currently working on the problem of finding optimal hyperparams for data augmentation. More concretely, the maximum amount by which an image can be translated, rotated, sheared, etc., to squeeze the maximum accuracy out of the model. Can anyone point to some papers/blog posts that address this problem?
I haven’t seen any - it would be a great topic to write about, I think. I find that trying different augmentation amounts with TTA is a good way to find what works best.
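To make that concrete, here is a toy sketch of the idea: evaluate validation accuracy with TTA at several augmentation magnitudes and see which works best. The `model` and `rotate` functions below are dependency-free placeholders (not fastai's actual TTA API), just to show the shape of the loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(batch):
    # placeholder classifier: predicts class 1 if the mean pixel > 0.5
    return (batch.mean(axis=(1, 2)) > 0.5).astype(int)

def rotate(batch, max_deg, rng):
    # stand-in "rotation": a perturbation scaled by max_deg, so the
    # sketch stays dependency-free; a real version would warp the image
    return batch + rng.normal(0, max_deg / 90, batch.shape)

x_val = rng.random((64, 8, 8))
y_val = (x_val.mean(axis=(1, 2)) > 0.5).astype(int)

def tta_accuracy(max_deg, n_aug=4):
    # average predictions over n_aug augmented copies of the val set
    preds = np.stack([model(rotate(x_val, max_deg, rng)) for _ in range(n_aug)])
    avg = preds.mean(axis=0).round().astype(int)
    return (avg == y_val).mean()

for max_deg in (0, 10, 30, 90):
    print(max_deg, tta_accuracy(max_deg))
```

The magnitude where TTA accuracy peaks is then a reasonable candidate for the training-time augmentation range.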
Do you mean that you use TTA to find good augmentation transformations (and their hyperparameters) and then you add these to data augmentation during training?
I am sorry. I am not aware of the meaning of TTA. Or maybe it is something I forgot.
Watch Lesson 2 til the end
Ohh okay. I missed the lesson yesterday because I had my exam today
@jeremy: This is a very important problem for the industry, right? And I believe this is indeed something that can make a big impact. In medicine, for example, an increase of even 0.1% in a model's accuracy can save lives. Similarly, in fraud detection this can translate to savings worth many millions of dollars.
From what I have observed, many problems that are quite relevant to the industry are almost completely ignored by academic research. Another example is the problem of handling imbalanced datasets, which I assume is very common in the industry. I haven’t seen much published literature on that either.
Why do you think so many of the important industry problems are ignored by academia? I am asking because I am applying for PhD programs this year, and I want to keep the prime focus of my PhD on solving such important problems.
I think it’s a combination of a misalignment of incentives (academics’ incentives are citations, not industry impact) and a lack of understanding of real-world issues in industry. It’s a big problem.
Have you seen this one? “Smart Augmentation: Learning an Optimal Data Augmentation Strategy”
I haven’t studied it yet - just stumbled upon it while researching the general subject of data augmentation using GANs.
Looks interesting! I’ve bookmarked it to read later.
And this paper from cs231n, “The Effectiveness of Data Augmentation in Image Classification using Deep Learning”, seems to deal with the subject of this thread.
In fact, I see a lot of recent papers dealing with data augmentation in one way or another.
I spend my non-sleeping time thinking about and tweaking the code for generating synthetic data for training (during my, albeit short, sleeping hours the GPU does the training, with freshly baked results ready just in time for the morning coffee). The more I dig into this, the more it seems like training on synthetic data has its own rules. For example, it hates monotony: change the parameters once in a while to twist the data, and you get a spike in accuracy on the validation set. And because it’s so cheap to generate training data, you can actually tune it for your specific use case - no need to be too generic.
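The "twist the data once in a while" idea can be sketched roughly like this. The parameter names and ranges below are purely illustrative (not from any library): re-draw the generation/augmentation parameters every few epochs instead of fixing them for the whole run.

```python
import random

def sample_aug_params(rng):
    # illustrative ranges; tune for your own synthetic-data pipeline
    return {
        "max_rotate": rng.uniform(5, 30),        # degrees
        "max_translate": rng.uniform(0.0, 0.2),  # fraction of image size
        "max_shear": rng.uniform(0.0, 0.15),
    }

rng = random.Random(42)
params = sample_aug_params(rng)
for epoch in range(12):
    if epoch % 3 == 0:
        # twist the data once in a while instead of keeping it monotonous
        params = sample_aug_params(rng)
    # train_one_epoch(model, synthetic_loader(params))  # placeholder hook
    print(epoch, {k: round(v, 3) for k, v in params.items()})
```

The re-draw interval (every 3 epochs here) is itself a knob worth experimenting with.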
These are important insights @helena, and not well studied or documented in the literature. If you keep changing things around in your training, it really helps the generalization of the model.
I’ve stumbled upon another recent paper about learning data augmentation
The authors call their neural network DAGAN. They train a GAN whose generator takes as input an image from a particular class plus random noise, and generates similar images for that class. However, as I understand their results, the improvements are most significant when there are few images in the dataset.
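As I read it, the generator is roughly "encode the source image, concatenate noise, decode a new same-class image". Here is a toy numpy paraphrase of just that data flow - the real DAGAN uses convolutional encoder/decoder networks trained adversarially, whereas the linear maps here are untrained placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
img_dim, code_dim, noise_dim = 64, 16, 8

# untrained placeholder weights standing in for the encoder/decoder nets
W_enc = rng.normal(size=(img_dim, code_dim)) * 0.1
W_dec = rng.normal(size=(code_dim + noise_dim, img_dim)) * 0.1

def generate(image):
    code = np.tanh(image @ W_enc)      # encode the source image
    z = rng.normal(size=noise_dim)     # fresh noise for each sample
    return np.tanh(np.concatenate([code, z]) @ W_dec)

x = rng.random(img_dim)                # one "real" image (flattened)
augmented = [generate(x) for _ in range(3)]  # three new same-class samples
```

Because the noise is re-drawn per call, each generated sample is a different variation of the same source image, which is what makes this usable as learned augmentation.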
It seems to me that the idea has a lot of potential, and it’s almost unexplored.
+1 on implementing this!