Very exciting new paper, “Unsupervised Data Augmentation,” just out today, if the claims are accurate.
Works with all types of data including text and images.
On the IMDb text classification dataset, with only 20 labeled examples, UDA outperforms the state-of-the-art model trained on 25,000 labeled examples. On standard semi-supervised learning benchmarks, CIFAR-10 with 4,000 labeled examples and SVHN with 1,000 labeled examples, UDA outperforms all previous approaches and reduces the error rates of state-of-the-art methods by more than 30%.
Just want to add to this one that the concept introduced here of Training Signal Annealing (TSA) is also a really interesting idea: it effectively removes the contribution to the gradients of labeled examples that the model already classifies correctly with confidence above some threshold, and this threshold gradually increases from 1/num_classes to 1 over the course of training (see the sketch below).
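Here's a minimal sketch of how TSA masking could look, assuming a PyTorch-style training loop; the function names (`tsa_threshold`, `supervised_loss_with_tsa`) and the exact schedule formulas are my own illustration of the idea, not the paper's code:

```python
import math
import torch
import torch.nn.functional as F

def tsa_threshold(step, total_steps, num_classes, schedule="linear"):
    """Threshold that grows from 1/num_classes toward 1 as training progresses."""
    progress = step / total_steps
    if schedule == "linear":
        alpha = progress
    elif schedule == "exp":
        alpha = math.exp((progress - 1.0) * 5)
    else:  # "log"
        alpha = 1.0 - math.exp(-progress * 5)
    start = 1.0 / num_classes
    return start + alpha * (1.0 - start)

def supervised_loss_with_tsa(logits, labels, step, total_steps):
    """Cross-entropy on labeled examples, masking out examples the model
    already predicts correctly with confidence above the current threshold,
    so they stop contributing to the gradient."""
    num_classes = logits.size(-1)
    probs = F.softmax(logits, dim=-1)
    correct_prob = probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    threshold = tsa_threshold(step, total_steps, num_classes)
    mask = (correct_prob < threshold).float().detach()
    per_example_loss = F.cross_entropy(logits, labels, reduction="none")
    return (per_example_loss * mask).sum() / mask.sum().clamp(min=1.0)
```

The intuition is that with very few labeled examples the model would otherwise overfit them quickly, so the supervised signal is "released" slowly while the unsupervised consistency loss keeps training on the unlabeled data.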