Can anybody give some insight into the text augmentation techniques currently used in practice?
@darek.kleczek won a Kaggle competition by adapting some computer vision augmentations to text:
Hopefully this is helpful.
Yes, it's creative to carry image augmentation techniques over to text.
There are a couple of text augmentation techniques I have heard of (I have not implemented them myself yet):
- Shuffle your text
- Train a word2vec embedding and use it to replace words in your text with synonyms.
- Translate your text to another language and then translate it back to your original language.
Perhaps you can combine several techniques. Hopefully this helps.
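For anyone who wants to try the three ideas above, here is a minimal sketch using only the Python standard library. It is not taken from any of the linked material. The function names, the `synonyms` dictionary (a stand-in for nearest neighbours you would look up in a trained word2vec model), and the `to_pivot` / `from_pivot` callables (stand-ins for whatever translation model or service you use) are all assumptions made for illustration.

```python
import random
import re

# Sentence boundary: whitespace that follows '.', '!' or '?'.
# This is a rough heuristic, not a real sentence tokenizer.
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def shuffle_sentences(text, rng):
    """Return the text with its sentences in a random order."""
    sentences = [s for s in SENTENCE_SPLIT.split(text.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)


def replace_synonyms(text, synonyms, rng, p=0.3):
    """Replace each word found in `synonyms` with probability p.

    `synonyms` is a hypothetical dict mapping a lowercase word to a list
    of alternatives, e.g. built from word2vec nearest neighbours.
    Words are matched on whitespace tokens only, so "good." (with
    punctuation attached) will not match the key "good".
    """
    out = []
    for word in text.split():
        options = synonyms.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)


def back_translate(text, to_pivot, from_pivot):
    """Translate to a pivot language and back.

    `to_pivot` and `from_pivot` are hypothetical callables (str -> str)
    that you supply; no translation library is assumed here.
    """
    return from_pivot(to_pivot(text))


def augment(text, synonyms, seed=0):
    """Combine two techniques: shuffle sentences, then swap synonyms."""
    rng = random.Random(seed)
    return replace_synonyms(shuffle_sentences(text, rng), synonyms, rng)
```

Calling `augment(text, synonyms, seed=i)` with several different seeds gives several variants of the same example. Whether shuffling is safe depends on the task: it may be fine for topic classification but can hurt when sentence order carries meaning.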
This paper is a good survey of data augmentation:
Data Augmentation Approaches in Natural Language Processing: A Survey