Text augmentation

Can anybody give some insight into the text augmentation techniques currently used in practice?

@darek.kleczek won a Kaggle competition by adapting some computer vision augmentations to text:

Hopefully this is helpful.


Yes, it's creative to carry image augmentation techniques over to text :smiley:

There are a couple of text augmentation techniques I have heard of (I have not implemented them myself yet):

  • Shuffle your text, e.g. the order of its sentences or words.
  • Train a word2vec embedding, and use this to alter your text with synonyms.
  • Translate your text to another language and then translate it back to your original language.
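
For what it's worth, here is a rough Python sketch of the three ideas above. It is only an illustration under some assumptions: TOY_SYNONYMS is a made-up table standing in for nearest neighbours from a trained word2vec model, and to_pivot / from_pivot are placeholders for whatever translation functions you have available (no translation model is included). All function names are my own.

```python
import random
import re

# Hypothetical toy synonym table; in practice this could be built from
# the nearest neighbours of each word in a trained word2vec model.
TOY_SYNONYMS = {
    "good": ["great", "fine"],
    "movie": ["film"],
}

def shuffle_sentences(text, rng):
    """Return the text with its sentences in a random order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)

def replace_synonyms(text, synonyms, rng, prob=0.5):
    """Swap each known word for a random synonym with probability `prob`.

    Words are split on whitespace, so a word with punctuation attached
    (e.g. "good.") is not looked up and stays unchanged.
    """
    out = []
    for word in text.split():
        key = word.lower()
        if key in synonyms and rng.random() < prob:
            out.append(rng.choice(synonyms[key]))
        else:
            out.append(word)
    return " ".join(out)

def back_translate(text, to_pivot, from_pivot):
    """Round-trip the text through a pivot language.

    `to_pivot` and `from_pivot` are placeholder callables (assumptions):
    plug in your own translation model or API here.
    """
    return from_pivot(to_pivot(text))

rng = random.Random(0)  # fixed seed so runs are repeatable
sample = "The movie was good. I would watch it again."
augmented = replace_synonyms(shuffle_sentences(sample, rng), TOY_SYNONYMS, rng)
print(augmented)
```

Passing the random.Random instance in explicitly keeps the augmentations reproducible, which is handy when you want to compare training runs.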

Perhaps you can combine several techniques. Hopefully this helps :slight_smile:

This paper provides a good survey of data augmentation techniques:
Data Augmentation Approaches in Natural Language Processing: A Survey