Text augmentation

Rajeev21c · November 18, 2021, 1:06pm

Can anybody give some insight into the text augmentation techniques currently used in practice?

ilovescience · November 18, 2021, 10:29pm

@darek.kleczek won a Kaggle competition by adapting some computer vision augmentations to text:

Hopefully this is helpful.

Rajeev21c · November 21, 2021, 3:47am

Yes, it's creative to carry image augmentation techniques over to text.

MissoeMassa · November 21, 2021, 2:29pm

There are a couple of text augmentation techniques I have heard of (I have not implemented them myself yet):

Shuffle your text.
Train a word2vec embedding and use it to replace words in your text with near-synonyms.
Translate your text to another language and then translate it back to your original language.

Perhaps you can combine several techniques. Hopefully this helps.
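
For anyone who wants to try the shuffling and back-translation ideas, here is a rough sketch rather than a tested recipe. It assumes "shuffle" means reordering sentences, which is only one possible reading. The names `shuffle_sentences` and `back_translate` are made up for this example, and the two translation callables are placeholders you would have to supply from whatever translation model or service you use.

```python
import random
import re


def shuffle_sentences(text, seed=None):
    """Return the text with its sentences in a random order."""
    rng = random.Random(seed)
    # Naive split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)


def back_translate(text, to_pivot, from_pivot):
    """Translate to a pivot language and back again.

    to_pivot and from_pivot are hypothetical callables (str -> str)
    supplied by the caller; no real translation API is assumed here.
    """
    return from_pivot(to_pivot(text))
```

Sentence shuffling only makes sense when the label does not depend on sentence order, so it is worth checking a few augmented samples by hand before training on them.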

msivanes · November 23, 2021, 11:24pm

Rajeev21c · November 30, 2021, 1:11pm

Those were helpful!

Chapel · December 7, 2021, 1:09pm

Replace a few words with their synonyms.
Replace a few words with words whose embeddings (such as word2vec or GloVe) are close to theirs by cosine similarity.
Replace words based on their context using a masked language model such as BERT.
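
A minimal sketch of the embedding-based replacement, in case it helps. The vectors in `TOY_VECTORS` are invented for illustration; in practice you would load trained word2vec or GloVe vectors instead. The function names and the replacement probability `p` are likewise assumptions of this example, not part of any library.

```python
import math
import random

# Toy 3-dimensional vectors, invented for illustration only.
TOY_VECTORS = {
    "good": (0.9, 0.1, 0.0),
    "great": (0.85, 0.15, 0.05),
    "bad": (-0.9, 0.1, 0.0),
    "movie": (0.0, 0.9, 0.3),
    "film": (0.05, 0.85, 0.35),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_word(word, vectors):
    """Return the other word with the highest cosine similarity, or None."""
    if word not in vectors:
        return None
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    return max(scored)[1] if scored else None


def replace_similar(text, vectors, p=0.3, seed=None):
    """Replace each known token by its nearest neighbour with probability p."""
    rng = random.Random(seed)
    out = []
    for token in text.split():
        neighbour = nearest_word(token.lower(), vectors)
        if neighbour is not None and rng.random() < p:
            out.append(neighbour)
        else:
            out.append(token)
    return " ".join(out)
```

With these toy vectors, `replace_similar("good movie", TOY_VECTORS, p=1.0)` gives `"great film"`. Contextual replacement with a masked language model follows the same pattern, except that the candidates come from the model's predictions for a masked position rather than from a nearest-neighbour lookup.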