Data augmentation for tabular data

What are some of the effective data augmentation techniques for non-image data?
I'm trying to do semi-supervised learning on non-image data with MixMatch- and FixMatch-like techniques, which all require multiple ways of augmenting the data. For tabular data whose columns have no individual meaning (like embeddings), is there any way to augment other than adding Gaussian noise?

Thanks!

There are several I have used with tabular models in fast.ai; check out the SMOTE library.
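For anyone unfamiliar with SMOTE, here is a minimal NumPy sketch of the core idea: pick a row, pick one of its nearest neighbours, and create a new row somewhere on the line between them. This is a simplified illustration, not the API of any SMOTE package. The function name `smote_like` and its parameters are made up for this example, and it assumes `X` holds at least two rows from a single class with purely numeric columns.

```python
import numpy as np

def smote_like(X, n_new, k=5, rng=None):
    """Create n_new synthetic rows by interpolating between each sampled
    row and one of its k nearest neighbours (SMOTE-style, simplified)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    k = min(k, n - 1)  # cannot have more neighbours than other rows

    # Pairwise squared Euclidean distances; a row is not its own neighbour.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1)[:, :k]  # shape (n, k)

    base = rng.integers(0, n, size=n_new)      # rows to start from
    pick = rng.integers(0, k, size=n_new)      # which neighbour to use
    nb = neighbors[base, pick]
    lam = rng.random((n_new, 1))               # interpolation weight in [0, 1)
    return X[base] + lam * (X[nb] - X[base])
```

The pairwise distance matrix is quadratic in the number of rows, so this is only sensible for small data; a real implementation would use a proper nearest-neighbour index.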

@stella, thank you for posting this topic.

I have tried swap noise and MixMatch, but found that they did not help much. Again, I think it all depends on the data you are applying them to.

In the end I settled on adding Gaussian noise. I'm really keen to hear from others about what worked and what did not.
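For readers who have not come across swap noise: each cell is replaced, with some probability, by the value found in the same column of a randomly chosen row, so every column keeps its own marginal distribution. A rough NumPy sketch follows. The name `swap_noise` and the default `swap_prob=0.15` are assumptions for illustration, not the settings used in this thread.

```python
import numpy as np

def swap_noise(X, swap_prob=0.15, rng=None):
    """With probability swap_prob, replace each cell by the value from the
    same column of a randomly chosen row."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X)
    n_rows, n_cols = X.shape

    # Which cells get swapped.
    mask = rng.random(X.shape) < swap_prob
    # For every cell, a random donor row; the column stays the same.
    donor_rows = rng.integers(0, n_rows, size=X.shape)
    col_idx = np.broadcast_to(np.arange(n_cols), X.shape)
    swapped = X[donor_rows, col_idx]

    return np.where(mask, swapped, X)
```

Because values only move within a column, this works for categorical codes as well as continuous features.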
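A minimal sketch of Gaussian noise augmentation, in case it helps others compare. The thread does not say how the noise was scaled, so scaling by each column's standard deviation and the default `scale=0.05` are assumptions made here, and `gaussian_noise` is just an illustrative name.

```python
import numpy as np

def gaussian_noise(X, scale=0.05, rng=None):
    """Add zero-mean Gaussian noise to every cell, with a standard
    deviation of `scale` times that column's standard deviation."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)

    # Per-column spread, so wide and narrow features are perturbed alike.
    col_std = X.std(axis=0, keepdims=True)
    noise = rng.normal(0.0, 1.0, size=X.shape) * scale * col_std
    return X + noise
```

Calling it with a small and a larger `scale` is one possible way to produce the weak and strong views that FixMatch-style training asks for, though whether that helps will depend on the data.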

@harikrishnanrajeev, how much did adding Gaussian noise improve your metric compared with the non-augmented dataset?

Compared with the non-augmented dataset, Gaussian noise augmentation added 3 to 3.5 points of accuracy.