Data augmentation/batch norm on a non-image dataset


I have a dataset of about 10000 samples in 200 classes, and each sample is a 7000x4 matrix. These are not from images. Is it possible to do data augmentation and batch normalization on data of this type? If so, any insight on how to go about this would be greatly appreciated.

Hi @randy912, you can use batchnorm. One advantage of batchnorm is that training usually converges faster. If the data is not images, though, augmentation may not make sense.
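To make the batchnorm suggestion concrete, here is a minimal NumPy sketch of the training-time computation. It assumes each sample is stored as (7000 time steps, 4 channels) and that statistics are computed per channel; the function name `batch_norm` and the batch size of 8 are only illustrative, and the running averages a real layer keeps for inference are left out.

```python
import numpy as np

def batch_norm(x, gamma=None, beta=None, eps=1e-5):
    """Training-mode batch norm for x of shape (batch, time, channels).

    Assumption: one mean/variance per channel, pooled over the batch
    and time axes.
    """
    n_channels = x.shape[-1]
    if gamma is None:
        gamma = np.ones(n_channels)   # learnable scale, initialised to 1
    if beta is None:
        beta = np.zeros(n_channels)   # learnable shift, initialised to 0

    # Per-channel statistics over the batch and time axes.
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)

    # Normalise, then apply the scale and shift.
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Synthetic stand-in for a mini-batch of 8 samples of shape 7000x4.
rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(8, 7000, 4))
normed = batch_norm(batch)
```

In practice you would use the batchnorm layer of your framework rather than writing this by hand; the sketch only shows that nothing in the computation is specific to images.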

Thank you for the reply, @rteja1113!

IMHO, you can still rotate, flip, or shear the matrix as augmentation, as long as doing so “makes sense”. It depends on what is in your matrices.

For example, if the samples are recordings of people’s voices and you are using a model to recognize the words they are saying, then it is okay to speed up/slow down, change pitch, add a little noise, etc.
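As a rough sketch of two of these ideas (adding noise and changing speed), the NumPy code below works on a single sample assumed to be laid out as (time steps, channels). The helper names `add_noise` and `change_speed`, the noise level, and the speed factor are all hypothetical choices, not tuned values.

```python
import numpy as np

def add_noise(x, sigma, rng):
    """Return a copy of x with zero-mean Gaussian noise of std sigma added."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def change_speed(x, factor):
    """Resample x of shape (time, channels) along the time axis.

    factor > 1 speeds the signal up (fewer time steps),
    factor < 1 slows it down (more time steps).
    """
    n = x.shape[0]
    new_n = max(2, int(round(n / factor)))
    old_t = np.arange(n)
    new_t = np.linspace(0, n - 1, new_n)
    # Linear interpolation, one channel at a time.
    return np.stack(
        [np.interp(new_t, old_t, x[:, c]) for c in range(x.shape[1])],
        axis=1,
    )

rng = np.random.default_rng(0)
sample = rng.normal(size=(7000, 4))      # synthetic stand-in for one sample

noisy = add_noise(sample, sigma=0.05, rng=rng)
faster = change_speed(sample, factor=1.25)
```

Note that `change_speed` changes the number of time steps, so the result would need to be cropped or padded back to a fixed length before going into a model that expects 7000 steps. Whether either transform preserves the class label depends on what the signals represent.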


Thanks for the reply, @shushi2000. The data is from time series (each row a sample, each column a time point).

Yeah, totally agree with @shushi2000. In the current Quora duplicate question detection competition on Kaggle, people used augmentation.
The format of the data is:
question_1, question_2, is_duplicate

If question_1 is a duplicate (or not a duplicate) of question_2, then question_2 is likewise a duplicate (or not a duplicate) of question_1, so swapping the two questions gives a new valid sample. This kind of augmentation is analogous to image flipping.
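A minimal sketch of this symmetric-pair augmentation in plain Python, assuming the rows are already loaded as (question_1, question_2, is_duplicate) tuples. The function name `augment_symmetric_pairs` and the example questions are made up for illustration.

```python
def augment_symmetric_pairs(rows):
    """Add the swapped version of every (question_1, question_2, label) row.

    The label is kept unchanged because "is a duplicate of" is symmetric.
    Swapped pairs that already exist in the data are not added again.
    """
    augmented = list(rows)
    seen = {(q1, q2) for q1, q2, _ in rows}
    for q1, q2, label in rows:
        if (q2, q1) not in seen:
            augmented.append((q2, q1, label))
            seen.add((q2, q1))
    return augmented

# Hypothetical example rows, not taken from the actual competition data.
rows = [
    ("How do I learn Python?", "What is the best way to learn Python?", 1),
    ("How do I learn Python?", "How do I cook rice?", 0),
]
augmented = augment_symmetric_pairs(rows)
```

It is probably safest to apply this to the training split only, after the train/validation split has been made, so that a pair and its swapped copy do not end up on opposite sides of the split.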