Audio classification using 10-second audio clips

Hi,
I have a dataset with 17 classes, split into train, test, and dev subfolders.
I used VAD to extract 10-second segments with 50% overlap. However, the number of files reached 231k in train, 14k in test, and 3k in dev.
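
To make the segmentation concrete, here is a minimal sketch of how 10-second windows with 50% overlap can be laid out over a recording. The 16 kHz sample rate and the function name `window_bounds` are assumptions for illustration only, and the VAD step itself is left out, so this is not necessarily the exact pipeline used.

```python
def window_bounds(num_samples, sample_rate=16000, window_s=10.0, overlap=0.5):
    """Return (start, end) sample indices of fixed-length, overlapping windows.

    sample_rate=16000 is an assumed value; use the real rate of the audio.
    Trailing audio shorter than one full window is dropped.
    """
    win = int(window_s * sample_rate)      # window length in samples
    hop = int(win * (1.0 - overlap))       # step between window starts
    if win <= 0 or hop <= 0:
        raise ValueError("window length and hop must be positive")

    bounds = []
    start = 0
    while start + win <= num_samples:
        bounds.append((start, start + win))
        start += hop
    return bounds


# A 30-second recording at the assumed 16 kHz sample rate.
segments = window_bounds(30 * 16000)
```

With 50% overlap the hop is half the window, so each recording yields roughly twice as many segments as non-overlapping windows would, which is part of why the file count grows so quickly.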

I am trying to train a model with a 1D CNN or another CNN variant.

Any recommendations on how to approach this? I have Google Colab Pro+, so I am not worried about hardware.
