Share your work here ✅

bachir · October 30, 2018, 11:26am

I had bad luck with birds and flowers, now I tried sounds and the results seems quite promising. I trained a classifier on spectrogram images generated from audio files that I downloaded from this Kaggle competition.
With a ResNet-34 and 4 epochs:

Total time: 36:42
epoch  train_loss  valid_loss  error_rate
1      2.823842    1.935167    0.541053    (27:39)
2      1.968809    1.414007    0.408421    (03:00)
3      1.570557    1.216676    0.344211    (03:01)
4      1.380666    1.171882    0.330526    (03:01)

The top losses are

The confusion matrix looks OK

After unfreezing and choosing a good slice of learning rates, I got even better results:

Total time: 25:07
epoch  train_loss  valid_loss  error_rate
1      1.272060    1.071349    0.293684    (03:08)
2      1.148295    0.994182    0.280526    (03:09)
3      1.040785    0.941049    0.264737    (03:08)
4      0.834645    0.837393    0.224737    (03:08)
5      0.664606    0.752477    0.205789    (03:08)
6      0.499639    0.716157    0.198421    (03:08)
7      0.399242    0.692799    0.188421    (03:07)
8      0.339768    0.671222    0.184737    (03:08)

Jupyter notebook - link