Thank you @rsomani95 and @jeremy for your replies.
I tried my spectrogram data in two formats: 224×224 square and 100×177 rectangular.
My results for 224×224 seem to be better.
Initial resnet34 (pretty bad):
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 0.647652 | 0.621205 | 0.394737 |
| 2 | 0.563093 | 1.762814 | 0.552632 |
| 3 | 0.427048 | 1.189877 | 0.473684 |
After unfreezing and fitting again, it gets better:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 0.534642 | 0.357039 | 0.210526 |
Resnet50 gave me the best results so far:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 0.827895 | 0.735053 | 0.473684 |
| 2 | 0.567242 | 0.594174 | 0.315789 |
| 3 | 0.427497 | 0.566335 | 0.289474 |
| 4 | 0.349678 | 0.458205 | 0.157895 |
| 5 | 0.299800 | 0.379624 | 0.131579 |
| 6 | 0.257279 | 0.368416 | 0.157895 |
| 7 | 0.235691 | 0.364216 | 0.105263 |
| 8 | 0.212819 | 0.347390 | 0.105263 |
Unfreezing and fitting again makes it worse:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 1.191806 | 12.630829 | 0.500000 |
| 2 | 1.661533 | 30.858454 | 0.473684 |
| 3 | 1.424686 | 14.588722 | 0.473684 |
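The pattern in this last table (valid_loss exploding to 12–30 while error_rate sits near chance) usually means the learning rate after unfreezing is too high for the pretrained early layers. The standard fastai advice is to run `lr_find` again after `unfreeze` and pass a range of learning rates, e.g. `fit_one_cycle(3, max_lr=slice(1e-6, 1e-4))`, so early layers get much smaller updates than the head. A framework-free sketch of the idea behind such a slice (my own illustration with made-up values, not fastai's exact implementation):

```python
def lr_range(lo, hi, n_groups=3):
    # Spread learning rates geometrically from the earliest layer group
    # (lo) to the newly added head (hi): pretrained features get tiny
    # updates while the head keeps learning at full speed.
    if n_groups == 1:
        return [hi]
    ratio = (hi / lo) ** (1 / (n_groups - 1))
    return [lo * ratio ** i for i in range(n_groups)]

# Something like slice(1e-6, 1e-4) over a resnet's three layer groups:
lrs = lr_range(1e-6, 1e-4)
print(lrs)  # first group ~1e-6, middle ~1e-5, head ~1e-4
```

With rates like these, the pretrained body barely moves while the head adapts, which typically avoids the loss blow-up seen above.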
But for rectangular images, my results are much, much worse:
Initial resnet34:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 0.826759 | 0.995546 | 0.596491 |
| 2 | 0.580530 | 0.858085 | 0.543860 |
| 3 | 0.425833 | 1.251360 | 0.578947 |
After unfreezing and fitting again:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 0.993478 | 1.775175 | 0.578947 |
Resnet50:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 0.627348 | 1.009493 | 0.526316 |
| 2 | 0.456382 | 1.192073 | 0.526316 |
| 3 | 0.329753 | 1.243930 | 0.500000 |
| 4 | 0.279683 | 1.176981 | 0.473684 |
| 5 | 0.230862 | 1.421680 | 0.500000 |
| 6 | 0.192906 | 1.193287 | 0.473684 |
| 7 | 0.166738 | 1.016759 | 0.342105 |
| 8 | 0.144671 | 1.161860 | 0.342105 |
It gets a little better after unfreezing and fitting again:
| epoch | train_loss | valid_loss | error_rate |
| --- | --- | --- | --- |
| 1 | 1.220552 | 23.675739 | 0.500000 |
| 2 | 1.345855 | 5.028001 | 0.289474 |
| 3 | 1.289851 | 1.882938 | 0.289474 |
I wonder why this is happening. The underlying data is the same (the spectrogram content); only the shape and size differ.
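One possible factor (my guess, not something established above): the rectangular images simply carry far fewer pixels, and an ImageNet-pretrained resnet was trained on roughly 224×224 inputs, so both resolution and aspect ratio change at once. Quick arithmetic on the two formats:

```python
# Pixel budget of the two input formats used above.
square = (224, 224)
rect = (100, 177)

square_px = square[0] * square[1]  # 50176
rect_px = rect[0] * rect[1]        # 17700

# The rectangular spectrograms hold ~2.8x less information per image.
ratio = square_px / rect_px
print(f"{square_px} vs {rect_px} pixels ({ratio:.2f}x fewer)")
```

So the comparison isn't only square vs rectangular; it's also high-resolution vs low-resolution input to a network pretrained at the larger size.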
So my best result so far is a ~0.105 error rate with resnet50, using 200 spectrograms, 2 classes, and a 0.2 validation split.
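A back-of-the-envelope check on those numbers (my own arithmetic, not something stated above): the error rates in the square-image runs are all exact multiples of 1/38, which suggests the validation set ended up with about 38 images (200 × 0.2 = 40, so perhaps a couple were dropped). With so few validation images, a single prediction flipping from right to wrong moves the error rate by roughly 2.6 percentage points:

```python
# Error rates reported for the best (224x224, resnet50) run above.
rates = [0.473684, 0.315789, 0.289474, 0.157895, 0.131579, 0.105263]

# Each one is (misclassified images) / 38 to the printed precision.
n_valid = 38
for r in rates:
    wrong = round(r * n_valid)
    assert abs(wrong / n_valid - r) < 1e-4, r

# Granularity of the metric: one image flipping changes error_rate by:
step = 1 / n_valid
print(f"1 image = {step:.4f} change in error_rate")  # about 0.0263
```

Under that assumption, an epoch-to-epoch swing like 0.105 → 0.158 is just two images, well within noise for a validation set this small.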
But when I run the training again, the results change: the model gets worse, then a little better. I know I am supposed to set a random seed for more consistent results, but why is there so much variation between runs? Besides, if we lock the validation set, doesn't that mean our model may not generalize well? (Or maybe I just don't get it, sorry.)
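On the seed question: fixing the random seed only makes the train/validation split (and weight init, augmentation order, etc.) the same on every run, so runs become comparable; by itself it neither improves nor hurts the model. A minimal stdlib-only sketch of the difference, with made-up file names for illustration:

```python
import random

# Stand-in for 200 spectrogram files (names are made up).
files = [f"spec_{i:03d}.png" for i in range(200)]

def split(files, valid_pct=0.2, seed=None):
    rng = random.Random(seed)      # seed=None -> fresh entropy each run
    shuffled = files[:]
    rng.shuffle(shuffled)
    n_valid = int(len(files) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Unseeded: every run validates on a different 40-image subset, so the
# metrics move around even though nothing about the model changed.
_, valid_a = split(files)
_, valid_b = split(files)

# Seeded: exactly the same split every run, so runs are comparable.
_, valid_c = split(files, seed=42)
_, valid_d = split(files, seed=42)
assert valid_c == valid_d
```

Locking the split is for fair comparisons while experimenting; the usual answer to the generalization worry is a separate held-out test set that is never used for tuning. With only ~40 validation images, cross-validation would also give a much more stable error estimate than any single split.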