Hey man! Your messages, as always, are as clarifying as a warm knife cutting through butter. Thanks!
Your explanation is wonderful, but I had already figured this out by looking at the source code and by reading this:
I have read those explanations many times, but they still aren't clear to me. If I don't call `unfreeze()`, does that mean the "body" stays untrained?
This is definitely an interesting observation, and it is likely that your dataset closely resembles ImageNet, so you don't really need to adjust the body of the model. As a side note, make sure your validation metrics are improving, not just your training metrics.
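To make "frozen" concrete, here's a toy sketch in plain Python (not fastai's actual mechanism, just the idea): freezing simply excludes the body's parameters from the optimizer step, so they keep their pretrained values while the head still learns.

```python
# Toy illustration: "freezing" means the body's parameters are excluded
# from the optimizer update, so they keep their pretrained values while
# the head keeps learning.

def sgd_step(params, grads, frozen, lr=0.1):
    """Apply one SGD update, skipping any parameter listed in `frozen`."""
    return {
        name: value if name in frozen else value - lr * grads[name]
        for name, value in params.items()
    }

params = {"body.w": 0.50, "head.w": 0.00}   # body.w is "pretrained"
grads  = {"body.w": 0.30, "head.w": 0.80}   # pretend gradients from one batch

frozen_run   = sgd_step(params, grads, frozen={"body.w"})
unfrozen_run = sgd_step(params, grads, frozen=set())

print(frozen_run)    # body.w unchanged, only head.w moved
print(unfrozen_run)  # both parameters moved
```

So yes: without `unfreeze()`, the body isn't "untrained" — it stays exactly at its pretrained weights, which is fine when your data looks a lot like what it was pretrained on.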
Yes, both the `valid_loss` and the metrics improve, without overfitting:
epoch | train_loss | valid_loss | ssim | psnr | time |
---|---|---|---|---|---|
0 | 0.236554 | 0.208531 | 0.183433 | 11.753738 | 02:09 |
1 | 0.179139 | 0.148908 | 0.236383 | 13.805398 | 02:05 |
2 | 0.164469 | 0.179590 | 0.292767 | 12.279366 | 02:06 |
3 | 0.144973 | 0.139101 | 0.338365 | 14.361961 | 02:05 |
4 | 0.132566 | 0.115691 | 0.383597 | 15.746257 | 02:05 |
5 | 0.123076 | 0.140860 | 0.408780 | 14.752694 | 02:05 |
6 | 0.124454 | 0.101365 | 0.442580 | 16.926208 | 02:05 |
7 | 0.112748 | 0.102430 | 0.499645 | 16.352867 | 02:04 |
8 | 0.108559 | 0.099927 | 0.507152 | 17.267815 | 02:04 |
9 | 0.108745 | 0.096709 | 0.531147 | 17.573277 | 02:02 |
10 | 0.131972 | 0.105292 | 0.546981 | 16.911779 | 02:03 |
11 | 0.122663 | 0.106509 | 0.569845 | 17.131609 | 02:06 |
12 | 0.117344 | 0.116483 | 0.586394 | 16.417915 | 02:06 |
13 | 0.118717 | 0.096780 | 0.601940 | 17.860203 | 02:02 |
14 | 0.109777 | 0.108330 | 0.622874 | 16.764353 | 02:02 |
15 | 0.107149 | 0.086118 | 0.636838 | 18.321262 | 02:02 |
16 | 0.099475 | 0.098193 | 0.657461 | 17.738472 | 02:04 |
17 | 0.100719 | 0.132301 | 0.639843 | 15.730131 | 02:04 |
18 | 0.111611 | 0.107994 | 0.666687 | 16.818670 | 02:03 |
19 | 0.111105 | 0.120799 | 0.679617 | 16.494234 | 02:03 |
20 | 0.114704 | 0.207559 | 0.639382 | 12.127844 | 02:04 |
21 | 0.099970 | 0.098086 | 0.707660 | 17.849594 | 02:03 |
22 | 0.099835 | 0.089594 | 0.700292 | 18.560604 | 02:01 |
23 | 0.106707 | 0.138295 | 0.674177 | 15.729656 | 02:04 |
24 | 0.099064 | 0.073961 | 0.725095 | 20.033642 | 02:03 |
25 | 0.102635 | 0.084740 | 0.737844 | 18.210724 | 02:06 |
26 | 0.108957 | 0.155416 | 0.677125 | 14.684962 | 02:01 |
27 | 0.100712 | 0.096362 | 0.740240 | 18.032732 | 02:02 |
28 | 0.091303 | 0.098734 | 0.757668 | 17.141289 | 02:03 |
29 | 0.095276 | 0.082612 | 0.747881 | 18.612074 | 02:01 |
30 | 0.090979 | 0.079308 | 0.766427 | 19.219879 | 02:02 |
31 | 0.088959 | 0.062391 | 0.781548 | 21.137724 | 02:05 |
32 | 0.092504 | 0.070026 | 0.772093 | 19.750147 | 02:01 |
33 | 0.087208 | 0.074354 | 0.783176 | 19.588976 | 02:05 |
34 | 0.083044 | 0.066101 | 0.794248 | 19.470947 | 02:01 |
35 | 0.080781 | 0.075523 | 0.785750 | 18.851494 | 02:04 |
36 | 0.076060 | 0.088846 | 0.795860 | 16.493053 | 02:01 |
37 | 0.074469 | 0.068146 | 0.805137 | 19.896648 | 02:04 |
38 | 0.071076 | 0.070530 | 0.806764 | 20.062801 | 02:03 |
39 | 0.075082 | 0.070611 | 0.802137 | 20.234707 | 02:01 |
40 | 0.076525 | 0.073114 | 0.810220 | 18.325096 | 02:05 |
41 | 0.070763 | 0.070393 | 0.813122 | 18.819860 | 02:05 |
42 | 0.068504 | 0.073804 | 0.805940 | 18.706667 | 02:05 |
43 | 0.065610 | 0.068306 | 0.816073 | 19.237278 | 02:02 |
44 | 0.060239 | 0.064983 | 0.826524 | 19.970503 | 02:04 |
45 | 0.064982 | 0.084969 | 0.812253 | 18.099051 | 02:03 |
46 | 0.059241 | 0.054122 | 0.831039 | 21.738705 | 02:05 |
47 | 0.056639 | 0.056747 | 0.836223 | 22.513317 | 02:02 |
48 | 0.057209 | 0.046792 | 0.846990 | 24.197918 | 02:05 |
49 | 0.056328 | 0.046437 | 0.846901 | 23.958132 | 02:03 |
50 | 0.054749 | 0.047320 | 0.850570 | 24.019894 | 02:03 |
51 | 0.051009 | 0.047988 | 0.852203 | 23.820242 | 02:05 |
52 | 0.050929 | 0.047600 | 0.850500 | 24.071835 | 02:06 |
53 | 0.051956 | 0.043077 | 0.862234 | 24.669596 | 02:03 |
54 | 0.050984 | 0.043307 | 0.857978 | 24.458191 | 02:06 |
55 | 0.048370 | 0.046028 | 0.858897 | 23.934223 | 02:04 |
56 | 0.048205 | 0.044067 | 0.861460 | 24.493036 | 02:06 |
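(For context on the `psnr` column: assuming the standard definition for images scaled to [0, 1], PSNR in dB is just a log transform of the mean squared reconstruction error — note this MSE is not the same thing as the `valid_loss` column.)

```python
import math

def psnr(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error,
    using the textbook definition: 10 * log10(max_val^2 / mse)."""
    return 10 * math.log10(max_val ** 2 / mse)

# e.g. a reconstruction MSE of 0.01 corresponds to 20 dB
print(psnr(0.01))  # 20.0
```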
I got up to `ssim` = 0.90.
I used `fit_one_cycle()`, and then tried to squeeze out a bit more by adding the `ReduceLROnPlateau()` callback, but it doesn't seem to work with `fit_one_cycle` (the `lr` never gets updated). So I tried `fit()`, where the callback does work, but the performance is slightly worse than with `fit_one_cycle()`. What am I doing wrong, then?
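In case it helps anyone reading: here's a toy sketch of why that combination is a no-op (simplified, not fastai's internals). A one-cycle scheduler recomputes the learning rate from its schedule at every batch, so any cut a plateau callback makes is clobbered on the very next batch — which matches the `lr` never appearing to update.

```python
import math

def one_cycle_lr(pct, lr_max=1e-3):
    """Simplified one-cycle schedule: linear warm-up for the first 25%
    of training, then cosine annealing back down."""
    if pct < 0.25:
        return lr_max * pct / 0.25
    p = (pct - 0.25) / 0.75
    return lr_max * (1 + math.cos(math.pi * p)) / 2

def train_lrs(plateau_cut_at=None, n_batches=100):
    """Record the lr actually used at each batch."""
    lr = 1e-3
    history = []
    for i in range(n_batches):
        if i == plateau_cut_at:
            lr /= 10                      # what a plateau callback would do
        lr = one_cycle_lr(i / n_batches)  # ...but the schedule overwrites it
        history.append(lr)
    return history

# A plateau cut at batch 50 leaves the lr trajectory completely unchanged:
print(train_lrs(plateau_cut_at=50) == train_lrs())  # True
```

With plain `fit()` there is no per-batch schedule overwriting the `lr`, so the plateau callback's changes actually stick — which would explain what you're seeing. One option people sometimes use is to run `fit_one_cycle()` first and then follow it with a short `fit()` at a low `lr` with the plateau callback, rather than combining the two.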