I ran earlier with:

learn = create_cnn(data, models.resnet34, ps=0.1, metrics=error_rate)

learn.fit_one_cycle(24, max_lr=slice(1e-4,1e-2))

and got for the 8th epoch:

epoch  train_loss  valid_loss  error_rate  time
8      0.193530    0.075071    0.013333    (00:07)

Then I ran:

learn = create_cnn(data, models.resnet34, ps=0.1, wd=.01, metrics=error_rate)

learn.fit_one_cycle(8, max_lr=slice(1e-4,1e-2))

and got:

epoch  train_loss  valid_loss  error_rate
8      0.094480    0.009196    0.000000

So both training and validation losses came way down.
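(For my own sanity, here is a toy sketch of what I understand `wd` to be doing: decoupled weight decay, where each parameter is shrunk toward zero by `lr * wd` every step, on top of the ordinary gradient update. This is just my illustration of the idea, not fastai's actual internals.)

```python
# Toy decoupled weight decay step (my illustration, not fastai code):
# each step shrinks the weight toward zero by lr * wd, separately from
# the data-gradient term.
lr, wd = 1e-2, 0.01   # wd=.01 as in the run above
w = 5.0
grad = 0.0            # zero data gradient, to isolate the decay term
for _ in range(100):
    w = w - lr * grad - lr * wd * w
print(w)              # slightly below 5.0: pure decay shrinks the weight
```

So a larger `wd` just pulls the weights toward zero more aggressively each step, which is why I expected it to regularize more.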

Then, because you said to increase weight decay, I tried:

learn = create_cnn(data, models.resnet34, ps=0.1, wd=.1, metrics=error_rate)

and got:

I really don’t understand what’s going on…

I should note that this is generally what I've seen across many variations of learning rate and number of epochs: the training loss is almost always higher than the validation loss, and often much higher. The error rate is usually quite good, so the model seems decent, but it looks like it is consistently underfitting.
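One guess I had about the train/valid gap: the reported training loss is measured with dropout active (`ps=0.1`), while the validation loss is measured in eval mode with dropout off, so the training number comes from a noisier network. A toy numpy sketch (entirely my own, not fastai internals) of how inverted dropout alone can push the measured loss up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: fixed "hidden activations", a linear head, squared-error loss.
h = rng.normal(size=(1000, 50))          # activations for 1000 samples
w = rng.normal(size=50) / np.sqrt(50)    # head weights
y = h @ w                                # targets the clean model fits exactly

p = 0.1  # dropout probability, matching ps=0.1 above

# "Validation" loss: eval mode, no dropout -> zero by construction here.
val_loss = np.mean((h @ w - y) ** 2)

# "Training" loss: inverted dropout applied to the activations.
mask = (rng.random(h.shape) > p) / (1 - p)
train_loss = np.mean(((h * mask) @ w - y) ** 2)

print(val_loss, train_loss)  # train_loss > val_loss purely from dropout noise
```

If that effect is part of what I'm seeing, then maybe the gap isn't pure underfitting, but I'd love confirmation either way.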