I am training on the chest X-ray dataset from Kaggle, using the provided training, test and validation splits to train and evaluate a ResNet-50. I am loading the data as follows:
data = ImageDataBunch.from_folder(path, train='train', valid='test', test='val', bs=bs, size=224, num_workers=4).normalize(imagenet_stats)
and then validate on the test set as follows: learn.validate(data.test_dl)
which outputs [5.0541553, tensor(0.3750)]
How do I improve/debug this? Training on ResNet-34 gave 91% accuracy on the validation set but only 50% on the test set. Also, is this the right way to load and evaluate the test dataset?
Note: the provided validation set seemed too small, so I have swapped the test and validation sets.
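For reference, here is a workaround I am considering, since (if I read the fastai v1 docs correctly) the folder passed as test= is loaded without labels, which would make the metrics from learn.validate(data.test_dl) meaningless. This sketch builds a second, labeled DataBunch whose validation set is the held-out folder (folder names assume the standard Kaggle layout):

# second DataBunch: the held-out 'val' folder is loaded *with* labels
data_holdout = ImageDataBunch.from_folder(path, train='train', valid='val', bs=bs, size=224, num_workers=4).normalize(imagenet_stats)
learn.validate(data_holdout.valid_dl)  # labeled, so [loss, accuracy] should be meaningful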
Hi @jibinmathew69, it looks like you may be overfitting (decreasing training loss, increasing/high validation loss).
Maybe start with ResNet-34 and see how the model performs with a less complex base architecture.
If you continue to overfit, maybe try some other regularisation, such as data augmentation or increased dropout?
I have already experimented with ResNet-34; the validation accuracy went up to 91%, but on the test set it shows 50%. And given the nature of the images, rotation- or zoom-style augmentation didn't seem to make sense, since they're X-rays and have the same format throughout.
Ah, great. Maybe try increasing dropout and weight decay to see if that improves things?
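Something like this, as a rough sketch (the ps and wd values are illustrative starting points, not tuned):

# ps = dropout probability in the head, wd = weight decay passed to the Learner
learn = cnn_learner(data, models.resnet34, metrics=accuracy, ps=0.6, wd=1e-2)
learn.fit_one_cycle(8)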
I also noticed some other things.
In your first fit_one_cycle run, you could choose a value from the learning-rate graph to pass into that function. You would pick the highest learning rate on the steepest part of the curve, just before the loss starts to increase rapidly.
So, for example, 1e-02 could be a suitable choice:
learn.fit_one_cycle(8, slice(1e-02))
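(For reference, that graph comes from the LR finder; a minimal sketch:)

learn.lr_find()
learn.recorder.plot()  # pick the steepest downward slope, before the loss blows up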
I would also run lr_find again after you unfreeze. Then, from that graph, you can choose an appropriate learning-rate range to pass into your fine-tuning fit_one_cycle run.
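Something along these lines; the slice endpoints below are placeholders, to be read off your own plot:

learn.unfreeze()
learn.lr_find()
learn.recorder.plot()
learn.fit_one_cycle(4, slice(1e-5, 1e-3))  # example range only; adjust from your plot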
From experience, when the training set is not tiny (and even more so when it's huge) and the validation loss increases monotonically starting at the very first epoch, increasing the learning rate tends to help lower the validation loss, at least in those initial epochs.