Finding test accuracy

Anuraaj · September 26, 2019, 4:13pm

Hi everyone, I am new to deep learning and would appreciate your help with the following issue.

I followed the documentation for adding a test folder.

The code that guided me is:

from fastai.vision import *  # fastai v1 star import, as used in the fastai docs

tfms = []
path = Path('data').resolve()
data = (ImageList.from_folder(path)
        .split_by_pct()
        .label_from_folder()
        .transform(tfms)
        .databunch()
        .normalize() ) 
learn = cnn_learner(data, models.resnet50, metrics=accuracy)
learn.fit_one_cycle(5,1e-2)

data_test = (ImageList.from_folder(path)
        .split_by_folder(train='train', valid='test')
        .label_from_folder()
        .transform(tfms)
        .databunch()
        .normalize()
       ) 
learn.validate(data_test.valid_dl)

All the steps worked fine except the line
learn.validate(data_test.valid_dl)
which gives the following error:

But when I write the code as
learn.validate(data_test.test_dl)

it works fine and gives the following result:

Could you please explain this issue and point me in the right direction?

Also, please explain how to plot a confusion matrix for the test set.

Thanks a lot.

muellerzr · September 26, 2019, 4:25pm

See my notebook here. I go into detail about why that won’t work:

https://github.com/muellerzr/fastai-Experiments-and-tips/blob/master/Test%20Set%20Generation/Labeled_Test_Set.ipynb

Let me know if you have any questions!

muellerzr · September 26, 2019, 4:32pm

The short answer, though, is that the test set has no labels.
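
To illustrate why labels matter here: both accuracy and a confusion matrix are computed by comparing each prediction against a ground-truth target, so an unlabeled test set cannot produce either. The following is a minimal sketch in plain Python, not fastai's implementation. The function names and the class indices are made up for illustration.

```python
def confusion_matrix(targets, preds, n_classes):
    """Count (target, prediction) pairs; rows are targets, columns are predictions."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for target, pred in zip(targets, preds):
        matrix[target][pred] += 1
    return matrix


def simple_accuracy(targets, preds):
    """Fraction of predictions that match their target."""
    correct = sum(t == p for t, p in zip(targets, preds))
    return correct / len(targets)


# Hypothetical class indices for six images and three classes.
targets = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(targets, preds, n_classes=3)
acc = simple_accuracy(targets, preds)
```

Without the `targets` list, neither function can be evaluated, which is the situation with a test set that carries no labels.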

Anuraaj · September 26, 2019, 4:36pm

Hi @muellerzr, thanks for helping. I am looking into your notebook and will get back here if I have issues.

Anuraaj · September 26, 2019, 4:41pm

In this case, I followed the guidelines in
https://docs.fast.ai/data_block.html#LabelLists.add_test_folder

I created a test folder and used it as the validation set after finishing training on my primary image set, which was split into a training set (80%) and a validation set (20%). Since I am passing the test set in as the validation set, I believe I should be able to plot the confusion matrix for it.

Please share your views on this.

muellerzr · September 26, 2019, 4:43pm

Yes. If you override the validation set with your test set, the confusion matrix will be available.

Anuraaj · September 26, 2019, 4:53pm

Yes, exactly. This is where I am struggling. When I try to plot the confusion matrix for the test set using
interp = ClassificationInterpretation.from_learner(learn)

it gives the confusion matrix from the previous training. I am pasting the code here.
tfms = get_transforms( do_flip = True, flip_vert = True, max_rotate = 60.0, max_zoom = 1.1, max_lighting = 0.2, max_warp = 0.2, p_affine = 0.75, p_lighting = 0.75)

data_1 = (ImageList.from_folder(path/'train')
.split_by_rand_pct(valid_pct=0.2)
.label_from_folder()
.transform( tfms = tfms, size=224, padding_mode='zeros')
.databunch(bs=bs, num_workers = 4)
.normalize(imagenet_stats) )

learn = cnn_learner(data_1, models.resnet50, metrics = accuracy)
learn.fit_one_cycle(12)

learn.save('trn_val=0.2_tst_res50-1')
learn.load('trn_val=0.2_tst_res50-1');

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix() ##this is the confusion matrix for initial training

Now I am creating a new DataBunch to find the test set accuracy:
data_test = (ImageList.from_folder(path)
.split_by_folder(train='train', valid='test')
.label_from_folder()
.transform(tfms = tfms, size=224, padding_mode='zeros')
.databunch(bs=bs, num_workers = 4)
.normalize(imagenet_stats)
)

learn.validate(data_test.test_dl)

## From here on, I have no idea how to plot the confusion matrix for the test set

muellerzr · September 26, 2019, 5:04pm

You’re still not quite doing it right. Look at the notebook. When I generate my new test set, I do learn.data.valid_dl = data_test.valid_dl. Then you can call learn.validate(), create interp, and so on.
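
The swap described above can be sketched as a small helper. This is only an illustration: `swapped_valid_dl` is a hypothetical name, not part of fastai, and it assumes nothing more than that the learner exposes a settable `learn.data.valid_dl`, as in the assignment quoted in this post. Stand-in objects are used so the sketch runs without fastai or a dataset.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def swapped_valid_dl(learn, new_valid_dl):
    """Temporarily point learn.data.valid_dl at another dataloader (hypothetical helper)."""
    original_dl = learn.data.valid_dl
    learn.data.valid_dl = new_valid_dl
    try:
        yield learn
    finally:
        # Restore the original validation dataloader afterwards.
        learn.data.valid_dl = original_dl


# Stand-ins for a fastai v1 Learner and its dataloaders (assumption: strings
# replace the real dataloader objects purely for demonstration).
learn_stub = SimpleNamespace(data=SimpleNamespace(valid_dl="original validation loader"))

with swapped_valid_dl(learn_stub, "labeled test loader"):
    inside = learn_stub.data.valid_dl

after = learn_stub.data.valid_dl

# With a real fastai v1 learner and the data_test DataBunch from this thread,
# the usage would look roughly like this:
#
# with swapped_valid_dl(learn, data_test.valid_dl):
#     print(learn.validate())
#     interp = ClassificationInterpretation.from_learner(learn)
#     interp.plot_confusion_matrix()
```

Restoring the original dataloader on exit is a design choice of this sketch, so that later training still validates against the original 20% split. The plain assignment from the post leaves the test set in place instead.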

Anuraaj · September 27, 2019, 1:52pm

Hi @muellerzr,
Huge thanks for your help.
The steps in your notebook worked well, and I plotted the losses as well.