Test set for a multi-label classification model?

I have successfully trained a model for multi-label classification.
However, I haven't been able to add a test set to check its performance on a set of new images.

To train the model with cross-validation, I do the following:

# k-fold cross-validation; skf, acc_02, lr and acc_val are defined earlier in the notebook
for train_index, val_index in skf.split(df.index, df['Tags']):
    src = (ImageList.from_df(df, '/content/gdrive/My Drive/Dataset/500 dataset together')
           .split_by_idxs(train_index, val_index)
           .label_from_df(label_delim=' '))        # multi-label: space-delimited tags
    data_fold = (src.transform(None, size=224)
                    .databunch()
                    .normalize(imagenet_stats))
    learn = cnn_learner(data_fold, models.resnet18, metrics=acc_02)
    learn.fit_one_cycle(7, slice(lr))
    loss, acc = learn.validate()                   # metric on this fold's validation split
    acc_val.append(acc.numpy())

Can you please help? Thanks in advance.

Can you tell us what error you are getting?
There is a multi-label classification example in docs.fast.ai.

You can try something like the following:

learn.data.add_test(ImageList.from_df(test_df, PATH, folder='test'))   # attach an unlabelled test set to the existing data
preds, _ = learn.get_preds(ds_type=DatasetType.Test)                   # predictions for the test images

Hope this helps.


Dear @PoonamV, thank you for your quick response. But don't I need to apply to the test data everything that was applied to the training data, such as normalization, and also have the corresponding labels in order to compute test accuracy?

Normalization will be applied automatically. If you need to apply the same transformations to the test data as to the training data, use learn.TTA().
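
For instance, after attaching the test set with add_test as above, a minimal sketch (untested) would be:

preds, _ = learn.TTA(ds_type=DatasetType.Test)   # averages predictions over augmented copies of each test image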

If you have labels, you need to treat the test set as a validation set: add it to the DataBunch with its labels and compute accuracy, passing ds_type=DatasetType.Valid.
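
A rough sketch of that approach (untested; it assumes pandas is imported as pd, test_df has the same image and tag columns as df, and PATH points at the image folder):

combined = pd.concat([df, test_df], ignore_index=True)      # append the labelled test rows to the training dataframe
valid_idx = list(range(len(df), len(combined)))             # the appended rows become the validation split
data_test = (ImageList.from_df(combined, PATH)
             .split_by_idx(valid_idx)
             .label_from_df(label_delim=' ')
             .transform(None, size=224)
             .databunch()
             .normalize(imagenet_stats))
loss, acc = learn.validate(data_test.valid_dl)              # loss and accuracy of the trained learner on the test images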


You can also create a new DataBunch and then use learn.validate on that DataBunch. For this DataBunch, use split_none() so there is no validation split.
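
Something like this, for example (a sketch, untested; test_df and PATH are placeholders for your labelled test dataframe and image folder):

test_data = (ImageList.from_df(test_df, PATH)
             .split_none()                                        # no validation split; everything stays in the train split
             .label_from_df(label_delim=' ')
             .transform(None, size=224)
             .databunch()
             .normalize(imagenet_stats))
test_dl = test_data.train_dl.new(shuffle=False, drop_last=False)  # ordered loader so no test image is shuffled or dropped
loss, acc = learn.validate(test_dl)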
