Using the test dataset in learn.get_preds() without is_test=True

ramanan · November 11, 2018, 4:00am

I’m not sure how to specify that the test dataset should be used in learn.get_preds() or learn.TTA() now that is_test=True has been removed. I’ve tried:
preds = learn.get_preds(ds_type='test')
which seems to be the right way according to the source code comment.

But the number of samples in the output preds doesn’t match up to the test_ds size.

Also tried:
learn.get_preds(ds_type=learn.data.test_dl)
learn.get_preds(ds_type=learn.data.test_ds)

Anyone else having issues with this? Any feedback would be appreciated.

I’m on fastai-1.0.22

arunslb123 · November 11, 2018, 5:07am

Try this:
preds, y, losses = learn.get_preds(ds_type=DatasetType.Test, with_loss=True)

adrian · November 11, 2018, 9:04am

I am trying to run predictions for an image classifier on a separate test set but am not sure about the workflow.

I created a ClassificationLearner, trained the model OK. Now I want to run predictions on a test set, and have done the following:

#reload the training data
df_train = pd.read_csv(PATH/'train.csv')
#recreate the data bunch
data = ImageDataBunch.from_df(PATH, df_train, folder='train', ds_tfms=tfms, size=224)
#recreate the learner
learn = create_cnn(data, arch, metrics=error_rate)
#now I can load the previously saved weights
learn.load('wgts-rn34')
#here I add a test folder containing test images - this resets the learn.data object from ImageDataBunch to ImageSplitData - causing the issue I am having
learn.data = (src.add_test_folder('test'))
#now run predictions:
pred_test, y_test = learn.get_preds(ds_type=DatasetType.Test)

But I get the error: AttributeError: 'ImageSplitData' object has no attribute 'dl', because when I set learn.data I changed it from a DataBunch.

I can’t work out how to add the test folder after the model has been built.

I did manage to get this to work by concatenating the df_train and df_test dataframes together to make df_tt. I then had to specify the test folder relative to its parent (test='../test').

folder structure:

PATH
        train
        test

data = ImageDataBunch.from_df(PATH, df_tt, folder='train', test='../test', ds_tfms=tfms, size=224)
pred_test, y_test = learn.get_preds(ds_type=DatasetType.Test)

mayank4 · November 11, 2018, 12:04pm

I tried running this but get_preds is not returning any predicted labels.

adrian · November 11, 2018, 12:49pm

Admittedly, having to specify the parent to reach test seems a bit odd, and may be because I did something dumb to require it. Have you tried stepping through the code with a debugger to see whether the images in the directories are being loaded?
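
One more thing worth checking: as far as I understand it, get_preds returns per-class scores rather than label names, so the labels have to be derived from them. A minimal sketch in plain Python, with nested lists standing in for the prediction tensor and a made-up classes list standing in for learn.data.classes (both are assumptions for illustration, not output from a real run):

```python
def scores_to_labels(scores, classes):
    """Return the class name with the highest score for each row."""
    labels = []
    for row in scores:
        # Index of the largest score in this row (a hand-rolled argmax).
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels

# Made-up scores for three test images and a hypothetical two-class problem.
example_scores = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
example_classes = ['cat', 'dog']
predicted = scores_to_labels(example_scores, example_classes)
print(predicted)
```

With real tensors the same idea is usually written as an argmax over dimension 1. Also, as far as I know the test set carries no real labels, so the targets returned for DatasetType.Test are only placeholders.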

AlisonDavey · November 11, 2018, 7:07pm

This worked for me:
preds = learn.TTA(ds_type=DatasetType.Test)[0]

pradeepvasamsetti · November 12, 2018, 1:56pm

Hi,

I’m trying to predict on the test set, which I downloaded using the Kaggle API, and I’m doing it as @AlisonDavey posted:
preds = learn.TTA(ds_type=DatasetType.Test)[0]

I have 40669 images in my test set.
Can anyone tell me what 3/8 and 256/424 mean?

balnazzar · November 12, 2018, 6:03pm

Tell us your batch size.

AlisonDavey · November 12, 2018, 7:21pm

8 is the number of augmented versions. 424 is the number of items in the test dataset divided by your batch size.

pradeepvasamsetti · November 13, 2018, 5:50am

bs = 8

ramanan · November 13, 2018, 6:17am

Thanks @arunslb123 and @AlisonDavey
ds_type=DatasetType.Test works for me, and the with_loss=True option is new to me.

Btw sorry for the delayed response, I was trying to use GCP …

ramanan · November 13, 2018, 6:36am

Perhaps just specify the new test folder when recreating the data bunch? E.g.
data = ImageDataBunch.from_df(PATH, df_train, folder='train', ds_tfms=tfms, size=224, test='test')

Then create the learner and load the weights. The test dataset should still be in the learner e.g. learn.data.test_ds

balnazzar · November 13, 2018, 7:00pm

You got it, as @AlisonDavey suggests above

pradeepvasamsetti · November 14, 2018, 5:56am

In that case 424 * 8 should give me the total number of test images, which would be 3392. But, as I said earlier, I have 40669 test samples.

ramanan · November 14, 2018, 6:06am

Double check your test dataset size and batch size, e.g.
len(learn.data.test_ds)
learn.data.bs
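
A quick way to sanity-check those numbers: the second progress counter should be the number of batches, i.e. the number of test items divided by the batch size, rounded up. The sketch below is plain Python; the helper names are made up for illustration, and it assumes the last partial batch is kept.

```python
import math

def n_batches(n_items, batch_size):
    """Number of batches when the last partial batch is kept."""
    return math.ceil(n_items / batch_size)

def implied_batch_sizes(n_items, batches_seen, max_bs=1024):
    """All batch sizes up to max_bs that give exactly batches_seen batches."""
    return [bs for bs in range(1, max_bs + 1)
            if n_batches(n_items, bs) == batches_seen]

# Figures reported in this thread: 40669 test images, 424 batches shown.
print(n_batches(40669, 8))              # 5084 batches if the loader really used bs = 8
print(implied_batch_sizes(40669, 424))  # [96] is the only batch size giving 424 batches
```

On these figures 424 batches corresponds to a batch size of 96 rather than 8, so the test dataloader is probably not using the batch size you expect.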

mukeshjangir · November 16, 2018, 12:50pm

This isn’t working for me. It is still making predictions on the validation set. I’m using fastai version 1.0.24.

soco_loco · November 30, 2018, 10:47am

Same problem on 1.0.30

Maybe it’s the DatasetType.Valid component?:

Signature: learn.get_preds(ds_type:fastai.basic_data.DatasetType=<DatasetType.Valid: 2>, with_loss:bool=False, n_batch:Union[int, NoneType]=None, pbar:Union[fastprogress.fastprogress.MasterBar, fastprogress.fastprogress.ProgressBar, NoneType]=None) -> List[torch.Tensor]
Docstring: Return predictions and targets on the valid, train, or test set, depending on `ds_type`.
File:      /opt/conda/lib/python3.6/site-packages/fastai/basic_train.py
Type:      method

sugato · May 22, 2019, 8:34am

Hi, I am new to the fastai library and I am having problems with the get_preds() function.
I have declared my data as follows:

    data = (ImageList.from_folder('path/train')
                     .split_by_rand_pct()
                     .label_from_folder()
                     .transform(get_transforms(),size=224)
                     .add_test_folder('path/test')
                     .databunch().normalize(imagenet_stats))

and tried to get the predictions using

log_preds_test,y = learn.get_preds(ds_type=DatasetType.Test,with_loss=True)
log_preds_test

but I always get the following error

/usr/local/lib/python3.6/dist-packages/fastai/callback.py in set_dl(self, dl)
    254         "Set the current `dl` used."
    255         if hasattr(self, 'cb_dl'): self.callbacks.remove(self.cb_dl)
--> 256         if isinstance(dl.dataset, Callback):
    257             self.callbacks.append(dl.dataset)
    258             self.cb_dl = dl.dataset

AttributeError: 'NoneType' object has no attribute 'dataset'

Can anyone tell me what is going wrong? I have been stuck here for a long time.

nikhil.ikhar · May 29, 2019, 5:12pm

Hi, did you solve this? I’m also facing this issue. This seems to be the most common way anyone would use this API, but something is missing in the docs. For now I’m trying to get predictions one by one.
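
The traceback above shows set_dl receiving None, which suggests the learner's data has no test dataloader at the time get_preds is called. A minimal guard, sketched in plain Python with stand-in objects instead of a real Learner (has_test_set is a hypothetical helper; the only attribute it relies on is learn.data.test_dl, mentioned earlier in this thread):

```python
from types import SimpleNamespace

def has_test_set(learn):
    """True if learn.data exposes a non-None test dataloader."""
    return getattr(learn.data, 'test_dl', None) is not None

# Stand-ins for a learner built without and with a test set (not real fastai objects).
learn_without_test = SimpleNamespace(data=SimpleNamespace(test_dl=None))
learn_with_test = SimpleNamespace(data=SimpleNamespace(test_dl=['batch']))

print(has_test_set(learn_without_test))
print(has_test_set(learn_with_test))
```

If a check like this comes back False for the real learner, the learner was probably created from a data object that did not include the test folder, so it is worth confirming that the data bunch built with add_test_folder is the one passed to the learner.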

sugato · May 30, 2019, 12:31pm

Hi, this seemed to work for me:

1. Saved the model

model.export()

2. Created the test set

test = ImageList.from_folder('path/test')

3. Loaded the model

learn = load_learner('path/train', test=test)

4. Predictions

log_preds_test = learn.get_preds(ds_type=DatasetType.Test)