I'm not sure how to specify that the test dataset should be used in learn.get_preds() or learn.TTA() now that is_test=True has been removed. I've tried:
preds = learn.get_preds(ds_type='test')
which seems to be the right way according to the source code comment
But the number of samples in the output preds doesn’t match up to the test_ds size.
Also tried:
learn.get_preds(ds_type=learn.data.test_dl)
learn.get_preds(ds_type=learn.data.test_ds)
Anyone else having issues with this? Any feedback would be appreciated.
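Edit: from the replies below, it looks like ds_type now takes a DatasetType enum rather than a string, so something along these lines should select the test set (a minimal sketch, assuming the DataBunch was created with a test set attached):
from fastai.basic_data import DatasetType  # enum used to choose train/valid/test
# predictions on the test set (the returned targets are just dummy labels)
preds, _ = learn.get_preds(ds_type=DatasetType.Test)
# same idea with test-time augmentation
preds_tta, _ = learn.TTA(ds_type=DatasetType.Test)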
I am trying to run predictions for an image classifier on a separate test set, but am not sure of the workflow.
I created a ClassificationLearner, trained the model OK. Now I want to run predictions on a test set, and have done the following:
#reload the training data
df_train = pd.read_csv(PATH/'train.csv')
#recreate the data bunch
data = ImageDataBunch.from_df(PATH, df_train, folder='train', ds_tfms=tfms, size=224)
#recreate the learner
learn = create_cnn(data, arch, metrics=error_rate)
#now I can load the previously saved weights
learn.load('wgts-rn34')
#here I add a test folder containing test images - this resets the learn.data object from ImageDataBunch to ImageSplitData - causing the issue I am having
learn.data = (src.add_test_folder('test'))
#now run predictions:
pred_test, y_test = learn.get_preds(ds_type=DatasetType.Test)
But I get the error AttributeError: 'ImageSplitData' object has no attribute 'dl', because when I set learn.data I changed it from a DataBunch.
I can't work out how to add the test folder after the learner has been built.
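Maybe the fix is just to turn that split data back into a DataBunch before assigning it, something like this (untested sketch; src is whatever ImageList/label pipeline was built originally, and the transform/databunch arguments are my assumptions):
# add the test folder, then rebuild a proper DataBunch from the label lists
learn.data = (src.add_test_folder('test')
                 .transform(tfms, size=224)
                 .databunch())
pred_test, y_test = learn.get_preds(ds_type=DatasetType.Test)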
I did manage to get this to work by concatenating the df_train and df_test dataframes into df_tt, though I then had to specify the parent of the test folder.
folder structure:
PATH/
    train/
    test/
data = ImageDataBunch.from_df(PATH, df_tt, folder='train', test='../test', ds_tfms=tfms, size=224)
pred_test, y_test = learn.get_preds(ds_type=DatasetType.Test)
Admittedly, having to specify the parent of the test folder seems a bit odd, and it may be that I did something dumb to make that necessary. Have you tried stepping through the code with a debugger to see whether the images in the directories are actually being loaded?
I'm trying to predict on the test set, which I downloaded using the Kaggle API, and I'm doing it as @AlisonDavey posted: preds = learn.TTA(ds_type=DatasetType.Test)[0]
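Assuming that runs, I plan to map the probabilities back to class labels roughly like this (a sketch; it assumes the usual class ordering in learn.data.classes):
probs = learn.TTA(ds_type=DatasetType.Test)[0]
# take the most likely class per image and map the index back to its name
labels = [learn.data.classes[i] for i in probs.argmax(dim=1).tolist()]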
Perhaps just specify the new test folder when recreating the data bunch? E.g. data = ImageDataBunch.from_df(PATH, df_train, folder='train', ds_tfms=tfms, size=224, test='test')
Then create the learner and load the weights. The test dataset should still be in the learner e.g. learn.data.test_ds
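Putting it together, the whole flow would look roughly like this (a sketch reusing the PATH, tfms, arch and saved weights names from the earlier posts):
df_train = pd.read_csv(PATH/'train.csv')
# build the data bunch with the test folder attached from the start
data = ImageDataBunch.from_df(PATH, df_train, folder='train', test='test',
                              ds_tfms=tfms, size=224)
learn = create_cnn(data, arch, metrics=error_rate)
learn.load('wgts-rn34')
# quick sanity check that the test set is really there
print(len(learn.data.test_ds))
preds, _ = learn.get_preds(ds_type=DatasetType.Test)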
Signature: learn.get_preds(ds_type:fastai.basic_data.DatasetType=<DatasetType.Valid: 2>, with_loss:bool=False, n_batch:Union[int, NoneType]=None, pbar:Union[fastprogress.fastprogress.MasterBar, fastprogress.fastprogress.ProgressBar, NoneType]=None) -> List[torch.Tensor]
Docstring: Return predictions and targets on the valid, train, or test set, depending on `ds_type`.
File: /opt/conda/lib/python3.6/site-packages/fastai/basic_train.py
Type: method
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in set_dl(self, dl)
254 "Set the current `dl` used."
255 if hasattr(self, 'cb_dl'): self.callbacks.remove(self.cb_dl)
--> 256 if isinstance(dl.dataset, Callback):
257 self.callbacks.append(dl.dataset)
258 self.cb_dl = dl.dataset
AttributeError: 'NoneType' object has no attribute 'dataset'
Can anyone tell me what is going wrong? I've been stuck here for a long time.
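Edit: rereading the traceback, I think dl is None because data.test_dl was never set, i.e. no test set was attached to the DataBunch before calling get_preds. A quick check (just a guess on my part):
# if this prints None, the DataBunch was created without a test set
print(learn.data.test_dl)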
Hi, did you solve this? I'm also facing this issue. It seems this is the most common way anyone would use this API, but something seems to be missing in the docs. For now I'm getting predictions one by one.