Test only data bunch?

devforfu · November 9, 2018, 12:48pm

Recently I’ve been working on a dataset with three subsets: train, validation, and test. I train my model in the background and save it to disk. Then I restore the model in a notebook and inspect its results on the test set. Right now I am doing something like:

from fastai.vision import *

# Reuse the test images for all three slots, since only a test set is needed here
test_ds = ImageClassificationDataset.from_single_folder('test', classes=[0])
test_bunch = ImageDataBunch.create(test_ds, test_ds, test_ds)
learn = create_cnn(test_bunch, models.resnet34, path='/path/to/models')
y_hat, _ = learn.get_preds(DatasetType.Test)

So I wonder, is this the “canonical” approach in such cases? Or is there an easier way to instantiate a trained model?

The reason I am asking is that my solution looks a bit “hacky”, and I don’t really need three datasets in the bunch when testing. Or, at least, I do not need the training dataset. I keep separate workspaces and don’t repeat the training steps in the validation/testing notebook. The main idea is that I would like to restore the model without any additional context.

jbriggs · March 14, 2019, 3:03pm

If you’re using a trained model that you have saved with learn.export(), you can call load_learner() with the test parameter, which lets you load only a test set into the data bunch used by the learner.
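
For illustration, here is a minimal sketch of how that could look in fastai v1. It assumes the training notebook already called learn.export(), which writes export.pkl into the learner's path by default, and that the fastai version is recent enough to provide ImageList (older 1.0.x releases called it ImageItemList). The function name predict_on_test and both directory arguments are hypothetical, not something from the original post.

```python
from pathlib import Path


def predict_on_test(model_dir, test_dir, export_file="export.pkl"):
    """Load an exported fastai v1 Learner together with a test set only.

    model_dir and test_dir are hypothetical locations: the folder that holds
    the exported learner file and the folder that holds the test images.
    """
    # Imported here so the helper can be defined without fastai installed.
    from fastai.vision import DatasetType, ImageList, load_learner

    # Unlabelled test items; no train or validation data is required.
    test_items = ImageList.from_folder(Path(test_dir))

    # The `test` argument attaches the items as the learner's test set.
    learn = load_learner(Path(model_dir), export_file, test=test_items)

    # get_preds returns (predictions, targets); the test set has no real
    # labels, so only the predictions are meaningful.
    preds, _ = learn.get_preds(ds_type=DatasetType.Test)
    return preds
```

Calling something like predict_on_test('/path/to/models', 'test') would then return the predictions directly, with no need to build a dummy ImageDataBunch or call create_cnn again.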