Predicting on different datasets

vishalrao · February 17, 2019, 3:44pm

What is the recommended way to generate predictions on new test datasets? This would be especially useful for generating predictions after loading a saved model, or for testing on several different datasets.

Here are some options I am currently considering:

1. Run learn.predict on each individual item in the test dataset.
2. Export the model (instead of using save) and reload it using load_learner with the test argument.
3. Extract the model from the learner and predict using standard PyTorch.
4. Create a new learn object with some dummy train/val data and the required test data, then load the saved weights into this object. The test data may also need a dummy y-variable, since one is probably required when invoking get_preds.
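For the third option in the list above, here is a minimal sketch of what batched inference in plain PyTorch could look like. It assumes the learner's underlying network is an ordinary torch.nn.Module reachable as learn.model. Since no trained learner is available here, a small nn.Sequential stands in for it, and the test tensor, its shape, and the batch size are all hypothetical. Any preprocessing the learner normally applies (normalization, tokenization, and so on) would have to be reproduced by hand before this step.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for `learn.model`. With a real fastai Learner you would write
# `model = learn.model` instead (assumption: it is a plain nn.Module).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

# Hypothetical test set: 10 already-preprocessed items, 4 features each.
test_x = torch.randn(10, 4)
test_dl = DataLoader(TensorDataset(test_x), batch_size=4, shuffle=False)

model.eval()  # switch dropout / batch norm layers to inference behaviour
batch_outputs = []
with torch.no_grad():  # gradients are not needed for prediction
    for (xb,) in test_dl:
        batch_outputs.append(model(xb))

logits = torch.cat(batch_outputs)      # one row of raw scores per test item
probs = torch.softmax(logits, dim=1)   # class probabilities per item
pred_classes = probs.argmax(dim=1)     # predicted class index per item
```

Keeping shuffle=False means the rows of the output line up with the order of the test items. Everything runs on the CPU here; with a GPU, the model and each batch would need to be moved to the same device first.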

Some of the suggestions in older posts, like this one, don’t seem to apply to the latest version.

BetaGo · February 18, 2019, 10:06am

From what I have experienced, the first option takes far too long. Predicting items one at a time seems to be hundreds or thousands of times slower than predicting in mini-batches. So your second option looks like the easiest and fastest to me.
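A rough illustration of why per-item prediction is expensive: every item costs a full forward pass, whereas a mini-batch shares a single pass across many items. The sketch below uses a hypothetical CountingModel in plain PyTorch (not fastai's learn.predict) and only counts forward passes. It does not measure time, and the real slowdown will depend on hardware and on any per-call overhead.

```python
import torch
from torch import nn


class CountingModel(nn.Module):
    """Hypothetical stand-in model that counts its forward passes."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)
        self.calls = 0

    def forward(self, x):
        self.calls += 1
        return self.linear(x)


torch.manual_seed(0)
model = CountingModel().eval()
items = torch.randn(256, 4)  # hypothetical test set of 256 items

with torch.no_grad():
    # One item at a time: one forward pass per item.
    one_by_one = torch.cat([model(x.unsqueeze(0)) for x in items])
    per_item_calls = model.calls

    model.calls = 0
    # Mini-batches of 64 items: one forward pass per batch.
    batched = torch.cat([model(xb) for xb in items.split(64)])
    batched_calls = model.calls

print(per_item_calls, batched_calls)
```

Both approaches produce the same predictions up to floating-point rounding; only the number of forward passes differs.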

vishalrao · February 18, 2019, 4:40pm

Thanks, Cary

malina.buga · May 13, 2020, 8:00am

How can I do this: “extract model from learner and predict using standard PyTorch”?
Can you help me understand?