Predicting on different datasets

What is the recommended way to generate predictions on new test datasets? This is especially useful when we want to generate predictions after loading a saved model, or when we want to test on several different datasets.

Here are some options I am currently considering:

  • Run learn.predict on each individual item in the test dataset.
  • Export the model (instead of using save) and reload it using load_learner with the test argument.
  • Extract the model from the learner and predict using standard PyTorch.
  • Create a new learn object with some dummy train/val data and the required test data, then load the saved weights into this object. The test data may need a dummy y-variable too, since it is probably required when invoking get_preds.

Some of the suggestions in older posts, like this one, don’t seem to be applicable to the latest version.

From what I have experienced, the first option takes way too long. Predicting individual items seems to be hundreds or thousands of times slower than predicting in mini-batches. So your second option appears to be the easiest and fastest to me.
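To illustrate why per-item prediction is so much slower, here is a minimal sketch in plain PyTorch. The small network and the random inputs are made-up stand-ins (with fastai the model would come from learn.model), and the batch size of 64 is an arbitrary choice. It only counts forward passes; actual timings will depend on the model and hardware.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a trained model (assumption, not from the thread).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()  # inference mode for layers such as dropout and batch norm

# Count how many times the model's forward pass runs.
calls = {"n": 0}

def count_forward(module, inputs, output):
    calls["n"] += 1

handle = model.register_forward_hook(count_forward)

items = torch.randn(1000, 8)  # 1000 made-up test items

with torch.no_grad():
    # Option 1: one forward pass per item.
    per_item = torch.cat([model(x.unsqueeze(0)) for x in items])
    per_item_calls = calls["n"]

    # Mini-batches: one forward pass per chunk of up to 64 items.
    calls["n"] = 0
    batched = torch.cat([model(chunk) for chunk in items.split(64)])
    batched_calls = calls["n"]

handle.remove()

print(per_item_calls, batched_calls)
print(torch.allclose(per_item, batched, atol=1e-4))
```

Both approaches should give the same predictions; the batched version simply does far fewer forward passes, and each one makes better use of the hardware.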

Thanks, Cary

How would I go about the “extract model from learner and predict using standard PyTorch” option?
Can you help me understand?
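A rough sketch of that option, under some assumptions: the learner's underlying PyTorch module is available as learn.model, and your test inputs are already preprocessed tensors. Here a small made-up network and random data stand in for both, and predict_in_batches is a hypothetical helper name, not a fastai or PyTorch function.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def predict_in_batches(model, inputs, batch_size=64, device="cpu"):
    """Run `model` over `inputs` in mini-batches and return the stacked outputs."""
    model = model.to(device)
    model.eval()  # disable dropout, use running batch-norm statistics
    loader = DataLoader(TensorDataset(inputs), batch_size=batch_size, shuffle=False)
    outputs = []
    with torch.no_grad():  # no gradients needed for inference
        for (xb,) in loader:
            outputs.append(model(xb.to(device)).cpu())
    return torch.cat(outputs)


# With fastai this would be `model = learn.model`; a stand-in is used here.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
test_inputs = torch.randn(100, 8)  # made-up, already-preprocessed test data

logits = predict_in_batches(model, test_inputs, batch_size=32)
probs = logits.softmax(dim=1)       # assumes a classifier that outputs raw logits
pred_classes = probs.argmax(dim=1)  # predicted class index for each item
```

One caveat: going this route means any preprocessing the learner's data pipeline normally applies (resizing, normalization, tokenization and so on) has to be reproduced by hand before the inputs reach the model.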