Get predictions for oot dataset

Hey guys,

I built a binary classifier on tabular data. For training and testing I used one dataset that contains the target column. Now I have an out-of-time (OOT) dataset with the same columns but without the target column. I need to preprocess this dataset the same way as the first one and generate predictions.
I tried passing my pandas DataFrame directly, but that did not work.
How can I use get_preds with my OOT DataFrame?

Thanks a lot!

You need to build a test_dl to preprocess the data, then pass it to get_preds. See the end of this notebook (you do not need to export the learner, just follow the test_dl and get_preds steps):


I still do not really get it.
I have my OOT dataset as a pandas DataFrame (10000 rows) in the variable oot and did the following:
dl_test = learn.dls.test_dl(oot)
preds = learn.get_preds()

However, preds is then a tuple of two torch tensors, each holding 2000 entries.
But I expected preds to have 10000 elements, each a prediction of 0 or 1.

You need to pass that DataLoader to get_preds so fastai knows to use it; otherwise it defaults to the validation set, which would explain the 2000 rows :wink:

So:

preds = learn.get_preds(dl=dl_test)

Also, you won’t receive 0s and 1s directly. You will receive the raw probabilities of each row being in class 0 or class 1. To get the 0 or 1, take the argmax of those predictions like so:

preds = preds.argmax(dim=1)
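To make the argmax step concrete, here is a small sketch using plain PyTorch. The tuple below is a hypothetical stand-in for what get_preds gives back on an unlabeled test set (probabilities first, targets second), so the values are made up for illustration:

```python
import torch

# Hypothetical stand-in for the get_preds output on an unlabeled test set:
# a (probabilities, targets) tuple, where targets is None.
preds = (
    torch.tensor([[0.9, 0.1],   # row 0: most likely class 0
                  [0.2, 0.8],   # row 1: most likely class 1
                  [0.4, 0.6]]), # row 2: most likely class 1
    None,
)

# Unpack the tuple first; argmax is a tensor method, not a tuple method.
probs, targets = preds

# Pick the index of the highest probability in each row.
labels = probs.argmax(dim=1)
print(labels)  # tensor([0, 1, 1])
```

Unpacking the tuple before calling argmax avoids an AttributeError, since the tuple itself has no argmax method.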

Awesome, that solved everything.
I just had to use: preds = preds[0].argmax(dim=1)
because get_preds returns a tuple with two elements: preds[0] contains the class predictions and preds[1] is None.

Correct. Generally (if your test set included labels) preds[1] would be your labels. Since there are none, the default is just None (in v1 this was all zeros, which caused people some headaches :slight_smile: )
