learn.get_preds(DatasetType.Train) for tabular data returning very noisy and incorrect values

ptats · August 27, 2020, 7:20am

I am trying to get predictions on the training dataset for a tabular learner but it is returning very noisy values that seem incorrect. Even the target values are completely wrong. I have no idea what is going on. Can anyone help?

My code is:

preds, target = learner.get_preds(DatasetType.Train)

I have plotted the target values returned by the learner against the true target values from the df:

ptats · August 27, 2020, 7:21am

Ignore the fact that the df values extend past the learner values; I have not filtered out my validation set.

ptats · August 27, 2020, 7:23am

Digging around I saw that this may be related to the training set being shuffled. Is there a way to disable shuffling?

ptats · August 27, 2020, 10:00am

Solved by setting shuffle=False in the databunch.
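For anyone landing here later, this is a minimal, library-free sketch of why shuffling makes the returned targets look like noise when plotted against the df. It does not use fastai at all: collect_targets and iterate_batches are hypothetical helpers that only mimic a loader handing back rows batch by batch, and the ten-row target list is made-up illustration data.

```python
import random

def iterate_batches(indices, batch_size):
    """Yield consecutive slices of the index list, one per batch."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]

def collect_targets(targets, batch_size, shuffle, seed=0):
    """Mimic a data loader: return targets in the order the batches visit them.

    Also returns the visiting order so the caller can undo the shuffle.
    """
    order = list(range(len(targets)))
    if shuffle:
        random.Random(seed).shuffle(order)  # seeded so the run is repeatable
    collected = []
    for batch in iterate_batches(order, batch_size):
        collected.extend(targets[i] for i in batch)
    return collected, order

# Made-up targets standing in for the df column, in row order.
df_targets = [float(i) for i in range(10)]

shuffled, order = collect_targets(df_targets, batch_size=4, shuffle=True)
in_order, _ = collect_targets(df_targets, batch_size=4, shuffle=False)

# Undo the shuffle: the value at position pos belongs to row order[pos].
restored = [None] * len(shuffled)
for pos, row in enumerate(order):
    restored[row] = shuffled[pos]

print("shuffled:", shuffled)
print("in order:", in_order)
print("restored:", restored)
```

The shuffled list holds exactly the same values as the df, just in a different row order, which is why the plot looks wrong even though nothing is corrupted. Turning shuffling off, or mapping the results back through the visiting order, lines them up with the df again.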

Kornel · August 27, 2020, 12:28pm

I did not find DatasetType in the fastai2 repo. Try:

preds, target = learner.get_preds(ds_idx=0)

ptats · August 28, 2020, 12:37am

I am using fastai version 1. Sorry, I should have mentioned that.

muellerzr · August 28, 2020, 1:01am

In v2, to get the predictions back in the original order you would do learn.get_preds(ds_idx=0, reorder=True), which undoes the shuffling (I think that’s the default? Just exposing the param).