TabularDataLoaders.test_dl can't be created in place

pl3 · September 4, 2020, 6:10pm

The most straightforward way I know to score a new dataframe is to create a test_dl, which runs the data through the procs, and then pass it to learner.get_preds():

dl = learner.dls.test_dl(df, device="cuda")
preds = learner.get_preds(dl=dl)[0].numpy()

However, when working with large pandas dataframes, RAM can become a bottleneck. I realized that learner.dls.test_dl() doesn’t accept inplace=True for the new dataset, so it ends up making a new copy of the dataframe.

Is there another straightforward way of processing and scoring a dataframe that doesn’t make a copy?
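
One possible workaround, sketched under assumptions: this does not avoid the copy, but scoring the dataframe slice by slice should keep any copy limited to one chunk at a time. The helper names chunk_bounds and score_in_chunks, the score_chunk callback, and the default chunk size are all hypothetical and not part of fastai. The commented usage at the bottom reuses the test_dl and get_preds calls from above and has not been run here.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def chunk_bounds(n_rows: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges that cover n_rows in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


def score_in_chunks(df, score_chunk: Callable[..., Iterable], chunk_size: int = 100_000) -> List:
    """Score df one positional slice at a time and return a flat list of per-row results.

    df only needs len() and .iloc[start:stop], as a pandas DataFrame provides.
    score_chunk is a hypothetical callback that returns one result per row.
    """
    preds: List = []
    for start, stop in chunk_bounds(len(df), chunk_size):
        chunk = df.iloc[start:stop]        # only this slice is handed to the scorer
        preds.extend(score_chunk(chunk))   # keep the predictions, drop the chunk
    return preds


# Hypothetical usage with the learner from this post (assumes fastai and a GPU):
#
# def score_chunk(chunk):
#     dl = learner.dls.test_dl(chunk, device="cuda")
#     return learner.get_preds(dl=dl)[0].numpy()
#
# preds = score_in_chunks(df, score_chunk, chunk_size=100_000)
```

The trade-off is some per-chunk overhead from rebuilding the test_dl each time, so the chunk size would need tuning against available memory.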

muellerzr · September 4, 2020, 7:38pm

Can you open an issue for this on GitHub? It seems like a pretty good idea to me.

pl3 · September 4, 2020, 9:15pm

Thanks for the suggestion. I’ve opened an issue with a reference to this post.