I am creating a simulation tool. I started with 50k random simulations, which I used to fit the model. Now I want to run the simulation based on the model's prediction at each iteration (I get between 2 and 13 possible scenarios to predict per simulation).
I tried two approaches: looping over `predict()`, and building a new dataset and predicting via `get_preds()`.
My dataframe has ~50 columns and the procs are just `FillMissing` and `Categorify`, so nothing large.
The `get_preds()` option takes approx. 20 seconds (the dataframe contains 8 rows):
```python
data_test = (TabularList.from_df(poss_df, cat_names=cat_var, cont_names=cont_var, procs=procs)
             .split_none()
             .label_from_df(cols=dep_var))
data_test.valid = data_test.train
data_test = data_test.databunch(bs=64)
learn.data.valid_dl = data_test.valid_dl
learn.model.eval()
preds, y = learn.get_preds(ds_type=DatasetType.Valid)
```
The loop version takes 11 seconds for 100 iterations, which still feels like a lot:
```python
t = learn.predict(poss_df.iloc[0, :])
```
My machine is nothing special, but I am running on a GeForce 940M GPU.
Do you have advice on how to best handle the prediction of small batches?
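For context, my rough mental model of the problem: 100 `predict()` calls at ~110 ms each suggests a fixed per-call setup cost dominates when batches are this small, and one batched call should amortize it. Here is a toy sketch of that reasoning only (no fastai involved; `sim_predict` and its overhead constant are made-up illustrations, not measurements of the real model):

```python
import time

PER_CALL_OVERHEAD = 0.01  # hypothetical fixed setup cost per call, in seconds

def sim_predict(rows):
    """Stand-in for a model call: fixed overhead plus trivial per-row work."""
    time.sleep(PER_CALL_OVERHEAD)
    return [r * 2 for r in rows]  # dummy "prediction"

rows = list(range(8))  # 8 rows, like the dataframe above

# Per-row loop: pays the fixed overhead once per row
start = time.perf_counter()
loop_out = [sim_predict([r])[0] for r in rows]
loop_time = time.perf_counter() - start

# One batched call: pays the fixed overhead once
start = time.perf_counter()
batch_out = sim_predict(rows)
batch_time = time.perf_counter() - start

assert loop_out == batch_out  # same results, very different cost
print(f"loop: {loop_time:.3f}s, batch: {batch_time:.3f}s")
```

If this model is right, the question becomes how to get a small batch through the preprocessing and into the model without rebuilding a `DataBunch` each time.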