Fastest way to get tabular data predictions (small batches)

Hi,
I am creating a simulation tool. I started with 50k random simulations, fitted the model on them, and now want to run the simulation based on the prediction at each iteration (each simulation produces between 2 and 13 possible scenarios to predict).
I tried both looping over learn.predict() and building a new set and predicting via get_preds().

My dataframe has ~50 columns and the procs are FillMissing and Categorify - so nothing large.
The get_preds() option takes approx. 20 seconds (the dataframe contains 8 rows).

data_test = (TabularList.from_df(poss_df, cat_names=cat_var, cont_names=cont_var, procs=procs)
                        .split_none()
                        .label_from_df(cols=dep_var))
data_test.valid = data_test.train
data_test = data_test.databunch(bs=64)
learn.data.valid_dl = data_test.valid_dl
learn.model.eval()
preds, y = learn.get_preds(ds_type=DatasetType.Valid)

The loop version takes 11 seconds for 100 iterations - which still feels like a lot:

t = learn.predict(poss_df.iloc[0, :])
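My understanding is that the per-call overhead dominates here, not the model itself. A toy numpy sketch (a hypothetical dummy "model" that is just a matmul, not fastai code) shows the pattern: one call per row vs. one batched call over all rows produces identical results, but the loop pays the fixed call cost 100 times.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((50, 2))    # dummy "model": 50 features -> 2 outputs
X = rng.standard_normal((100, 50))  # 100 candidate rows to predict

# one call per row (analogous to looping learn.predict)
single = np.vstack([X[i:i+1] @ W for i in range(len(X))])

# one batched call (analogous to get_preds over a databunch)
batched = X @ W

# same predictions either way; only the per-call overhead differs
assert np.allclose(single, batched)
```

So ideally I'd like one cheap batched call per simulation step, without rebuilding the whole databunch each time.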

My machine is nothing special, but I am running on a GeForce 940M GPU.

Do you have advice on how to best handle the prediction of small batches?