I have test data:

    test1 = TabularList.from_df(test, cat_names=cat_names, cont_names=cont_names)

and a databunch with train and test data:

    data1 = (TabularList.from_df(train.reset_index().drop('index',axis=1).iloc[0:1000], cat_names=cat_names, cont_names=cont_names, procs=procs)
             .random_split_by_pct(0.33)
             .label_from_df(cols = 'age')
             .add_test(test1, label='age')
             .databunch())
I made a learner as in lesson 4 but couldn't work out how to run inference on all of the test data. learn.get_preds() returns predictions for the validation data. learn.pred_batch() returns predictions for one batch of the validation data. learn.predict(test.iloc[0]) returns a prediction for a single row only, and throws an error when I pass it a slice.
Surprisingly, learn.get_preds(test1) returns predictions for the validation data again.
Of course I can predict row by row in a loop (which is very slow!), but surely there is a faster and better way?
It seems that the correct code is learn.get_preds(DatasetType.Test)
but now I get the error TypeError: batch must contain tensors, numbers, dicts or lists; found <class 'NoneType'>
data1.show_batch(rows = 5, ds_type=DatasetType.Test) gives a reasonable result, so why is learn.get_preds(DatasetType.Test) not working? What can I change?
Replying to myself.
get_preds did not work on my data because I have a multiclass classification problem, not a binary one.
It seems to me that, for now, fast.ai is not suitable for multiclass problems.
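For reference, when batch prediction does work on a multiclass model, the result is usually one row of per-class scores per sample rather than a single label. The sketch below shows one way to turn such rows into class names. It is plain Python and does not call fastai; it assumes the scores have already been converted to nested lists (for example with `.tolist()` on a tensor) and that the class names are known in the same column order. The function name `probs_to_labels` and the example values are made up for illustration.

```python
from typing import List, Sequence


def probs_to_labels(probs: Sequence[Sequence[float]],
                    classes: Sequence[str]) -> List[str]:
    """Return, for each row of per-class scores, the name of the highest-scoring class."""
    labels = []
    for row in probs:
        if len(row) != len(classes):
            raise ValueError("each row needs exactly one score per class")
        # Index of the largest score in this row (argmax).
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels


# Hypothetical scores for three samples and three made-up classes.
example_probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]
example_classes = ["young", "middle", "old"]

labels = probs_to_labels(example_probs, example_classes)
print(labels)
```

Ties are resolved in favour of the first class with the maximum score, which is a choice of this sketch and not something taken from the library.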
I loop over each row of my new test dataset, following the example here. It takes a long time and is probably not very efficient.
Is there a better way to run predictions on an entire dataframe? I mean a new dataframe that was not part of the original train/validation/test sets.
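The general remedy for a slow row-by-row loop is to hand the model many rows per call. Within fastai, the route mentioned earlier in the thread is to attach the new data as a test set and call learn.get_preds(DatasetType.Test). The sketch below only illustrates the batching idea itself in plain Python, without calling fastai. `predict_batch` is a hypothetical callable that takes a list of rows and returns one prediction per row, and `fake_predict_batch` is a stand-in invented for this example, not a real model.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(rows: Sequence[dict], batch_size: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of `rows`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def predict_all(rows: Sequence[dict],
                predict_batch: Callable[[Sequence[dict]], List[float]],
                batch_size: int = 64) -> List[float]:
    """Call `predict_batch` once per batch and collect results in input order."""
    results: List[float] = []
    for batch in iter_batches(rows, batch_size):
        results.extend(predict_batch(batch))
    return results


# Hypothetical stand-in for a model: doubles the "x" value of every row.
def fake_predict_batch(batch: Sequence[dict]) -> List[float]:
    return [row["x"] * 2 for row in batch]


rows = [{"x": i} for i in range(10)]
predictions = predict_all(rows, fake_predict_batch, batch_size=4)
print(predictions)
```

With 10 rows and a batch size of 4, the stand-in model is called three times instead of ten; with a real model the saving comes from amortising per-call overhead over each batch.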