As practice to complement the Rossmann fastai lecture, I'm taking part in the Kaggle House Prices competition. I have created an ItemList to feed the data:
data = (TabularList.from_df(df, path=path, cat_names=cat_vars, cont_names=cont_vars, procs=procs)
        #.split_by_rand_pct(valid_pct=0.3)
        .split_none()
        .label_from_df(cols=dep_var, label_cls=FloatList, log=True)
        .databunch()
       )
I can train the model and everything works reasonably well (I did some testing with a validation split, obviously). However, I am struggling to predict the values for the test set. I have done:
test_data = TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=cont_vars, procs=data.processor)
print(len(test_data))
preds = learn.get_preds(test_data)
print(len(preds[0]))
which prints 1459 and 1458. I do not understand two things:
- Why is there one fewer prediction than there are rows in the test set?
- Why do I get back two tensors from get_preds? I have read the documentation and it seems that the first holds the predictions and the second the targets, although no targets are available for test_data.
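A related point I am unsure about: because the labels are created with log=True, I assume the raw predictions come back in log space and would need to be exponentiated before going into a submission. A minimal sketch of that step, using made-up values in place of preds[0] (the names log_preds and to_prices are hypothetical and not part of fastai):

```python
import math

# Made-up log-space values standing in for preds[0]; not real model output.
log_preds = [12.0, 12.5, 11.8]

def to_prices(log_values):
    """Undo a natural-log transform on the labels (assumed from log=True)."""
    return [math.exp(v) for v in log_values]

prices = to_prices(log_preds)
```

If that assumption is wrong and the predictions are already on the original price scale, this step should be skipped.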
Also, if you have any idea of how to solve this, I’d be super interested. Thanks!