How to predict on the test set in the Kaggle House Prices competition

PabloMC · August 23, 2020, 8:49pm

As practice to complement the Rossmann fastai lecture, I’m taking part in the Kaggle House Prices competition. I have created an ItemList for feeding the data:

data = (TabularList.from_df(df, path=path, cat_names=cat_vars, cont_names=cont_vars, procs=procs)
        #.split_by_rand_pct(valid_pct=0.3)
        .split_none()
        .label_from_df(cols=dep_var, label_cls=FloatList, log=True)
        .databunch())


I can train the model and everything works relatively well (I did some testing with a validation split, obviously). However, I am struggling to predict the values for the test set. I have done:

test_data = TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=cont_vars, procs=data.processor)
print(len(test_data))
preds = learn.get_preds(test_data)
print(len(preds[0]))

which prints 1459 and 1458. There are two things I do not understand:

Why is there one fewer prediction than there are rows in the test set?
Why do I get back two tensors from get_preds? I have read the documentation, and it seems that the first holds the predictions and the second the targets, but no targets are available for test_data.
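
To make the second question concrete, here is a minimal sketch of how I currently understand the return value, written in plain Python rather than fastai. The nested lists stand in for the two tensors, the numbers are made up, and `to_prices` is a hypothetical helper. It also assumes that, because the labels were created with `log=True`, the predictions come back in log space and need `exp` to become prices; I have not confirmed that this matches what fastai does.

```python
import math

# Stand-in for the (predictions, targets) pair described in the docs.
# Made-up values: three log-space predictions, one per row, plus
# placeholder targets, since a test set has no real labels.
log_preds = [[12.0], [12.5], [11.8]]
placeholder_targets = [0, 0, 0]
result = (log_preds, placeholder_targets)

def to_prices(result):
    """Unpack the pair, ignore the targets, and undo the log transform."""
    preds, _targets = result
    # Each prediction is a one-element row, so take its only entry.
    return [math.exp(row[0]) for row in preds]

prices = to_prices(result)
print(len(prices))
```

If this reading is right, only the first element of the pair matters for a submission, and the second can be ignored for unlabeled data.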

Also, if you have any idea of how to solve this, I’d be super interested. Thanks!