I don’t know if this is the best answer, but I don’t think it is right to remove the Normalize processor from procs_nn or to remove cont_nn from the TabularPandas call. We need the ‘saleElapsed’ continuous variable, and we need to normalize it.
I did notice that ‘saleElapsed’ is listed with dtype object:
df_nn_final.dtypes
YearMade int64
ProductSize category
Coupler_System object
fiProductClassDesc object
Hydraulics_Flow object
ModelID int64
saleElapsed object
fiSecondaryDesc object
fiModelDesc object
Enclosure object
Hydraulics object
ProductGroup object
fiModelDescriptor object
Drive_System object
Tire_Size object
SalePrice float64
dtype: object
After I changed ‘saleElapsed’ to int64, I was able to move past TabularPandas without the error.
df_nn_final.dtypes
YearMade int64
ProductSize category
Coupler_System object
fiProductClassDesc object
Hydraulics_Flow object
ModelID int64
saleElapsed int64
fiSecondaryDesc object
fiModelDesc object
Enclosure object
Hydraulics object
ProductGroup object
fiModelDescriptor object
Drive_System object
Tire_Size object
SalePrice float64
dtype: object
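In case it helps anyone, here is a minimal sketch of the kind of conversion I mean. The small DataFrame and its values are made up for illustration (a stand-in for df_nn_final); only the `astype` call on ‘saleElapsed’ is the actual change, and I am assuming the column holds whole-number values that simply got stored as object.

```python
import pandas as pd

# Hypothetical stand-in for df_nn_final; the values are invented for illustration.
df = pd.DataFrame({
    "YearMade": [2004, 1996, 2001],
    # Integers stored with dtype object, mimicking the situation in the listing above.
    "saleElapsed": pd.Series([1163635200, 1080259200, 1077753600], dtype="object"),
    "SalePrice": [11.09, 10.95, 9.21],
})

dtype_before = str(df["saleElapsed"].dtype)

# Cast the column to a numeric dtype so it can be treated as a continuous variable.
df["saleElapsed"] = df["saleElapsed"].astype("int64")

dtype_after = str(df["saleElapsed"].dtype)
print(dtype_before, "->", dtype_after)
```

If the column contained missing or non-numeric entries, `astype("int64")` would raise an error, so this only applies when every value is a valid integer.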
The rest of the neural networks section ran to completion and gave an r_mse of 0.226128:
preds,targs = learn.get_preds()
r_mse(preds,targs)
0.226128
I’m not sure if this is the correct answer to this problem, but it gives a better result than removing cont_nn, which gives an r_mse of 0.270476:
preds,targs = learn.get_preds()
r_mse(preds,targs)
0.270476
Can someone more experienced weigh in on this? Perhaps @muellerzr?
Thanks,
Jeff