09_tabular: ValueError: Unable to coerce to Series, length must be 1: given 0

Same issue here!
Removing Normalize from the procs_nn list seems to “fix” the issue.

The good thing is that it doesn’t look like user error on our part. The trouble is that we need normalization for a neural network.

I’ve been playing with tabular for a while and have never faced this issue before. I can try running the fastbook version to see what’s up, but I wouldn’t rule out user error to some degree (or an error in the book).

Encountering the same error. It works if you leave out the cont_nn variable. Will continue digging into this…

I don’t know if this is the best answer, but I don’t think it is right to remove the Normalize processor from procs_nn or to remove cont_nn from the TabularPandas call. We need the ‘saleElapsed’ continuous variable and we need to normalize it.

I did notice that ‘saleElapsed’ has dtype object:

df_nn_final.dtypes

YearMade                 int64
ProductSize           category
Coupler_System          object
fiProductClassDesc      object
Hydraulics_Flow         object
ModelID                  int64
saleElapsed             object
fiSecondaryDesc         object
fiModelDesc             object
Enclosure               object
Hydraulics              object
ProductGroup            object
fiModelDescriptor       object
Drive_System            object
Tire_Size               object
SalePrice              float64
dtype: object

After I changed ‘saleElapsed’ to int64, I was able to get past TabularPandas without the error.

df_nn_final.dtypes

YearMade                 int64
ProductSize           category
Coupler_System          object
fiProductClassDesc      object
Hydraulics_Flow         object
ModelID                  int64
saleElapsed              int64
fiSecondaryDesc         object
fiModelDesc             object
Enclosure               object
Hydraulics              object
ProductGroup            object
fiModelDescriptor       object
Drive_System            object
Tire_Size               object
SalePrice              float64
dtype: object
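For anyone who wants to see the conversion in isolation, here is a minimal sketch. It assumes a toy stand-in for df_nn_final with made-up values, not the real bulldozers data, and only shows the dtype change itself:

```python
import pandas as pd

# Toy stand-in for df_nn_final; the values are invented for illustration.
df_nn_final = pd.DataFrame({
    "YearMade": [2004, 1996, 2001],
    # Force the object dtype to mimic what the thread observed.
    "saleElapsed": pd.Series([1163635200, 1080259200, 1077753600], dtype="object"),
    "SalePrice": [11.10, 10.95, 9.21],
})

print(df_nn_final["saleElapsed"].dtype)  # object

# Convert the column to a numeric dtype so it can be normalized.
df_nn_final["saleElapsed"] = df_nn_final["saleElapsed"].astype("int64")

print(df_nn_final["saleElapsed"].dtype)  # int64
```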

The rest of the neural networks section ran to completion and gave an r_mse of 0.226128:

preds,targs = learn.get_preds()
r_mse(preds,targs)
0.226128

I’m not sure if this is the correct answer to this problem, but it gives a better result than removing cont_nn, which gives an r_mse of 0.270476:

preds,targs = learn.get_preds()
r_mse(preds,targs)
0.270476

Can someone more experienced weigh in on this? Perhaps @muellerzr?

Thanks,
Jeff

That does indeed make perfect sense. Great debugging @jeffchen72! Everything works by integrating closely with pandas (hence TabularPandas), so if a continuous column does not have a numeric dtype, it will break on Normalize (which we could expect).
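One way to catch this before calling TabularPandas is to check the dtypes of the continuous columns up front. This is only a sketch using plain pandas: the helper name non_numeric_columns and the toy frame are made up for illustration and are not part of fastai.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def non_numeric_columns(df, cont_names):
    """Return the names in cont_names whose dtype is not numeric (hypothetical helper)."""
    return [name for name in cont_names if not is_numeric_dtype(df[name])]


# Toy frame with invented values; saleElapsed is deliberately stored as object.
df = pd.DataFrame({
    "saleElapsed": pd.Series([1163635200, 1080259200], dtype="object"),
    "YearMade": [2004, 1996],
})

bad = non_numeric_columns(df, ["saleElapsed", "YearMade"])
print(bad)  # ['saleElapsed']
```

Any column this reports would need to be converted (for example with astype) before it is passed as a continuous variable.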

Thanks for confirming this, @muellerzr. Maybe I can propose my fix to 09_tabular as my first PR.

That would be a fantastic idea!

Great job @jeffchen72! I was certainly not suggesting that we remove cont_nn, simply that it was related to the error.

As a side note, I got an r_mse of 0.224892 by dropping fiModelDesc (which has 5K+ cardinality) instead of fiModelDescriptor.

Thanks @porich for sharing your results. This is a great notebook. I learned a lot from it.

Jeff

This actually works, thanks @jeffchen72

This may seem like an obvious question, but how did you convert to int64? I have been trying to use .astype(int), both on the column and in a for loop over each value in the column, but it isn’t working. Is there another method I’m missing?

train['saleElapsed'] = train['saleElapsed'].astype('int')

One remark here, for anyone else who gets this error message:

My solution was to add copy() when creating the df_nn_final DataFrame.

I changed this line:

df_nn_final = df_nn[list(xs_final_time.columns) + [dep_var]]

to:

df_nn_final = (df_nn[list(xs_final_time.columns) + [dep_var]]).copy()

I found this solution at: https://stackoverflow.com/questions/49728421/pandas-dataframe-settingwithcopywarning-a-value-is-trying-to-be-set-on-a-copy
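A small self-contained sketch of the same idea, assuming a toy df_nn with made-up values and a hypothetical column list in place of xs_final_time.columns. Taking an explicit copy makes df_nn_final an independent frame, so the later dtype assignment applies to it and leaves df_nn untouched:

```python
import pandas as pd

# Toy stand-in for df_nn; the values are invented for illustration.
df_nn = pd.DataFrame({
    "YearMade": [2004, 1996],
    "saleElapsed": pd.Series([1163635200, 1080259200], dtype="object"),
    "MachineID": [1, 2],
    "SalePrice": [11.10, 10.95],
})

final_cols = ["YearMade", "saleElapsed"]  # hypothetical stand-in for xs_final_time.columns
dep_var = "SalePrice"

# Select the columns and take an explicit copy of the result.
df_nn_final = df_nn[final_cols + [dep_var]].copy()

# Assigning into the copy is unambiguous and does not touch df_nn.
df_nn_final["saleElapsed"] = df_nn_final["saleElapsed"].astype("int64")

print(df_nn_final.dtypes)
```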

Changing to int, as below, did not work for me:
df_nn['saleElapsed'] = df_nn['saleElapsed'].astype(int)

but changing to float did:
df_nn_final = df_nn_final.astype({"saleElapsed": float})

Thx! Worked for me as well.

  1. In your first one you are using df_nn instead of df_nn_final. But fixing that still gives an error.
  2. In your second one you can replace float with int and it will work.
  1. That was a mistake when I wrote the comment! It was supposed to be df_nn_final. Yes, that line still throws an error, but using copy() as per @ulat’s comment seems to work. I didn’t test it though.

  2. Yes, it works for both ‘int’ and ‘float’!
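To illustrate the second point, here is a minimal sketch of the dict form of DataFrame.astype on a toy frame with made-up values. It returns a new DataFrame rather than assigning into a column, and it accepts either target type:

```python
import pandas as pd

# Toy stand-in for df_nn_final; the values are invented for illustration.
df_nn_final = pd.DataFrame({
    "saleElapsed": pd.Series([1163635200, 1080259200], dtype="object"),
    "SalePrice": [11.10, 10.95],
})

# astype with a {column: dtype} mapping returns a new DataFrame.
as_float = df_nn_final.astype({"saleElapsed": float})
as_int = df_nn_final.astype({"saleElapsed": int})

print(as_float["saleElapsed"].dtype)  # float64
print(as_int["saleElapsed"].dtype)    # an integer dtype (width can depend on platform)
```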

Did this change ever get merged in?

Thanks, I tried it and it works for me.