Error in Chapter 9 lesson 7

Mohammed01 · September 1, 2020, 1:04pm

I ran all the code for the chapters in Colab.
The lesson 7 (chapter 9) notebook has just one error.
I searched Stack Overflow for a fix but couldn’t find one,
and rerunning all the code gives the same error.
Has anyone else seen this error in this chapter?

xs_filt2 = xs_filt.drop('fiModelDescriptor', axis=1)
valid_xs_time2 = valid_xs_time.drop('fiModelDescriptor', axis=1)
m2 = rf(xs_filt2, y_filt)
m_rmse(m, xs_filt2, y_filt), m_rmse(m2, valid_xs_time2, valid_y)

Returns

ValueError                                Traceback (most recent call last)
<ipython-input-110-3c2713eabff4> in <module>()
      2 valid_xs_time2 = valid_xs_time.drop('fiModelDescriptor', axis=1)
      3 m2 = rf(xs_filt2, y_filt)
----> 4 m_rmse(m, xs_filt2, y_filt), m_rmse(m2, valid_xs_time2, valid_y)
-----------------------------------------------------------------------------------------------
/usr/local/lib/python3.6/dist-packages/sklearn/tree/_classes.py in _validate_X_predict(self, X, check_input)
    389                              "match the input. Model n_features is %s and "
    390                              "input n_features is %s "
--> 391                              % (self.n_features_, n_features))
    392 
    393         return X

ValueError: Number of features of the model must match the input. Model n_features is 15 and input n_features is 14

Sturzgefahr · September 3, 2020, 7:32am

I had the same issue. I believe it may be a typo. Changing the last line to this:

m_rmse(m, xs_filt, y_filt), m_rmse(m2, valid_xs_time2, valid_y) # changed xs_filt2 -> xs_filt

produces an output, but I would need to go through the notebook more thoroughly before I can tell you if it’s the correct output.

Sturzgefahr · September 3, 2020, 7:46am

Someone also noticed this in the MOOC.

Mohammed01 · September 3, 2020, 1:33pm

Thanks, Sturzgefahr. It worked.

Mohammed01 · September 3, 2020, 1:38pm

But after this cell,
the next cell:

cat_nn.remove('fiModelDescriptor')

returns an error:

ValueError                                Traceback (most recent call last)
<ipython-input-105-7422a4dc0a4f> in <module>()
----> 1 cat_nn.remove('fiModelDescriptor')
ValueError: list.remove(x): x not in list

What other edits did you make to this notebook after this error?

Sturzgefahr · September 4, 2020, 4:45am

I didn’t make any other edits. I think you may have run this cell twice:

cat_nn.remove('fiModelDescriptor')

You can only run this command once. If you run it twice, it’ll throw an error, since you’re trying to remove something you’ve already removed. My advice is to reinitialize cat_nn by running this cell again:

cont_nn,cat_nn = cont_cat_split(df_nn_final, max_card=9000, dep_var=dep_var)
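
If you want the cell to be safe to rerun, one option is to guard the removal. This is only a minimal sketch: remove_if_present is a hypothetical helper, not part of the notebook or of fastai, and the list below is a made-up stand-in for cat_nn.

```python
def remove_if_present(names, name):
    """Remove name from the list only if it is there; return True if removed."""
    if name in names:
        names.remove(name)
        return True
    return False

# Stand-in for the notebook's cat_nn list (column names are illustrative).
cat_nn = ['ProductSize', 'fiModelDescriptor', 'Enclosure']

first = remove_if_present(cat_nn, 'fiModelDescriptor')   # removes the entry
second = remove_if_present(cat_nn, 'fiModelDescriptor')  # no-op, no ValueError
print(first, second, cat_nn)
```

Running the guarded version twice leaves the list unchanged the second time instead of raising list.remove(x): x not in list.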

Mohammed01 · September 4, 2020, 8:12am

Thanks again, it worked well

Sturzgefahr · September 4, 2020, 11:37pm

Happy to help.

kklawaa · October 3, 2020, 11:30am

After this cell, the next cell

procs_nn = [Categorify, FillMissing, Normalize]
to_nn = TabularPandas(df_nn_final, procs_nn, cat_nn, cont_nn, splits=splits, y_names=dep_var)

raises the following error in Google Colab:

ValueError Traceback (most recent call last)
in ()
1 procs_nn = [Categorify, FillMissing, Normalize]
----> 2 to_nn = TabularPandas(df_nn_final, procs_nn, cat_nn, cont_nn, splits=splits, y_names=dep_var)

14 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/ops/__init__.py in to_series(right)
    464         if len(left.columns) != len(right):
    465             raise ValueError(
--> 466                 msg.format(req_len=len(left.columns), given_len=len(right))
    467             )
    468         right = left._constructor_sliced(right, index=left.columns)

ValueError: Unable to coerce to Series, length must be 1: given 0

Did anyone see this error in this chapter as well?

chaseos · October 6, 2020, 1:58am

I think the issue is that the continuous column saleElapsed has dtype=object (which you can check with df_nn_final.info()). You can fix this by adding df_nn_final.saleElapsed = df_nn_final.saleElapsed.astype(float).

If you get a SettingWithCopyError, change
df_nn_final = df_nn[list(xs_final_time.columns) + [dep_var]]
(from a few cells earlier) to
df_nn_final = df_nn[list(xs_final_time.columns) + [dep_var]].copy()
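
Here is a small sketch of the same two steps on a toy frame, assuming pandas is installed. The data and the column selection are made up for illustration; only the .copy() and astype(float) calls mirror the fix described above.

```python
import pandas as pd

# Toy frame standing in for df_nn; the values are invented.
df_nn = pd.DataFrame({
    'saleElapsed': pd.Series([1.1e9, 1.2e9, 1.3e9], dtype=object),
    'YearMade': [1998, 2004, 2001],
    'SalePrice': [10.1, 10.6, 10.3],
})

# .copy() gives an independent frame, so assigning a column below
# does not modify (or warn about modifying) a slice of df_nn.
df_nn_final = df_nn[['saleElapsed', 'SalePrice']].copy()

dtype_before = df_nn_final['saleElapsed'].dtype
df_nn_final['saleElapsed'] = df_nn_final['saleElapsed'].astype(float)
dtype_after = df_nn_final['saleElapsed'].dtype
print(dtype_before, '->', dtype_after)
```

Checking the dtype before and after the cast is a quick way to confirm that the column is numeric before passing it on as a continuous variable.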

ballard · October 7, 2020, 6:18am

I think the typo should be corrected as:

m_rmse(m2, xs_filt2, y_filt), m_rmse(m2, valid_xs_time2, valid_y) # changed m -> m2

But I’m still working it through so I might be wrong.

kouohhashi · October 17, 2020, 2:11pm

I have the same error. Have you found a solution?
Never mind.
chaseos is right. My mistake.

Atz · November 5, 2020, 9:01pm

This is correct; you want to change m to m2. This line is supposed to print the RMSE for the training and validation sets on our newly trained random forest (m2), which no longer uses the fiModelDescriptor feature.

(Changing xs_filt2 to xs_filt instead will make the code run, but you’ll be printing the RMSE for the training set on one model and for the validation set on another model, which is probably not what you want to look at.)
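
To make the mismatch concrete, here is a rough sketch on synthetic data, assuming numpy and scikit-learn are installed. The rmse helper is a hypothetical stand-in for the notebook's m_rmse, and the arrays are random, not the bulldozer data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_all = rng.normal(size=(200, 15))        # 15 synthetic features
y = X_all[:, 0] * 2.0 + rng.normal(scale=0.1, size=200)
X_dropped = X_all[:, :-1]                 # same rows, last feature removed

m = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_all, y)
m2 = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_dropped, y)

def rmse(model, X, y):
    """Root mean squared error of model predictions on X (stand-in for m_rmse)."""
    return float(np.sqrt(np.mean((model.predict(X) - y) ** 2)))

# The model trained on 15 features cannot score the 14-feature input.
try:
    rmse(m, X_dropped, y)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# The retrained model can, so both numbers should come from m2.
train_rmse_m2 = rmse(m2, X_dropped, y)
print(mismatch_raised, train_rmse_m2)
```

The exact wording of the ValueError depends on the scikit-learn version, but the cause is the same: each model has to be scored on the feature set it was trained on.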

Atz · November 5, 2020, 9:10pm

This worked for me. Another way to avoid the SettingWithCopyError is to do

df_nn_final = df_nn_final.astype({"saleElapsed": float})