House prices - index out of range

I was trying to build a model using structured data for the House Prices competition.

The model works on training data, but it seems there is a problem with my dataset, since when I run:

pred_test=model_learner.predict(is_test=True)

I get:

RuntimeError: index out of range at /Users/soumith/code/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:277

The full code (with the error message) is here.
I based it on the lesson_3 Rossmann notebook.

I got the same error on the training data earlier, because I had removed the +1 from

categories_sizes = [(c, len(train_df[c].cat.categories)+1) for c in categorical_variables]
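A minimal sketch of why the +1 may matter, assuming (as the Rossmann-style pipeline appears to do) that category codes are shifted up by one so that 0 can stand for missing or unknown values. The "embedding table" here is just a plain Python list, and all names are hypothetical:

```python
# Hypothetical illustration: an "embedding table" as a plain list of rows.
categories = ["Grvl", "Pave"]  # categories seen in the training data


def encode(value):
    # Assumption: codes are shifted by +1 so that 0 means missing/unknown.
    return categories.index(value) + 1 if value in categories else 0


def make_table(n_rows, dim=2):
    # One row of zeros per possible code.
    return [[0.0] * dim for _ in range(n_rows)]


codes = [encode(v) for v in ["Grvl", "Pave", "Dirt"]]

too_small = make_table(len(categories))       # valid indices: 0..1
big_enough = make_table(len(categories) + 1)  # valid indices: 0..2

rows = [big_enough[c] for c in codes]  # every code has a row

try:
    [too_small[c] for c in codes]
    lookup_failed = False
except IndexError:
    lookup_failed = True  # the highest code has no row in the smaller table
```

With the table sized to the number of categories only, the highest code falls off the end, which is the same kind of failure as the "index out of range" above.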

Does anyone have any idea what might be the reason for this?

Same problem here, any idea?

Same problem …

Found the Issue.
Seems like something is wrong with the rmse function.
It worked for me when I changed it to:

import math

def rmse(x,y):
    # root mean squared error between predictions x and targets y
    return math.sqrt(((x-y)**2).mean())

I was stuck with this too. In my case the issue was that after converting the user ids to the range 0 to n-1, I wrote the result to a different variable but kept using the old one :frowning:
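A rough sketch of that kind of remapping, with made-up ids and hypothetical variable names, in case it helps someone spot the same mix-up:

```python
# Hypothetical raw ids: arbitrary integers, not contiguous.
raw_user_ids = [1001, 42, 1001, 7]

# Map each distinct id to a contiguous index in 0..n-1.
id_to_index = {uid: i for i, uid in enumerate(sorted(set(raw_user_ids)))}
user_idx = [id_to_index[uid] for uid in raw_user_ids]

n_users = len(id_to_index)

# The remapped values are safe to use as embedding indices...
remapped_ok = all(0 <= i < n_users for i in user_idx)

# ...while the original ids are not, so passing the old variable
# to the model by accident leads to an out-of-range lookup.
raw_ok = all(0 <= uid < n_users for uid in raw_user_ids)
```

Here `user_idx` is what should go into the model; `raw_user_ids` is the "older" variable that causes the error.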

In my case the problem was that I encoded the categorical features in the test set differently from how I encoded them in the train set. In other words, I did this:

for v in cat_cols: train_df[v] = train_df[v].astype('category').cat.as_ordered()
for v in cat_cols: test_df[v] = test_df[v].astype('category').cat.as_ordered()

and that was the mistake. The solution was to convert only the train set to categorical and then call the apply_cats() function so the test set reuses the train set's categories:

for v in cat_cols: train_df[v] = train_df[v].astype('category').cat.as_ordered()
apply_cats(test_df, train_df)
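
To show the difference, here is a small sketch using plain pandas. The column name, the values, and the `align_categories` helper are all made up for illustration; the helper only mimics what I understand apply_cats() to do (reuse the train categories) and is not the library's actual code:

```python
import pandas as pd

train_df = pd.DataFrame({"Street": ["Grvl", "Pave", "Pave"]})
test_df = pd.DataFrame({"Street": ["Pave", "Dirt", "Sand"]})

train_col = train_df["Street"].astype("category").cat.as_ordered()

# Mistake: the test set builds its own category list, so the integer
# codes no longer mean the same thing as in the train set.
test_independent = test_df["Street"].astype("category").cat.as_ordered()


def align_categories(test_col, train_col):
    # Hypothetical helper: encode test values with the train categories.
    # Values not seen in training get pandas' "missing" code, -1.
    return pd.Categorical(test_col, categories=train_col.cat.categories,
                          ordered=True)


test_aligned = align_categories(test_df["Street"], train_col)

train_codes = train_col.cat.codes.tolist()
independent_codes = test_independent.cat.codes.tolist()
aligned_codes = test_aligned.codes.tolist()
```

With independent encoding, "Pave" happens to keep its code but "Sand" gets a code larger than anything in the train set, which an embedding sized from the train categories cannot look up. With the aligned encoding, unseen values become -1 instead of spilling past the end.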