Airbnb challenge similar to Rossmann

(shriram) #1

I was working on the Airbnb challenge on Kaggle. I used the following steps for data cleaning and feature engineering. In the process, I lost all my column names. If anybody could look at these images and tell me where I went wrong, it would be great.

(Theodoros Galanos) #2

It’s a bit easier if you post the notebook itself, so that people can experiment with the output of each cell. Have you tried printing the tables in between the cells? That should give you some info on what is going on at each step.

That said, near the end you are assigning the values of your dataframe to the vals variable. I think that gives you a NumPy array, which carries no column information. Hope that helps.
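
A quick way to check this between cells, as a minimal sketch on a made-up frame (the column names are placeholders, not the ones from your notebook):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the notebook's dataframe; column names are hypothetical.
df = pd.DataFrame({"age": [25, 31, 47], "gender": ["F", "M", "F"]})

vals = df.values  # from here on this is a plain NumPy array

# The dataframe knows its columns; the array has no such attribute.
print(type(df).__name__, list(df.columns))
print(type(vals).__name__, hasattr(vals, "columns"))
```

If the second print reports no columns attribute, that is the step where the names were dropped.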

(shriram) #3

You can find the link to the notebook below. I am really stuck now, so any suggestions on how to move forward would be really helpful. I am particularly having trouble with the categorical and continuous variables.

(Sven) #4

In your notebook you defined x_test in this cell:

piv_train = len(target) #marker
vals = df_all.values
le = LabelEncoder()

x = vals[:piv_train]
y = le.fit_transform(target.values)
x_test = vals[piv_train:]

but since it takes its rows from vals, which comes from df_all.values, it has no column names anymore. Look at this example:
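
A small sketch of the same split on a toy frame. The data and column names are invented for illustration; only piv_train, df_all and vals follow your cell. Slicing the array loses the column names, while slicing the dataframe with .iloc should keep them:

```python
import pandas as pd

# Toy stand-in for df_all: 3 training rows followed by 2 test rows (made-up data).
df_all = pd.DataFrame({
    "age": [25, 31, 47, 52, 38],
    "signup_flow": [0, 3, 0, 12, 3],
})
piv_train = 3  # number of training rows; len(target) in the notebook

# Splitting the array: the rows are right, but the column names are gone.
vals = df_all.values
x_test_array = vals[piv_train:]

# Splitting the dataframe itself: same rows, column names preserved.
x_train_df = df_all.iloc[:piv_train]
x_test_df = df_all.iloc[piv_train:]

print(x_test_array.shape)
print(list(x_test_df.columns))
```

If you need the column names later (for example to tell categorical from continuous variables), it is probably easier to keep the dataframes around and only call .values at the point where a model needs an array.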

(shriram) #5

I have updated some of the code based on the suggestions, but now I am getting a .cat error. You can find the code in the Airbnb update at the link above.