I am starting a separate thread for a question originally posted on the Lesson 1 wiki, as I think it deserves its own discussion. Reproducing the question as-is:
While looking at the pandas documentation, I came across a function called “get_dummies”, which can convert categorical values to indicator/dummy variables.
I ran it on the bulldozer dataset and the output is similar to “one hot encoding”.
So I am wondering: which of the two is the better approach, using train_cats to extract category codes or using get_dummies?
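
To make the comparison concrete, here is a minimal sketch of the two encodings on a toy column. The data is made up for illustration (it is not taken from the bulldozer dataset), and the category-code half uses plain pandas (`astype("category")` and `.cat.codes`) as a stand-in for what I understand train_cats to set up, so treat that part as an assumption rather than the library's actual code.

```python
import pandas as pd

# Hypothetical toy data; the column name and values are illustrative only.
df = pd.DataFrame({"UsageBand": ["Low", "High", "Medium", "Low"]})

# Approach 1: integer category codes (assumed to approximate the
# train_cats route). Categories are sorted lexically by default:
# High -> 0, Low -> 1, Medium -> 2.
codes = df["UsageBand"].astype("category").cat.codes

# Approach 2: one indicator column per category level.
dummies = pd.get_dummies(df["UsageBand"], prefix="UsageBand")

print(codes.tolist())
print(list(dummies.columns))
print(dummies.shape)
```

The codes keep the feature in a single column but impose an ordering on the levels (alphabetical here, unless the categories are reordered explicitly), while get_dummies adds one column per level and implies no ordering, so the width of the frame grows with the cardinality of the column.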