One Hot Encoding Chapter 6

KevinB · May 15, 2020, 12:59am

In the past I’ve seen one-hot encoding represented as a single high value with every other value set to 0, but I noticed that chapter 6 of the fastbook explains it as a list of zeros with a one in every position where a category is present.

Is this actually considered one-hot encoding since there is more than one high value or would this be a different type of encoding?

Wikipedia mentions that having multiple high values in a list is called a dummy variable rather than a one-hot encoding.

A one-hot encoding is important for single-label items to remove any appearance of ordinality where none exists. Dummy variables are similar but can have multiple high values representing the existence of each item.
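To make the distinction concrete, here is a minimal sketch in plain Python. The category list and the helper names `one_hot` and `multi_hot` are made up for illustration and are not taken from the fastbook or fastai; "multi-hot" is just a common informal name for the multi-label variant.

```python
# Hypothetical vocabulary, chosen only for illustration.
CATEGORIES = ["cat", "dog", "person"]

def one_hot(label, categories=CATEGORIES):
    """Single-label encoding: exactly one position is 1."""
    if label not in categories:
        raise ValueError(f"unknown label: {label!r}")
    return [1 if c == label else 0 for c in categories]

def multi_hot(labels, categories=CATEGORIES):
    """Multi-label encoding: a 1 for every category that is present."""
    present = set(labels)
    unknown = present - set(categories)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in present else 0 for c in categories]

print(one_hot("dog"))                 # [0, 1, 0]
print(multi_hot(["cat", "person"]))   # [1, 0, 1]
print(multi_hot([]))                  # [0, 0, 0]
```

A one-hot vector in the strict sense always sums to 1, while the multi-label vector can sum to anything from 0 to the number of categories.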

DanielLam · May 15, 2020, 2:37pm

“Is this actually considered one-hot encoding since there is more than one high value or would this be a different type of encoding?”

In the conventional sense, it isn’t. One-hot encoding is what you described in your first paragraph, and the term has a long history in electronics.

bopoort · October 17, 2023, 5:21pm

Thanks for asking, @KevinB! This really threw me off too. Since the book uses “one-hot encoding” with a different meaning, I would recommend adding an explanation there. Thanks for confirming, @DanielLam.