What is the reason behind adding 1 to the category size while creating cat_sz?


(Yeshwanth Reddy) #1

In the lesson3-rossman notebook, we have a couple of lines of code that create the (category-size, embedding-size) pairs:

cat_sz = [(c, len(joined_samp[c].cat.categories)+1) for c in cat_vars]  
emb_szs = [(c, min(50, (c+1)//2)) for _,c in cat_sz]

What is the reason behind adding 1 in [(c, len(joined_samp[c].cat.categories)+1) for c in cat_vars]?


(Yeshwanth Reddy) #2

I’m guessing index 0 is reserved for unknown or missing levels, but I’m still not sure.
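
Here is a minimal sketch of that guess in plain Python. It is not the notebook's own code: the names `categories`, `column` and `encode` are made up for illustration, and the idea that index 0 is the slot for missing or unseen values is the assumption being tested. It is at least consistent with pandas, where `.cat.codes` reports missing values as -1, so shifting every code up by one would put them at 0. Under that assumption the embedding needs one more row than there are categories, which would explain the +1.

```python
# Hypothetical toy data: three known levels, plus a missing value (None)
# and a level never seen when the categories were built ("z").
categories = ["a", "b", "c"]          # stands in for joined_samp[c].cat.categories
column = ["b", None, "a", "c", "z"]

def encode(value):
    # Assumption: known levels map to 1..len(categories),
    # anything missing or unknown falls into the reserved index 0.
    return categories.index(value) + 1 if value in categories else 0

codes = [encode(v) for v in column]

# Same arithmetic as the two notebook lines above.
n_rows = len(categories) + 1          # the +1 makes room for index 0
emb_dim = min(50, (n_rows + 1) // 2)  # embedding width rule from emb_szs

print(codes)            # indices fed to the embedding lookup
print(n_rows, emb_dim)  # embedding matrix shape: (n_rows, emb_dim)
```

Without the +1, the largest code (here 3) would be out of range for an embedding with only `len(categories)` rows.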