Determining dimensions of embeddings?

I am on lesson 5, and this question is about embeddings!

How do I determine in advance how many latent features are hidden in my categorical variables? In other words, is there a rule of thumb for choosing the dimension of my embeddings?

Thanks,

Hi Milind. Please do a forum search for “embedding size”. There has already been lots of discussion of this question. HTH, Malcolm

fastai applies its own rule of thumb, defined in fastai/tabular/data.py, and handles this automatically. You can also override it:

def emb_sz_rule(n_cat:int)->int: return min(600, round(1.6 * n_cat**0.56))
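
For illustration, here is a small standalone sketch of how that rule plays out for a few example cardinalities. The function body is the one quoted above; the column names and cardinalities are made up, and the cap of 600 kicks in for very high-cardinality columns:

# fastai's heuristic, copied from fastai/tabular/data.py
def emb_sz_rule(n_cat: int) -> int:
    return min(600, round(1.6 * n_cat ** 0.56))

# Hypothetical categorical columns and their cardinalities
for name, n_cat in [("weekday", 7), ("store_id", 1115), ("zip_code", 43000)]:
    print(f"{name}: {n_cat} categories -> embedding size {emb_sz_rule(n_cat)}")

# Output:
#   weekday: 7 categories -> embedding size 5
#   store_id: 1115 categories -> embedding size 81
#   zip_code: 43000 categories -> embedding size 600

If you want a different size for a particular column, the tabular learner also accepts an emb_szs dict mapping column names to embedding sizes, so you can override the rule per column (check the docs for your fastai version for the exact signature).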

@farid

Got it. Thanks!!