Entity embeddings

johannesstutz · November 16, 2020, 11:54am

Hi everyone, I have three questions regarding embeddings.

1st question
Chapter 8 describes the PyTorch Embeddings layer this way:

it indexes into a vector using an integer, but has its derivative calculated in such a way that it is identical to what it would have been if it had done a matrix multiplication with a one-hot-encoded vector

As we create our own embedding layer in the chapter, the embedding is literally just an array lookup. We don’t take any measures to make sure the derivative is calculated correctly. Why does it still work?
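A small check may help here. This is a minimal sketch, assuming a recent PyTorch install; the names `w_lookup` and `w_onehot` and the toy sizes are made up for illustration and are not from the book. It suggests that autograd already knows how to differentiate plain indexing, so the gradient from an array lookup matches the one from a one-hot matrix multiplication without any extra work:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_levels, n_factors = 5, 3          # toy sizes, chosen only for illustration
idx = torch.tensor([1, 3, 1])       # level 1 appears twice in the batch

# Variant A: the "embedding" is just an array lookup into the weight matrix
w_lookup = torch.randn(n_levels, n_factors, requires_grad=True)
loss_lookup = (w_lookup[idx] ** 2).sum()
loss_lookup.backward()

# Variant B: same weights, but multiply by an explicit one-hot matrix
w_onehot = w_lookup.detach().clone().requires_grad_(True)
one_hot = F.one_hot(idx, num_classes=n_levels).float()
loss_onehot = ((one_hot @ w_onehot) ** 2).sum()
loss_onehot.backward()

# Both variants produce the same gradient for the weight matrix
grads_match = torch.allclose(w_lookup.grad, w_onehot.grad)
print(grads_match)

# Rows that were never indexed (0, 2 and 4) receive a zero gradient
print(w_lookup.grad)
```

In the backward pass of the indexing operation, the incoming gradient is added into the selected rows, which is the same result a one-hot matmul would give, just without building the one-hot matrix.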

2nd question
I’d like to combine entity embeddings with random forests, as suggested in the further research section of chapter 9.

I tried to apply entity embeddings to the bulldozer dataset, see my question here: Lesson 7 official topic. It’s based on this notebook, all the way to the bottom. I don’t understand where the dimensions of the embedding layers come from. They are similar, but not identical, to the number of unique levels in each column.
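One possible explanation for the mismatch, sketched below in plain Python: fastai's tabular preprocessing reserves an extra `#na#` class for each categorical column, so the number of embedding rows tends to be the number of unique levels plus one, and the width comes from the `emb_sz_rule` heuristic. The helper `embedding_shape` is a hypothetical name used only for this illustration, and the "+1" is an assumption about how the column was processed, so it may not account for every difference seen in the notebook.

```python
def emb_sz_rule(n_cat):
    # Heuristic fastai uses to pick the embedding width from the class count
    return min(600, round(1.6 * n_cat ** 0.56))

def embedding_shape(levels):
    # Hypothetical helper: assumes one extra row for the '#na#' placeholder
    n_rows = len(set(levels)) + 1
    return n_rows, emb_sz_rule(n_rows)

# Three distinct levels -> 4 rows (3 levels + '#na#')
shape = embedding_shape(["a", "b", "c", "a"])
print(shape)
```

Comparing `len(set(column)) + 1` against the first dimension of each embedding layer is a quick way to check whether this is what is happening for a given column.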

3rd question
The book mentions an online-only chapter that replicates the Rossmann entity embeddings paper. Has this been published yet?

Thank you for any ideas!

wpan · December 21, 2020, 10:41pm

I wrote a blog post about how to do entity embeddings in fastai v2. I hope it helps with some of your questions. If you notice any mistakes or have suggestions, please let me know. Thanks!

muellerzr · December 21, 2020, 11:15pm

Is there a particular reason why you extracted the layers vs using hooks to grab the output at the embedding layer? For more context:

(I guess one is more raw PyTorch while the other is more fastai like.)
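For readers unfamiliar with the hook approach, here is a minimal sketch in plain PyTorch. `TinyTabular` is a made-up stand-in for a tabular model, not the fastai model or the code from the blog post; the only library call relied on is `register_forward_hook`, which lets you capture a layer's output without pulling the layer out of the model.

```python
import torch
from torch import nn

class TinyTabular(nn.Module):
    # Hypothetical toy model: one embedding followed by a linear layer
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(5, 3)
        self.lin = nn.Linear(3, 1)

    def forward(self, x_cat):
        return self.lin(self.emb(x_cat))

model = TinyTabular()
captured = {}

def save_output(module, inputs, output):
    # Store a detached copy of whatever the embedding layer produced
    captured["emb"] = output.detach()

# Attach the hook, run a forward pass, then remove the hook again
handle = model.emb.register_forward_hook(save_output)
x_cat = torch.tensor([0, 2, 4])
with torch.no_grad():
    model(x_cat)
handle.remove()

print(captured["emb"].shape)
```

The captured tensor holds one embedding vector per input row, which could then be used as features for another model such as a random forest.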

wpan · December 21, 2020, 11:47pm

I was not aware of the hooks method before. Thanks for the suggestion!

johannesstutz · December 22, 2020, 10:21pm

Thank you @wpan, I’ll look into it after the holidays!