How to get Embeddings without Neural Net? Lesson 14

There’s an interesting moment in Lesson 14 saying that:

  • Training a Neural Network could take hours, while training Gradient Boosted Trees takes seconds, on the same task.
  • And that Gradient Boosted Trees reach almost the same accuracy as a Neural Network when used with proper Embeddings.

The video https://youtu.be/1-NYPQw5THU?t=12m50s

But there’s one thing I don’t understand - how to get Embeddings without Neural Network?

My understanding is that you use a Neural Network with an Embedding Layer first. Then you take those trained embeddings and reuse them with Gradient Boosted Trees.
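To make sure I understand the mechanics, here's a minimal sketch of that "reuse" step (all names and numbers below are made up for illustration): a trained embedding layer is just a learned lookup table, so you can replace each category id with its learned vector and feed the result to the trees.

```python
import numpy as np

# Hypothetical learned embedding matrix for a categorical feature "store_id"
# with 4 categories, each mapped to a 3-dimensional vector. In practice these
# rows come from the trained NN's embedding layer weights.
emb_store = np.array([
    [0.1, -0.4,  0.2],   # embedding for store 0
    [0.8,  0.1, -0.3],   # embedding for store 1
    [-0.5, 0.6,  0.0],   # embedding for store 2
    [0.2,  0.2,  0.9],   # embedding for store 3
])

# A tabular dataset encodes the category as one integer id per row.
store_ids = np.array([2, 0, 3, 1, 2])

# Look up each id's learned vector, then concatenate with any continuous
# features to build the input matrix for the gradient boosted trees.
X_cat = emb_store[store_ids]       # shape (5, 3)
X_cont = np.random.rand(5, 2)      # stand-in for continuous columns
X = np.hstack([X_cat, X_cont])     # final GBT features, shape (5, 5)
```

Is that roughly what's meant?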

But that kinda defeats the whole point of using Gradient Boosted Trees as a more performant and lightweight technique - because you can't speed up the training to seconds. You still have to build a Neural Net and spend hours training the Embeddings on it first.

Seems like I'm missing some important point…

P.S. Lesson 14 is awesome! Learned tons of useful tricks and approaches. Huge thanks!

The key takeaway from Guo and Berkhahn’s paper is summarized in the abstract:

“We also demonstrate that the embeddings obtained from the trained neural network boost the performance of all tested machine learning methods considerably when used as the input features instead”

One important thing to note - entity embeddings of this type have not been widely explored, so while in this case the gradient boosted trees did not perform as well as the neural network, it is possible that for some problems, using entity embeddings + gradient boosted trees would lead to a better result than entity embeddings + NN.

I wonder if you would be able to super speed up inference for a model by generating NN entity embeddings and then using entity embeddings + gradient boosted trees? Probably only for certain architectures. Just thinking out loud :slight_smile:

I’m not sure I understand your question. Entity embedding is just assigning values to entities or groups of entities. Whether we choose to do it by hand or with some other ML technique, there’s no necessity for a NN - e.g. assign ranges for low, med, high.
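For example, a hand-made encoding along those lines could look like this (a minimal sketch - the column name and categories are hypothetical, and no neural net is involved):

```python
# Map a categorical column to ordinal values chosen by hand, based on
# domain knowledge rather than learned by a network.
income_band = {"low": 0, "med": 1, "high": 2}

rows = ["med", "low", "high", "med"]
encoded = [income_band[r] for r in rows]
print(encoded)  # [1, 0, 2, 1]
```

A learned NN embedding generalises this idea by replacing the hand-picked scalars with multi-dimensional vectors tuned on the training data.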

If these embeddings are chosen to help the algorithm recognise useful patterns/effects, then they improve that algorithm’s ability.

It looks like you’ve jumped in to fastai at the end of last years part 2. There’s a lot of information on entity embeddings through the 2018 part 1 and machine learning materials (and maybe more in the 2018 part 2 about to start).

Does this help? (Written by a student, I believe.) https://towardsdatascience.com/structured-deep-learning-b8ca4138b848


Do you have any examples of such a case? Either a paper or your own experience? I have looked around for such a case but have not really found one. I would really appreciate it if you could point me to an example / paper or mention your own experience. That would help me a lot in my current project, which deals with tabular data. There are so many options to try out to fine-tune the model but never enough time!