I have a question regarding “unsupervised learning defined as fake tasks of supervised learning”, which is closely related to embedding each categorical feature individually.
When you train an auto-encoder you have a very clear task, reconstructing the original data, so it is supposed to be a good example of unsupervised learning without a fake task. Augmentation is used to generalize better, e.g. in denoising auto-encoders, and the restriction of having a smaller intermediate layer can be lifted by using sparse or variational auto-encoders.
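To make the setup I have in mind concrete, here is a minimal sketch of such an auto-encoder (assuming PyTorch; the layer sizes and feature count are arbitrary):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, n_features: int, code_size: int = 8):
        super().__init__()
        # Encoder compresses the input down to the code (bottleneck).
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),
            nn.Linear(32, code_size),
        )
        # Decoder tries to reconstruct the original input from the code.
        self.decoder = nn.Sequential(
            nn.Linear(code_size, 32), nn.ReLU(),
            nn.Linear(32, n_features),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code)

# The training task is just reconstruction: minimise the MSE
# between the input and the network's output.
model = AutoEncoder(n_features=20)
loss_fn = nn.MSELoss()
x = torch.randn(64, 20)          # a batch of 64 rows with 20 numeric features
loss = loss_fn(model(x), x)
loss.backward()
```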
One advantage I see in auto-encoders is that we can use the middle layer (the code) as an embedding of the whole datum. While this is obviously not efficient at inference time, because you need to feed-forward the first half of the network, it is a way to embed all of the numerical and categorical features into a single vector. This result would not be possible with standard entity embeddings without passing each feature through its own embedding layer before mixing them together, would it?
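For comparison, this is roughly what I mean by the entity-embedding approach, where each categorical feature gets its own embedding table before everything is concatenated with the numeric features (again a PyTorch sketch; the cardinalities and sizes are made up):

```python
import torch
import torch.nn as nn

class EntityEmbeddings(nn.Module):
    def __init__(self, cardinalities, emb_size: int = 4):
        super().__init__()
        # One embedding table per categorical feature.
        self.embeddings = nn.ModuleList(
            nn.Embedding(card, emb_size) for card in cardinalities
        )

    def forward(self, x_cat, x_num):
        # Look up each categorical column in its own table, then
        # concatenate all the embeddings with the numeric features.
        embedded = [emb(x_cat[:, i]) for i, emb in enumerate(self.embeddings)]
        return torch.cat(embedded + [x_num], dim=1)

# Example: three categorical features with the given cardinalities
# plus three numeric features.
model = EntityEmbeddings(cardinalities=[10, 5, 7], emb_size=4)
x_cat = torch.randint(0, 5, (64, 3))   # batch of 64 rows, 3 categorical columns
x_num = torch.randn(64, 3)             # 3 numeric columns
features = model(x_cat, x_num)         # shape: (64, 3*4 + 3)
```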
What advice do you have on using auto-encoders for embeddings, compared with the matrix techniques taught in the course?
Thank you very much.