Autoencoder for tabular data

Let’s say I have some tabular data and I want to use a denoising autoencoder to generate good features for a downstream neural network, a bit like the solution in this Kaggle post: https://www.kaggle.com/c/petfinder-adoption-prediction/discussion/88740

I understand how to train an autoencoder when the data is purely continuous, but what do you do when the input mixes categorical and continuous variables? For continuous data, you can simply use an MSE loss to measure how close each reconstructed variable is to the corresponding input variable. But for categorical variables, do you have to use a cross-entropy loss for each one and then somehow blend the continuous and categorical losses?
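To make the question concrete, here is a minimal sketch of the kind of blended loss I have in mind, for a single row, in plain Python rather than any framework's API. It assumes the decoder outputs one value per continuous column and one vector of logits per categorical column. The names `mse`, `cross_entropy`, `mixed_loss` and the `cat_weight` knob are hypothetical, and summing the two terms with a single weight is only one possible way to blend them, not an established recipe.

```python
import math

def mse(pred, target):
    """Mean squared error over the continuous columns of one row."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one categorical column.

    Uses the log-sum-exp form for numerical stability.
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target_index]

def mixed_loss(cont_pred, cont_target, cat_logits, cat_targets, cat_weight=1.0):
    """MSE on continuous columns plus a weighted mean of the
    per-column cross-entropies (cat_weight is an assumed tuning knob)."""
    cont_loss = mse(cont_pred, cont_target)
    cat_loss = sum(
        cross_entropy(logits, idx) for logits, idx in zip(cat_logits, cat_targets)
    ) / len(cat_targets)
    return cont_loss + cat_weight * cat_loss

# One example row: two continuous columns and two categorical columns
# (with 2 and 4 categories respectively).
cont_pred = [0.5, 1.0]
cont_target = [0.0, 1.0]
cat_logits = [[0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
cat_targets = [1, 2]

loss = mixed_loss(cont_pred, cont_target, cat_logits, cat_targets)
print(loss)
```

In this sketch the categorical term is averaged over columns so that adding more categorical variables does not automatically swamp the MSE term, but whether that (or `cat_weight=1.0`) is a sensible default is exactly what I am unsure about.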

Thanks,
