Transform categorical variables into a continuous representation

I have a dataset with categorical variables, which I need to cluster. Most clustering algorithms work on continuous variables. My idea was to transform the categoricals into a continuous representation by means of a variational auto-encoder. The encoder part of the VAE consists of a tabular model (embeddings) followed by a few linear layers, and a latent-z layer. The continuous representation would be the outcome of the inference on the dataset by the encoder.
Does this make sense at all?
Thank you.