Structured data with only numerical features

raaj · October 24, 2018, 6:00am

If I have structured data with only a few numerical features (say 7-8), how do I proceed with various deep learning approaches? I understand that we should handle categorical variables through embeddings. But if we have only numerical values, do we just pass them through a CNN or RNN model? Can something else be done too? How do I proceed with a CNN and an RNN approach? Any links to blogs etc. would be helpful.

Antoine · October 24, 2018, 6:41am

Hi Raaj,

First of all, could you please describe the problem you’re trying to solve? And what exactly are the inputs and outputs of the model?

Most of the time, as taught by Jeremy, the preferred approach to dealing with structured data using neural nets is with a few fully connected (FC) layers, not with CNN or RNN architectures.

Also, if you don’t have categorical features, then it simply means you won’t have embedding matrices and all your inputs will feed directly to the first FC layer.
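To make that concrete, here is a minimal sketch of a forward pass through two FC layers on a purely numerical input row. It is plain Python only to stay self-contained; in practice you would use a deep learning framework. The layer sizes (8 features, 16 hidden units, 3 classes), the random initialisation, and the sample row are all assumptions for illustration, not something taken from your dataset.

```python
import math
import random

random.seed(0)

def linear(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

# Hypothetical sizes: 8 numerical features, 16 hidden units, 3 classes.
N_FEATURES, N_HIDDEN, N_CLASSES = 8, 16, 3
w1, b1 = init_layer(N_FEATURES, N_HIDDEN)
w2, b2 = init_layer(N_HIDDEN, N_CLASSES)

def predict_proba(features):
    # No embedding step: the numerical features go straight into the first FC layer.
    hidden = relu(linear(features, w1, b1))
    return softmax(linear(hidden, w2, b2))

# Made-up, already standardized input row.
sample = [0.2, -1.3, 0.7, 0.0, 1.1, -0.4, 0.9, -0.8]
probs = predict_proba(sample)
print(probs)
```

The weights here are untrained, so the probabilities are meaningless; the point is only the shape of the computation (features, FC layer, nonlinearity, FC layer, softmax).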

Have you already checked the machine learning course? If not, I would strongly suggest you do, because there you will acquire solid foundations regarding the use of tree ensembles and neural nets for structured data.

raaj · October 24, 2018, 10:55am

Thank you so much for the reply, Antoine.

The problem is land classification, I guess. Actually, the classes are all numerical and I have no knowledge of what they refer to; they are labelled X, Y, Z. It is an assignment where I was asked to use CNNs and RNNs in addition to MLPs. Since all the data is numerical and there is no categorical data to build embedding matrices for, I could not figure out a way to do convolutions other than 1d-convolutions. I guess MLPs work better in such cases. But since I am not sure how the data is arranged, are there any architectures I could just try?
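For reference, a 1d-convolution over a single row of features can be sketched as below. This is a plain-Python illustration with a made-up feature vector and a hand-picked kernel (in a real model the kernel values would be learned). Note that sliding a kernel over the columns implicitly assumes neighbouring features are related, so the result depends on how the columns are ordered.

```python
def conv1d(signal, kernel):
    """'Valid' 1D convolution, implemented as cross-correlation
    (no kernel flip), which is what deep learning libraries usually compute."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# Made-up row of 8 numerical features.
features = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0, 29.0]

# Hand-picked kernel of width 3: difference between right and left neighbour.
kernel = [-1.0, 0.0, 1.0]

out = conv1d(features, kernel)
print(out)  # [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
```

With 8 inputs and a kernel of width 3, the output has 8 - 3 + 1 = 6 values, which would then typically be fed into further layers.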

Antoine · October 24, 2018, 6:23pm

Raaj,

I don’t know if I can help you further, I’m sorry. Perhaps someone more experienced can help.

I have never seen examples of CNNs, other than 1d-convolutions, that deal with a vector of input features. Also, I don’t understand why having categorical features would help, since after the lookup into the embedding matrices they are ultimately converted into a vector that is appended to your vector of numerical features, which then feeds into the first hidden layer.
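A rough sketch of that last point: the embedding lookup just turns a category index into a small vector, which is concatenated with the numerical features before the first hidden layer. The sizes (4 category levels, 3-dimensional embeddings), the random matrix, and the helper name `build_input` are hypothetical and only for illustration.

```python
import random

random.seed(0)

# Hypothetical categorical column with 4 levels, each mapped to a 3-dim vector.
# In a real model these values would be learned during training.
N_LEVELS, EMB_DIM = 4, 3
embedding_matrix = [
    [random.uniform(-1.0, 1.0) for _ in range(EMB_DIM)]
    for _ in range(N_LEVELS)
]

def build_input(category_index, numerical_features):
    """Look up the embedding row and append the numerical features to it."""
    embedded = embedding_matrix[category_index]
    return embedded + list(numerical_features)

numerical = [0.5, -1.2, 0.3]  # made-up numerical features

with_cat = build_input(2, numerical)  # embedding (3 values) + numerical (3 values)
without_cat = list(numerical)         # no categorical column: numerical values only

print(len(with_cat), len(without_cat))  # 6 3
```

Either way the first hidden layer just receives a vector of numbers; without categorical features that vector is simply shorter.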