Using ColumnarModelData.from_data_frame for classification

I am trying to use ColumnarModelData.from_data_frame for a classification task. I looked into the code, but I am not sure how to pass the dependent variable as a one-hot encoded or categorical variable directly.

I was able to get the model working when I passed the binary dependent variable as categorical codes with values 0 and 1. However, it seems to be building a regression model.

The data type of the dependent variable is int8 after converting the categorical variable with the following code:
dp_input[dep] = dp_input[dep].cat.codes
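For anyone reproducing this step, here is a minimal sketch of what that line does on a toy frame. The column name `target` and its values are made up for illustration; only the `.cat.codes` call comes from the post above. It shows why the dependent variable ends up as int8 values 0 and 1.

```python
import pandas as pd

# Hypothetical toy data; "target" stands in for the real dependent column.
dp_input = pd.DataFrame({"target": ["no", "yes", "yes", "no"]})
dep = "target"

# The column must have the pandas category dtype before .cat is available.
dp_input[dep] = dp_input[dep].astype("category")
categories = list(dp_input[dep].cat.categories)  # categories are sorted

# Replace each label with its integer position in the category list.
dp_input[dep] = dp_input[dep].cat.codes

codes = dp_input[dep].tolist()
codes_dtype = str(dp_input[dep].dtype)
print(categories, codes, codes_dtype)
```

With few categories pandas picks the smallest signed integer type that fits, which is where int8 comes from.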

I would appreciate any input on this. I am getting the following model summary:

<bound method Learner.summary of MixedInputModel (
(embs): ModuleList (
(0): Embedding(5, 3)
(1): Embedding(4, 2)
(2): Embedding(33, 10)
(3): Embedding(10, 5)
(4): Embedding(33, 10)
(5): Embedding(8, 4)
(6): Embedding(4, 2)
(7): Embedding(12, 6)
(8): Embedding(14, 7)
(9): Embedding(3, 2)
(10): Embedding(13, 7)
(11): Embedding(5, 3)
(12): Embedding(25, 10)
(13): Embedding(2827, 10)
(14): Embedding(578, 10)
(15): Embedding(1654, 10)
(16): Embedding(5178, 10)
(17): Embedding(132, 10)
(18): Embedding(143, 10)
(19): Embedding(125, 10)
(20): Embedding(14, 7)
(21): Embedding(1657, 10)
(22): Embedding(64, 10)
(23): Embedding(17, 9)
(24): Embedding(13, 7)
(25): Embedding(54, 10)
(26): Embedding(15, 8)
(27): Embedding(13, 7)
(28): Embedding(54, 10)
)
(lins): ModuleList (
(0): Linear (227 -> 250)
(1): Linear (250 -> 100)
)
(bns): ModuleList (
(0): BatchNorm1d(250, eps=1e-05, momentum=0.1, affine=True)
(1): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True)
)
(outp): Linear (100 -> 1)
(emb_drop): Dropout (p = 0.04)
(drops): ModuleList (
(0): Dropout (p = 0.001)
(1): Dropout (p = 0.01)
)
(bn): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=True)
)>

You don’t need to one-hot encode the dependent variable in PyTorch. Can you put your code in a gist (gist.github.com) or a GitHub repo?

If this is your own dataset, please also include a sample in the gist or do a df.head() so that we can see the data.
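To illustrate the point about one-hot encoding, here is a rough sketch in plain PyTorch. It is not how ColumnarModelData or MixedInputModel is implemented; the layer sizes, the `head` name, and the random data are assumptions for illustration only. It shows a classification loss taking integer class indices directly, with one output unit per class instead of the single output shown in the summary above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical final layer: 100 features in, one logit per class out.
n_classes = 2
head = nn.Linear(100, n_classes)

# Random stand-in for the activations that would feed the final layer.
features = torch.randn(4, 100)

# Targets are integer class indices, not one-hot vectors.
# CrossEntropyLoss expects int64 for class indices, so int8 codes
# would need to be cast first.
targets = torch.tensor([0, 1, 1, 0], dtype=torch.long)

logits = head(features)                        # shape (4, n_classes)
loss = nn.CrossEntropyLoss()(logits, targets)  # scalar loss
preds = logits.argmax(dim=1)                   # predicted class per row
```

A single output unit trained with a squared-error loss would instead treat the 0/1 codes as numbers to regress on, which matches the behaviour described in the question.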

Thanks, Ramesh. Do you have sample code for a classification task with embeddings that I can use?

This might help: Structured Learner