I am trying to use ColumnarModelData.from_data_frame for classification. I looked through the code, but I am not sure how to pass the dependent variable directly as a one-hot encoded or categorical variable.

I was able to get the model working when I passed the binary dependent variable as categorical codes with values 0 and 1. However, it seems to be building a regression model.

The data type of the dependent variable is int8 after converting the categorical variable with the following code:

dp_input[dep] = dp_input[dep].cat.codes
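To make the conversion step concrete, here is a minimal sketch on a toy frame. The column name "target" and the labels "no"/"yes" are made up for illustration; only dp_input and dep come from my code above. It shows why the resulting dtype is int8: pandas stores category codes in the smallest integer type that fits.

```python
import pandas as pd

# Hypothetical stand-in for dp_input; "target" plays the role of dep.
dp_input = pd.DataFrame({"target": ["no", "yes", "yes", "no"]})
dep = "target"

# The .cat accessor only exists on categorical columns, so convert first.
dp_input[dep] = dp_input[dep].astype("category")
categories = list(dp_input[dep].cat.categories)  # labels in sorted order

# Replace each label with its integer position in `categories`.
dp_input[dep] = dp_input[dep].cat.codes

print(categories)              # ['no', 'yes']
print(dp_input[dep].tolist())  # [0, 1, 1, 0]
print(dp_input[dep].dtype)     # int8
```

So the dependent variable reaches the model as plain small integers, with nothing left to mark it as categorical.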

I would appreciate any input on this. This is the model summary I am getting:

<bound method Learner.summary of MixedInputModel (
  (embs): ModuleList (
    (0): Embedding(5, 3)
    (1): Embedding(4, 2)
    (2): Embedding(33, 10)
    (3): Embedding(10, 5)
    (4): Embedding(33, 10)
    (5): Embedding(8, 4)
    (6): Embedding(4, 2)
    (7): Embedding(12, 6)
    (8): Embedding(14, 7)
    (9): Embedding(3, 2)
    (10): Embedding(13, 7)
    (11): Embedding(5, 3)
    (12): Embedding(25, 10)
    (13): Embedding(2827, 10)
    (14): Embedding(578, 10)
    (15): Embedding(1654, 10)
    (16): Embedding(5178, 10)
    (17): Embedding(132, 10)
    (18): Embedding(143, 10)
    (19): Embedding(125, 10)
    (20): Embedding(14, 7)
    (21): Embedding(1657, 10)
    (22): Embedding(64, 10)
    (23): Embedding(17, 9)
    (24): Embedding(13, 7)
    (25): Embedding(54, 10)
    (26): Embedding(15, 8)
    (27): Embedding(13, 7)
    (28): Embedding(54, 10)
  )
  (lins): ModuleList (
    (0): Linear (227 -> 250)
    (1): Linear (250 -> 100)
  )
  (bns): ModuleList (
    (0): BatchNorm1d(250, eps=1e-05, momentum=0.1, affine=True)
    (1): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True)
  )
  (outp): Linear (100 -> 1)
  (emb_drop): Dropout (p = 0.04)
  (drops): ModuleList (
    (0): Dropout (p = 0.001)
    (1): Dropout (p = 0.01)
  )
  (bn): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=True)
)>
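
As a sanity check on the summary, the layer sizes can be reconciled with a few lines of plain Python. This is only arithmetic on the numbers printed above; it assumes the (bn) BatchNorm1d(8) layer normalizes 8 continuous columns, which is my reading of the summary rather than something I confirmed in the library code.

```python
# Embedding output widths copied from the (embs) entries in the summary.
emb_widths = [3, 2, 10, 5, 10, 4, 2, 6, 7, 2, 7, 3, 10, 10, 10,
              10, 10, 10, 10, 10, 7, 10, 10, 9, 7, 10, 8, 7, 10]

n_cont = 8             # assumed: width of (bn) BatchNorm1d(8) = continuous columns
first_linear_in = 227  # from (lins)(0): Linear (227 -> 250)
out_features = 1       # from (outp): Linear (100 -> 1)

# Concatenated embeddings plus continuous columns should feed the first layer.
total_in = sum(emb_widths) + n_cont

print(len(emb_widths))              # 29
print(sum(emb_widths))              # 219
print(total_in == first_linear_in)  # True
```

The input side adds up, so the inputs appear to be wired as expected. The part that looks like regression to me is the output side: (outp) has a single output feature, whereas I would have expected a two-class classifier to end in two outputs.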