TabularList: Training problems between CategoryList and FloatList

mindtrinket · March 10, 2019, 11:23pm

I have been struggling for the last two days to train a tabular model that outputs a yes-or-no prediction (1 or 0). I don’t think I am using the label classes (CategoryList and FloatList) correctly.

When trying to use CategoryList
My dep_var is type category and I can also use AUCROC. My loss is cross entropy like I want.

The AUC score on the validation set looks like it is improving, but only slightly, and on the test set it falls below 0.5, which suggests something is really wrong. The loss also stays rather high.
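For reference, an AUC below 0.5 means the scores rank negatives above positives more often than not. One thing that produces exactly this is scoring with the wrong column of a two-class output; this is only one possibility, and nothing in this post confirms it is the cause here. A minimal sketch in plain Python, where the `auroc` helper, the labels, and the probabilities are all made up for illustration:

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0, 1]
# Hypothetical two-column softmax output: [P(class 0), P(class 1)]
probs = [[0.2, 0.8], [0.7, 0.3], [0.4, 0.6], [0.9, 0.1], [0.75, 0.25]]

auc_class1 = auroc([p[1] for p in probs], labels)  # scored with P(class 1)
auc_class0 = auroc([p[0] for p in probs], labels)  # scored with P(class 0) by mistake

print(auc_class1, auc_class0)
```

The two values mirror each other around 0.5, so a test AUC well below 0.5 is worth checking against the column order and the label encoding before concluding the model has learned nothing.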

Screenshots below.


When trying to use FloatList
My dep_var can be either type category or float16 without issues. However, the AUCROC function fails. The loss function becomes exp_rmspe (as in the Rossmann example), but I am not trying to predict a continuous value (like sales), so this feels like the wrong function for what I am trying to predict.
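To make the difference between the two functions concrete, here is a minimal sketch in plain Python. The `exp_rmspe` below follows the fastai definition as best recalled (exponentiate both inputs, then take the root mean squared percentage error), so treat its exact form as an assumption; `binary_cross_entropy` is a hand-written stand-in rather than the library's loss, and the targets and predictions are invented.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean negative log-likelihood of 0/1 targets given predicted P(class 1)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)


def exp_rmspe(preds, targets):
    """Root mean squared percentage error after exponentiating both inputs.

    Designed for targets stored on a log scale, such as log(sales).
    """
    total = 0.0
    for p, t in zip(preds, targets):
        pct = (math.exp(t) - math.exp(p)) / math.exp(t)
        total += pct ** 2
    return math.sqrt(total / len(targets))


targets = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # hypothetical good predictions
uncertain = [0.5, 0.5, 0.5, 0.5]  # hypothetical uninformative predictions

bce_confident = binary_cross_entropy(confident, targets)
bce_uncertain = binary_cross_entropy(uncertain, targets)
rmspe_confident = exp_rmspe(confident, targets)

print(bce_confident, bce_uncertain, rmspe_confident)
```

Cross entropy treats each prediction as a probability and penalizes confident wrong answers heavily. With 0/1 targets, exp_rmspe instead measures the relative error between exp(prediction) and either 1 or e, which has no probabilistic meaning for a yes-or-no target.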

The loss looks better (although it’s not calculating the right thing), and the AUCROC for my test predictions is much better than when I use CategoryList.

Screenshots below.

The big question
Am I implementing this incorrectly, misunderstanding how the loss functions work, or both?

In the meantime, I plan to use FloatList while I work on different data features.