Tabular classification training issues and probability output with fastai v2

ajka · September 1, 2020, 5:10am

I have a training_dataframe of the following form:

X	Y	Z	A	C	E
0	1	0	1	0	0
1	1	0	0	0	1
1	1	1	0	1	0
0	0	1	0	1	0

features X, Y, Z are binary variables
labels A, B, C, D, E are the dependent variable, one-hot encoded (each row has exactly one of the possible labels)

I need to predict probabilities for the test dataset like so:

X	Y	Z	A	B	C	D	E
0	1	0	.76	.11	.01	.00	.12
1	1	0	.05	.06	.23	.05	.61
1	1	1	.10	.14	.54	.13	.09
0	0	1	.14	.15	.43	.11	.32

I’m following along with the 09_tabular.ipynb notebook from fastai/fastbook to learn how to train on a tabular dataset. I ran into the following issue:

The learning rate plot is blank, and when I train, the loss is nan.

My training_dataframe has no missing values (I double checked), and every cell in the dataframe is either 0 or 1, so it’s normalized. Is there something I’m doing wrong here?

Also, how would I configure the learner to output predictions in the desired form (with probabilities for each label, summing to 1)?

If anyone can point me in the right direction, I’d really appreciate it!

stefan-ai · September 1, 2020, 9:10am

Try changing the dependent variable from one-hot encoding into one target column with different levels A, B, C, D, E. The standard output of a classifier will be probabilities for each class that sum up to one.
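A minimal sketch of that conversion with pandas, in case it helps. The frame below just re-types the four rows from the question (only A, C and E appear in them), and the column name `label` is my own choice, not something fastai requires:

```python
import pandas as pd

# Toy frame re-typed from the question; only labels A, C and E occur in these rows.
train_df = pd.DataFrame({
    "X": [0, 1, 1, 0],
    "Y": [1, 1, 1, 0],
    "Z": [0, 0, 1, 1],
    "A": [1, 0, 0, 0],
    "C": [0, 0, 1, 1],
    "E": [0, 1, 0, 0],
})
label_cols = ["A", "C", "E"]

# Sanity check: the conversion only makes sense if every row has exactly one active label.
assert (train_df[label_cols].sum(axis=1) == 1).all()

# idxmax(axis=1) returns, per row, the name of the column that holds the 1.
train_df["label"] = train_df[label_cols].idxmax(axis=1)

# Drop the one-hot columns so only the features and the single target remain.
train_df = train_df.drop(columns=label_cols)
print(train_df)
```

After this, `label` holds one string per row (A, E, C, C for the rows above) and can be passed as the single dependent variable.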

Since you are solving a classification problem, you need to use a loss function for classification, i.e. CrossEntropyLossFlat instead of F.mse_loss. You can also try not specifying any loss function. Usually fastai picks the right one for you automatically based on the structure of your data. Also, there is no need to specify y_range for classification problems.

ajka · September 2, 2020, 4:08am

I changed it to a target column with labels, used CrossEntropyLossFlat as the loss function, and removed y_range. That did the trick. Thank you!

For anyone who comes across this in the future, this is a useful post: https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/
