Hey,
Not sure if this is a bug, but there is something about the order of the class labels in text_classifier_learner.data.classes that doesn’t make sense to me. The numerical value of the target class (obtained from text_classifier_learner.get_preds(DatasetType.Valid)) does not map to the list index in text_classifier_learner.data.classes. This list is also used in class ClassificationInterpretation to label the axes of the confusion matrix, so the order is important here.
Some code to explain the issue:
# assumes fastai v1 (`from fastai.text import *`) and a trained text classifier `learn`
# get predictions for the validation set
preds = learn.get_preds(DatasetType.Valid)
# get numerical target labels
ytrue = to_np(preds[1])
for i in range(2):
    # string value of the target label in data.valid_ds
    label_ds = learn.data.valid_ds[i][1].obj
    # numerical value of the target class from learn.get_preds(DatasetType.Valid)
    label_int = ytrue[i]
    # try to get the label string by mapping the integer label to a list index in learn.data.classes
    label_string = learn.data.classes[label_int]
    print(label_ds == label_string)
>>> False
>>> False
Shouldn’t label_ds and label_string match exactly?
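To make the expectation concrete, here is a minimal sketch in plain Python of the check I have in mind. The class names, integer targets, and string labels are made-up stand-ins (not output from my learner), and `labels_match` is a hypothetical helper, not part of fastai.

```python
# Hypothetical stand-ins for learn.data.classes, the integer targets from
# get_preds, and the string labels stored in the dataset (not real output).
classes = ["negative", "positive"]
y_true = [1, 0, 1]
labels_ds = ["positive", "negative", "positive"]


def labels_match(classes, y_true, labels_ds):
    """Return one bool per item: does classes[integer label] equal the stored string label?"""
    return [classes[int(idx)] == label for idx, label in zip(y_true, labels_ds)]


matches = labels_match(classes, y_true, labels_ds)
print(matches)  # [True, True, True]
```

With a consistent index-to-class mapping and items compared in the same order, I would expect every entry to be True, which is not what I see above.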