Multilabel text classification interpretation

clare · April 23, 2020, 10:14am

Greetings mortals

[I’m using fastai v1.0.60]

I’m attempting to evaluate/interpret my trained multilabel text classification model. Unfortunately ClassificationInterpretation doesn’t seem to be set up with multilabel targets in mind, so I’m struggling a bit to check what my model is doing. So far:

I’ve come across plot_multi_top_losses, but this only seems to work with image data.
ClassificationInterpretation will spit out predictions as a list of class numbers instead of a list of one-hot encoded tensors, so it seems like I can’t really use interp at the moment.
I have manually extracted the model preds as one-hot encoded tensors (new_preds):

preds,y,losses = learn.get_preds(with_loss=True)

new_preds = (preds>0.7).type(ByteTensor)

Where 0.7 is my threshold and y is the list of one-hot encoded true labels.

Question is, what do I do with this? Does anyone have any experience interpreting multi-label models?
I’m struggling to visualize how a confusion matrix would work/be calculated in this context.

Thanks!

clare · May 14, 2020, 2:03pm

I’ve managed to get round to looking at this again, and thought I’d share my findings for anyone else struggling with this…

fastai doesn’t cover this, but sklearn can handle multilabel targets and accepts tensors with no problem. I used sklearn.metrics.multilabel_confusion_matrix, and also found that sklearn.metrics.classification_report handles one-hot encoded multilabels (as in my original post) natively.
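
For anyone who wants something concrete, here is a rough sketch of what that looks like. It uses small made-up NumPy arrays in place of the real y and preds from my first post, and the label names are placeholders, so treat it as an illustration rather than my exact code. With fastai you would pass y and new_preds instead (calling .numpy() on them first if sklearn complains).

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix, classification_report

# Toy stand-ins (hypothetical values) for the true labels and model outputs:
# rows are samples, columns are labels.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
probs = np.array([[0.90, 0.20, 0.80],
                  [0.10, 0.60, 0.30],
                  [0.80, 0.90, 0.10],
                  [0.20, 0.75, 0.95]])

# Same thresholding idea as in the original post.
threshold = 0.7
y_pred = (probs > threshold).astype(int)

# Placeholder label names, assumed for illustration only.
label_names = ["label_a", "label_b", "label_c"]

# One 2x2 matrix per label, each laid out as [[TN, FP], [FN, TP]].
mcm = multilabel_confusion_matrix(y_true, y_pred)
for name, m in zip(label_names, mcm):
    tn, fp, fn, tp = m.ravel()
    print(f"{name}: TN={tn} FP={fp} FN={fn} TP={tp}")

# Per-label precision, recall and F1, plus micro/macro/weighted/samples averages.
report = classification_report(
    y_true, y_pred, target_names=label_names, zero_division=0
)
print(report)
```

This also answers my own question about how a confusion matrix works here: instead of one big matrix you get one small binary matrix per label, each treating that label as "present" vs "absent" independently of the others.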