The more I fight with things, the more I figure stuff out.
Correct me if I’m wrong, but the way to use torchtext for a multi-label classification problem is to define one torchtext.data.Field object for every label … and then use F.sigmoid as my final non-linearity and F.binary_cross_entropy as my loss function.
This means I can’t use the fast.ai TextData or TextDataLoader objects as is (although they were instructive as to what I needed to do for the multi-label dataset).
Now my question is: how do I convert these values to probabilities?
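To make that concrete, here is a minimal PyTorch-only sketch of the setup I have in mind (the shapes and targets are made up for illustration): the model’s raw outputs are logits, a sigmoid squashes each one independently into (0, 1), and those values are the per-label probabilities fed to binary_cross_entropy.

import torch
import torch.nn.functional as F

logits = torch.randn(8, 6)                     # pretend model output: batch of 8, 6 labels
targets = torch.randint(0, 2, (8, 6)).float()  # multi-hot targets, one column per label Field

probs = torch.sigmoid(logits)                  # each value squashed independently into (0, 1)
loss = F.binary_cross_entropy(probs, targets)  # multi-label loss on those probabilities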
@wgpubs
Thank you so much for sharing the code to create multi-label data in torchtext.
I am trying to solve the Toxic Comment challenge on Kaggle.
As per my understanding, binary_cross_entropy should be used as the criterion in RNN_Learner (please correct me if I am wrong). When I use binary_cross_entropy as the criterion, calling fit throws an error.
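For context, this is roughly how I am swapping the criterion (fast.ai 0.7-era API; I am assuming the learner exposes its loss through the crit attribute, and the learner construction itself is omitted here):

import torch.nn.functional as F

# learner is the RNN_Learner built from the multi-label text data (construction omitted)
learner.crit = F.binary_cross_entropy   # replace the default cross_entropy criterion
learner.fit(1e-3, 1)                    # this is the call that throws the error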
I actually came across the same issue. I know it is due to the negative numbers in the output tensor because the error goes away if I take the absolute value of the output like this in model.py:
def step(self, xs, y):
    xtra = []
    output = self.m(*xs)
    if isinstance(output, (tuple, list)): output, *xtra = output
    self.opt.zero_grad()
    output = output.abs()  # <<------ This thing. BUT DON'T DO THIS
    print(output)
But of course we do NOT want to do that. So I’ll investigate further and will let you know.
Not sure if this is only a result of negative values. For example, I’m using ImageClassifierData.from_csv with is_multi set to false, so the outputs do contain negative values, but the last layer is LogSoftmax, whose outputs are log-probabilities and therefore expected to be negative, and the loss is nll_loss, yet I get the exact same error.
Yep, you’re right, that’s indeed the case. So how did you go about solving it? If I just add a sigmoid on top after the last LogSoftmax activation, things seemingly work, but my losses are negative and the learning rate finder stops working.
Is there a reason you are using LogSoftmax over Softmax? Softmax is good for choosing one thing at the end. Sigmoid allows multiple activations to be big. So, if you are okay with picking one thing at the end, I’d replace LogSoftmax with Softmax. That’ll make everything between 0 and 1. If you want to choose multiple things, I’d consider using Sigmoid instead of LogSoftmax.
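To make the difference concrete, a tiny illustrative snippet (made-up scores for three classes):

import torch
import torch.nn.functional as F

scores = torch.tensor([[2.0, -1.0, 0.5]])   # raw scores for 3 classes

print(F.softmax(scores, dim=1))      # rows sum to 1 -> suited to picking exactly one class
print(torch.sigmoid(scores))         # each class scored independently in (0, 1) -> multi-label
print(F.log_softmax(scores, dim=1))  # always <= 0, i.e. log-probabilities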
Thanks for the feedback. I’m just using the default pretrained resnet50, which puts a LogSoftmax at the end for single-class classification. This one is for the humpback whale competition, which has about 4000 classes. The default last layer is LogSoftmax, and when I change it, things seem to break: lr_find plots an empty chart and the values look off: 9%|▉ | 12/131 [00:01<00:15, 7.62it/s, loss=-0.000223]
Still can’t seem to find my way around this one.
Thanks! I spent days debugging, as the error message doesn’t give much useful information; the whole kernel crashes after a device-side assert triggers and I have to re-run the whole thing…
I am slightly confused here: what was the output of the language model? The default criterion is cross_entropy, so I expected I could just switch cross_entropy to binary_cross_entropy without changing anything else in the model, and change the target to a one-hot target.
According to the PyTorch documentation, F.cross_entropy combines log_softmax and nll_loss. On the other hand, F.binary_cross_entropy expects an input that has already passed through a sigmoid. Does that mean cross_entropy is really two operations that you can apply directly after a linear layer, while binary_cross_entropy should be applied after a sigmoid layer?
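Here is a small self-contained check of that reading of the docs (random tensors, nothing from the actual model):

import torch
import torch.nn.functional as F

x = torch.randn(4, 5)                     # raw linear-layer outputs (logits)

# Single-label: cross_entropy bundles log_softmax + nll_loss, so it takes raw logits.
t = torch.tensor([0, 3, 1, 2])
assert torch.allclose(F.cross_entropy(x, t),
                      F.nll_loss(F.log_softmax(x, dim=1), t))

# Multi-label: binary_cross_entropy expects probabilities, i.e. a sigmoid applied first;
# binary_cross_entropy_with_logits folds that sigmoid in and is more numerically stable.
y = torch.randint(0, 2, (4, 5)).float()
assert torch.allclose(F.binary_cross_entropy(torch.sigmoid(x), y),
                      F.binary_cross_entropy_with_logits(x, y))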
So I just noticed that, if I do it in PyTorch 0.4.0, it throws NaN instead of a device-side assert error.