A walk with fastai2 - Vision - Study Group and Online Lectures Megathread

@mschmit5 Just to add to what Zachary said: we use different loss functions in the two cases.
Cross-entropy loss is used for the single-label case, with softmax as the activation. Softmax is great when you have exactly one of the classes (no more than one and definitely at least one). Applying softmax here just rescales the raw outputs to the 0-1 range, so you could in fact skip softmax and still make a prediction, i.e. take the class with the highest raw output.
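A minimal sketch of that point, using made-up logits (the values are just for illustration): softmax is monotonic, so the class with the highest raw output is also the class with the highest probability.

```python
import math

# Raw model outputs (logits) for one image across three classes (made-up values)
logits = [2.0, -1.0, 0.5]

# Softmax: rescale to the 0-1 range so the values sum to 1
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# Softmax preserves the ordering, so argmax on raw logits
# picks the same class as argmax on the probabilities
pred_from_logits = max(range(len(logits)), key=lambda i: logits[i])
pred_from_probs = max(range(len(probs)), key=lambda i: probs[i])
assert pred_from_logits == pred_from_probs
```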

In the multilabel case the loss is binary cross-entropy. You check for each class independently whether it is present or not based on a threshold, so I don’t think raw scores can be used here: you need to convert each one to the 0-1 range (with sigmoid) and then apply the threshold. :slight_smile:
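Here is a quick sketch of that thresholding step, again with made-up logits and the common 0.5 threshold (both are just illustrative assumptions, not fastai defaults you must use):

```python
import math

def sigmoid(x):
    # Squash a raw score into the 0-1 range, independently per class
    return 1 / (1 + math.exp(-x))

# Raw outputs for one image across three classes (made-up values)
logits = [2.0, -1.0, 0.5]
threshold = 0.5

# Each class gets its own probability; any number of them can pass the threshold
probs = [sigmoid(x) for x in logits]
present = [p > threshold for p in probs]
print(present)  # [True, False, True]
```

Note that unlike softmax, the per-class sigmoids don’t sum to 1, which is exactly what you want when zero, one, or several labels can apply at once.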
