I’m not sure if it’s a bug, but when I checked the source code of MixedInputModel, in the forward method we have
if not self.is_reg:
    if self.is_multi:
        x = F.sigmoid(x)
    else:
        x = F.log_softmax(x)
elif self.y_range:
    x = F.sigmoid(x)
    x = x*(self.y_range[1] - self.y_range[0])
    x = x+self.y_range[0]
return x
I can’t help but feel that F.sigmoid should be computed when self.is_multi is False, not True.
I think the nomenclature is a bit overloaded here. I believe is_multi refers to situations where an observation can be categorized as multiple things at once (multi-label), not to multi-class classification.
Where things get a little confusing for me: I’ve always used sigmoid for binary classification, which I know is equivalent to softmax when there are only two classes. So I assume, given how softmax is employed here, that it will work for a target variable that is just a tensor of 0s and 1s, but I haven’t tried it.
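To make the equivalence concrete, here is a minimal sketch (the tensor values are just made-up logits): a two-class softmax over logits [0, x] gives the same positive-class probability as sigmoid(x), since e^x / (1 + e^x) = sigmoid(x).

```python
import torch
import torch.nn.functional as F

x = torch.tensor([0.3, -1.2, 2.5])  # arbitrary example logits

# Two-class softmax over logits [0, x]; take the probability of the
# second (positive) class for each observation.
two_class = F.softmax(torch.stack([torch.zeros_like(x), x], dim=1), dim=1)[:, 1]

print(torch.allclose(two_class, torch.sigmoid(x)))  # True
```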
But I don’t see how that helps in using sigmoid as the final activation.
Also, the code by itself doesn’t distinguish between 2 or more classes, and the if not self.is_reg: ... code seems to fit perfectly as a final-layer activation…
Sigmoid is an appropriate activation to use when predicting the presence or absence of multiple things. Each potential thing can either be present or not, and thus is binary. An inappropriate activation would be softmax, since it wants to predict the presence of one and only one thing for a given observation.
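A small sketch of that difference, using a single observation with three made-up label logits: sigmoid scores each label independently, so the probabilities need not sum to 1, while softmax forces a single distribution over the labels.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0, 0.5]])  # one observation, three labels

# Multi-label: each label is independently present or absent,
# so the per-label probabilities can sum to more than 1.
multi_label = torch.sigmoid(logits)

# Multi-class: exactly one label per observation,
# so the probabilities sum to 1.
multi_class = F.softmax(logits, dim=1)

print(multi_label.sum())  # can exceed 1
print(multi_class.sum())  # always 1
```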
The way I read the code is…
If not regression:
    If more than one class could be present for an observation:
        Use sigmoid
    Else (only one class per observation):
        Use softmax
Else, if regression and a range is provided:
    Constrain the predictions to be within the min and max of y_range
Otherwise, end with a linear activation
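The y_range branch can be sketched as a standalone helper (constrain is a hypothetical name, not part of the library): sigmoid squashes the output into (0, 1), which is then rescaled and shifted into the requested range.

```python
import torch

def constrain(x, y_range):
    # Hypothetical helper mirroring the elif self.y_range branch:
    # squash to (0, 1), then rescale into [y_range[0], y_range[1]].
    x = torch.sigmoid(x)
    x = x * (y_range[1] - y_range[0])
    return x + y_range[0]

preds = constrain(torch.randn(5), (0.5, 5.0))
print(preds.min().item() >= 0.5, preds.max().item() <= 5.0)
```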