I’m not sure if it’s a bug, but when I checked the source code of MixedInputModel, in the forward method we have
if not self.is_reg:
    if self.is_multi:
        x = F.sigmoid(x)
    else:
        x = F.log_softmax(x)
elif self.y_range:
    x = F.sigmoid(x)
    x = x*(self.y_range[1] - self.y_range[0])
    x = x+self.y_range[0]
return x
I can’t help but feel that F.sigmoid should be computed when self.is_multi is False, not True.
I think the nomenclature is a bit overloaded here. I believe is_multi refers to situations where an observation can be categorized as multiple things at once (multi-label), not to multi-class classification.
Where things get a little confusing for me: I’ve always used sigmoid for binary classification, which I know is equivalent to softmax when there are only two classes. So I assume, given how softmax is employed here, that it will work for a target variable that is just a tensor of 0s and 1s, but I haven’t tried it.
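To make the equivalence concrete, here is a minimal sketch (the tensor values are just made-up logits): a two-class softmax over logits [0, x] gives the same positive-class probability as sigmoid(x), since e^x / (1 + e^x) = sigmoid(x).

```python
import torch
import torch.nn.functional as F

x = torch.tensor([0.3, -1.2, 2.5])  # arbitrary example logits

# Two-class softmax over logits [0, x]; take the probability of the
# second (positive) class for each observation.
two_class = F.softmax(torch.stack([torch.zeros_like(x), x], dim=1), dim=1)[:, 1]

print(torch.allclose(two_class, torch.sigmoid(x)))  # True
```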
But I don’t see how that helps in using sigmoid as the final activation.
Also, the code by itself doesn’t distinguish between 2 or more classes, and the if not self.is_reg: ... code seems to fit perfectly as a final-layer activation…
Sigmoid is an appropriate activation to use when predicting the presence or absence of multiple things. Each potential thing can either be present or not, and thus is binary. An inappropriate activation would be softmax, since it wants to predict the presence of one and only one thing for a given observation.
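A small sketch of that difference, using a single observation with three made-up label logits: sigmoid scores each label independently, so the probabilities need not sum to 1, while softmax forces a single distribution over the labels.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0, 0.5]])  # one observation, three labels

# Multi-label: each label is independently present or absent,
# so the per-label probabilities can sum to more than 1.
multi_label = torch.sigmoid(logits)

# Multi-class: exactly one label per observation,
# so the probabilities sum to 1.
multi_class = F.softmax(logits, dim=1)

print(multi_label.sum())  # can exceed 1
print(multi_class.sum())  # always 1
```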
The way I read the code is…
If not regression:
    If more than one class could be present for an observation:
        Use sigmoid
    Else (only one class per observation):
        Use softmax
Else, if regression and a range is provided:
    Constrain the predictions to be within the min and max of y_range
Otherwise, end with a linear activation
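The y_range branch can be sketched as a standalone helper (constrain is a hypothetical name, not part of the library): sigmoid squashes the output into (0, 1), which is then rescaled and shifted into the requested range.

```python
import torch

def constrain(x, y_range):
    # Hypothetical helper mirroring the elif self.y_range branch:
    # squash to (0, 1), then rescale into [y_range[0], y_range[1]].
    x = torch.sigmoid(x)
    x = x * (y_range[1] - y_range[0])
    return x + y_range[0]

preds = constrain(torch.randn(5), (0.5, 5.0))
print(preds.min().item() >= 0.5, preds.max().item() <= 5.0)
```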