I was experimenting with building a classifier with two classes, a positive and a negative one (e.g. cat vs. not cat). Here is what I did:
I created a databunch with two classes.
I created a model using cnn_learner(...) with the resnet34 arch, which produced a model whose head ends in a 512 x 2 linear layer (because there are 2 classes in the databunch).
I replaced the 512 x 2 linear layer with a 512 x 1 linear layer and a Flatten layer
I replaced the default loss func (CrossEntropy) with the BCEWithLogitsLoss loss func.
I used the accuracy_thresh metric to measure accuracy.
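In case it helps to see the steps above concretely, here is a rough sketch in plain PyTorch rather than the actual fastai code. The small `head` below is only a stand-in I made up for the end of the head that cnn_learner builds, and the features and targets are random, so treat it as an illustration of the shapes involved and not as the real model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the tail of the head; the real fastai head has
# more layers, but only the final 512 -> 2 linear layer matters here.
head = nn.Sequential(nn.ReLU(), nn.Linear(512, 2))

# Replace the 512 x 2 layer with a 512 x 1 layer, then flatten the
# (batch, 1) output to (batch,) so it lines up with the targets.
head = nn.Sequential(
    *list(head.children())[:-1],
    nn.Linear(512, 1),
    nn.Flatten(0),
)

# BCEWithLogitsLoss applies the sigmoid internally, so the head outputs raw logits.
loss_func = nn.BCEWithLogitsLoss()

features = torch.randn(8, 512)                # random batch of 512-dim features
targets = torch.randint(0, 2, (8,)).float()   # BCE expects float targets

logits = head(features)                       # shape: (8,)
loss = loss_func(logits, targets)             # scalar loss

# Thresholded accuracy, similar in spirit to accuracy_thresh at 0.5.
preds = torch.sigmoid(logits) > 0.5
acc = (preds == targets.bool()).float().mean()
```

The flatten step matters because BCEWithLogitsLoss expects the logits and the targets to have the same shape.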
Does this new model sound reasonable? Should I approach it differently?
I am asking because I am a newbie and want to know whether the changes I made are sensible or whether, for some reason, they don’t make sense. The model behaved well, so I guess I am not that far from the ideal solution.
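I think this approach is fine by itself. The model generally won’t get any better or worse because of the changes you made. As far as I can tell, the loss is essentially unchanged: with only two classes, softmax followed by cross-entropy behaves like a sigmoid on a single logit followed by binary cross-entropy. Make sure you tune the threshold, though.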
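To illustrate both points, here is a small sketch using only the standard library. It checks numerically that a two-class softmax gives the same probability as a sigmoid applied to the difference of the two logits, and then sweeps a few thresholds on some validation predictions. All the numbers (logits, probabilities, labels, candidate thresholds) are made up for the example.

```python
import math

def softmax2(z_neg, z_pos):
    """Probability of the positive class from a two-class softmax."""
    m = max(z_neg, z_pos)  # subtract the max for numerical stability
    e_neg, e_pos = math.exp(z_neg - m), math.exp(z_pos - m)
    return e_pos / (e_neg + e_pos)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Made-up logits for one example whose true label is positive.
z_neg, z_pos = 0.3, 1.7
p_softmax = softmax2(z_neg, z_pos)
p_sigmoid = sigmoid(z_pos - z_neg)   # single logit = difference of the two

ce = -math.log(p_softmax)    # cross-entropy for the positive class
bce = -math.log(p_sigmoid)   # binary cross-entropy for target 1

def accuracy_at(probs, labels, thresh):
    """Fraction of examples where (prob > thresh) matches the label."""
    hits = sum((p > thresh) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

# Made-up validation probabilities and labels for the threshold sweep.
val_probs = [0.10, 0.35, 0.40, 0.62, 0.80, 0.95]
val_labels = [0, 0, 1, 0, 1, 1]
candidates = [0.3, 0.5, 0.7]
best_thresh = max(candidates, key=lambda t: accuracy_at(val_probs, val_labels, t))
```

With real data you would sweep a finer grid on a held-out validation set, and possibly optimize a metric other than accuracy if the classes are imbalanced.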
Thanks for your feedback! I thought it was going to be a bit better since I replaced the softmax layer with a binary/sigmoid layer. I think something like this was mentioned in lesson 10 of the 2019 course, part 2.