Implementing a CNN model for binary classification (positive vs. negative class)

Hello!

I was venturing into creating a classifier with two classes, a positive and a negative one (e.g. cat vs. not cat). Here is what I did (a rough code sketch follows the list):

  1. I created a databunch with two classes.
  2. I created a model using cnn_learner(...), specifying the resnet34 arch, which created a head ending in a 512 x 2 linear layer (because there are 2 classes in the databunch).
  3. I replaced the 512 x 2 linear layer with a 512 x 1 linear layer followed by a Flatten layer.
  4. I replaced the default loss func (CrossEntropy) with BCEWithLogitsLoss.
  5. I used the accuracy_thresh metric for accuracy.
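
A minimal sketch of those steps in fastai v1 (the `Squeeze` module and the `bce_logits_float` wrapper are just names I am using here for illustration, and the path/size are placeholders, not my exact code):

```python
from functools import partial
from fastai.vision import *          # fastai v1 style import
import torch.nn as nn
import torch.nn.functional as F

class Squeeze(nn.Module):
    "Illustrative helper: reshape predictions from (bs, 1) to (bs,)."
    def forward(self, x): return x.squeeze(-1)

def bce_logits_float(preds, targets):
    # BCEWithLogitsLoss wants float targets; the databunch yields 0/1 ints
    return F.binary_cross_entropy_with_logits(preds, targets.float())

path = 'path/to/data'                                              # placeholder
data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(),  # 1. two-class databunch
                                  size=224)
learn = cnn_learner(data, models.resnet34,                         # 2. resnet34 + default head
                    metrics=partial(accuracy_thresh, thresh=0.5))  # 5. thresholded accuracy
learn.model[1][-1] = nn.Sequential(nn.Linear(512, 1), Squeeze())   # 3. 512 x 1 head + flatten
learn.loss_func = bce_logits_float                                 # 4. BCE-with-logits loss
```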

Does this new model sound reasonable? Should I approach it differently?

I am asking since I am a newbie and wanted to know whether the changes I made are reasonable or whether, for some reason, they don’t make sense. The model behaved well, so I guess I am not that far from the ideal solution.

Best regards,
Musa

I think this approach is fine on its own. The model generally won’t get any better or worse because of the changes you made; the loss function is effectively still the same. Make sure you tune the threshold, though.
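
For instance, something like this sketch could be used to pick the threshold on the validation set (assuming the single-logit setup above, and that `get_preds` returns raw logits when a custom loss is set; if your predictions are already probabilities, skip the sigmoid):

```python
import torch

preds, targets = learn.get_preds()              # validation predictions and labels
probs = torch.sigmoid(preds.squeeze(-1))        # logits -> probabilities

for thresh in torch.arange(0.30, 0.71, 0.05):   # sweep a few candidate thresholds
    acc = ((probs > thresh).long() == targets).float().mean()
    print(f'thresh={thresh.item():.2f}  accuracy={acc.item():.4f}')
```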

Thanks for your feedback! I thought it was going to be a bit better since I replaced the softmax layer with a single sigmoid/binary output. I think something like this was mentioned in lesson 10 of part 2 of the 2019 course.

Look at it this way:

  1. Your model outputs just one number which, after the sigmoid inside BCEWithLogitsLoss, represents the probability of the presence of the positive class. Let’s call that number ‘p’ (0 < p < 1).
  2. BCE loss takes into account two numbers, ‘p’ and ‘1 - p’, which add up to 1. This is exactly what a softmax over 2 outputs would have done for you.

So in essence, you’re still doing roughly the same thing, just with slightly fewer parameters now.
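
If you want to convince yourself numerically, here is a small pure-PyTorch check (the logit value is arbitrary) showing that a sigmoid over one logit matches a softmax over the two logits [z, 0], and that BCE-with-logits matches cross-entropy in that case:

```python
import torch
import torch.nn.functional as F

z = torch.tensor([1.7])                    # the single logit from the 512 x 1 head (arbitrary value)
p = torch.sigmoid(z)                       # probability of the positive class

# sigmoid on one logit == softmax over the pair of logits [z, 0]
pair = torch.tensor([[z.item(), 0.0]])
print(p.item(), F.softmax(pair, dim=1)[0, 0].item())    # same number

# BCE-with-logits on z == cross-entropy on [z, 0] for the same (positive) target
bce = F.binary_cross_entropy_with_logits(z, torch.tensor([1.0]))
ce  = F.cross_entropy(pair, torch.tensor([0]))          # index 0 holds the 'z' logit
print(bce.item(), ce.item())                            # same number
```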