CNN classifier is using the image background to make decisions

Hi all, I followed the code from Lesson 1 to build a CNN image classifier for male vs. female faces. When I plot the heatmaps, I see this:

[Screenshots: activation heatmaps highlighting the image background rather than the face]

The model seems to be using the background for classification, not the face! It achieves a fairly low error rate (around 15%), but it isn't making its decisions based on the correct part of the image.

Is there anything I can do to make the model use the actual faces instead of the image backgrounds?

Thanks

How long did you train for? Were you overfitting during training at all?

Otherwise, look at the dataset itself. If most images of one class share a similar background, the model may have learned to associate the background with that class.

Ah yes, looks like I am! Here are the last 10 epochs I trained (it almost looks like it found a new local minimum…)

| epoch | train_loss | valid_loss | error_rate | time  |
|------:|-----------:|-----------:|-----------:|------:|
| 0     | 0.399814   | 0.435781   | 0.169643   | 00:12 |
| 1     | 0.397863   | 0.481480   | 0.205357   | 00:11 |
| 2     | 0.388577   | 0.498393   | 0.205357   | 00:12 |
| 3     | 0.376468   | 0.499274   | 0.205357   | 00:11 |
| 4     | 0.368870   | 0.487290   | 0.160714   | 00:12 |
| 5     | 0.361315   | 0.477878   | 0.151786   | 00:12 |
| 6     | 0.351336   | 0.476605   | 0.151786   | 00:12 |
| 7     | 0.352189   | 0.472741   | 0.160714   | 00:11 |
| 8     | 0.346184   | 0.465146   | 0.151786   | 00:12 |
| 9     | 0.338350   | 0.464337   | 0.151786   | 00:13 |
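One quick way to spot the pattern numerically: training loss keeps falling while validation loss ends higher than where it started. A minimal sketch, with the loss values copied from the table above:

```python
# Loss values from the last 10 epochs above
train_loss = [0.399814, 0.397863, 0.388577, 0.376468, 0.368870,
              0.361315, 0.351336, 0.352189, 0.346184, 0.338350]
valid_loss = [0.435781, 0.481480, 0.498393, 0.499274, 0.487290,
              0.477878, 0.476605, 0.472741, 0.465146, 0.464337]

# Training loss falls steadily, but validation loss never recovers
# to its starting value -- a classic overfitting signature.
train_improving = valid_worsening = False
train_improving = train_loss[-1] < train_loss[0]
valid_worsening = valid_loss[-1] > valid_loss[0]
gap = valid_loss[-1] - train_loss[-1]

print(train_improving, valid_worsening, round(gap, 3))  # True True 0.126
```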

And it looks like that before the model overfitted, it did use more of the face, like this:
[Screenshot: activation heatmap focused mostly on the face]

So I guess adding more regularization might help
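For reference, in fastai v1 the usual knobs are the `ps` (dropout probability) argument to `cnn_learner` and the `wd` (weight decay) argument, which can also be passed to the fit methods. Weight decay itself is just an L2 penalty folded into the update step; here is a minimal pure-Python sketch of the idea with toy numbers (not the library's actual implementation):

```python
def sgd_step(w, grad, lr=0.1, wd=0.0):
    """One SGD step with weight decay: w -= lr * (grad + wd * w).
    The wd * w term shrinks weights toward zero, penalizing large weights."""
    return [wi - lr * (gi + wd * wi) for wi, gi in zip(w, grad)]

w = [1.0, -2.0, 0.5]
grad = [0.0, 0.0, 0.0]  # even with zero gradient, decay shrinks the weights

w_no_decay = sgd_step(w, grad, lr=0.1, wd=0.0)
w_decay = sgd_step(w, grad, lr=0.1, wd=0.1)

print(w_no_decay)                          # unchanged: [1.0, -2.0, 0.5]
print([round(x, 3) for x in w_decay])      # every weight pulled toward 0
```

Larger `wd` pulls the weights toward zero more aggressively each step, which discourages the model from memorizing incidental features like the background.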

That looks okay to me. How many images are in the dataset? And what augmentations are you using?

374 male faces and 173 female faces (I used ai_utilities to download the images)
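Worth noting: with 374 vs. 173 images the classes are imbalanced, so a model that always predicts "male" is already right about 68% of the time. A quick sanity check that the ~15% error rate genuinely beats that majority-class baseline:

```python
n_male, n_female = 374, 173
total = n_male + n_female

# Always predicting the majority class ("male") is wrong only on the female images
baseline_error = n_female / total
model_error = 0.15  # error rate reported above

print(round(baseline_error, 3))       # 0.316
print(model_error < baseline_error)   # True
```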

I haven’t used any augmentations yet. You mean like, shift, rotate, etc?

Correct (transforms). I'd recommend trying those and seeing if it helps.
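In fastai v1 this means passing `ds_tfms=get_transforms()` when building the `DataBunch`, which applies random flips, small rotations, zooms, and lighting changes by default. As a toy illustration of what a flip and a shift do to an image, here is a pure-Python sketch on a tiny "image" represented as a nested list (not the library's implementation):

```python
def hflip(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def shift_right(img, n, fill=0):
    """Shift pixels n columns to the right, padding the left edge with `fill`."""
    return [[fill] * n + row[:-n] for row in img]

img = [[1, 2, 3],
       [4, 5, 6]]

print(hflip(img))           # [[3, 2, 1], [6, 5, 4]]
print(shift_right(img, 1))  # [[0, 1, 2], [0, 4, 5]]
```

Because each epoch sees slightly different versions of every image, the model has a harder time latching onto a fixed cue like a particular background.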