Improving Binary Classification for Person Recognition

ouimet51 · June 9, 2019, 11:27pm

Hello!

I am currently working on a binary image classifier - the models goal is to identify any images where Donald Trump is present. I decided the best route was to have two categories non-trump and trump images.

The non-trump training data is just a set of random images, I attempted to add in some extra images of peoples faces (non-trump obviously) so it wouldn’t just learn to detect the difference between images with and without people/faces. The trump images are just a bunch of images with Donald Trump present.

I trained the model and it did pretty well (error rate = 0.05) but because it’s \ a binary classification model I feel like it can do better (< 0.01). When I look at the interp.plot_top_losses it seems like the model struggles the most when there is a non-trump image with a face in it (see image below… sorry Obama)

So my questions are as follows:

Is the way I have my training data set up the best way to go about feeding the data model or is there a better way?
If the way I have the data set up is a good way to configure the data how can I improve the model for faces specifically?

Happy to provide more information
thanks in adance

dipam7 · June 10, 2019, 11:34am

Hey, this type of problem is actually called one class classification. It’s a relatively difficult problem because the non-trump class can be practically anyone. I remember there are a few threads about it on the forum, but I don’t think I remember seeing any implementations. Check them out. Cheers

ouimet51 · June 14, 2019, 6:58pm

Hey, thanks for your response, just knowing the type of classification problem that I was trying to solve was very helpful.

I was able to find a few articles on one class classification and it seems like it’s a problem that no one has a great solve for just yet from a practical standpoint. I did find this article One Class Classifying — What kind of data set I should have? which provided some helpful insight.

Based on the findings from the article I decided to add in more training data that was ‘similar’ to the “One Class”. The easiest way to do that was to use the plot_top_losses function. I was able to sort out that the model was classifying many images fo faces as Trump.

I added in about 100 random face images to my training data set in the non-trump class and retrained my model (the non-trump data set was about 50% random images and 50% faces) this didn’t seem to do the trick so I took another look at what images it was struggling with using plot_top_losses and it was not just faces but faces of men who were wearing suits ( it particularly failed with red headed men in suits which cracks me up).

So I decided to dump in a bunch of images of men in suits to the training data and that seemed to improve the model greatly, going from a 6% error rate to a 2% error rate. I took a scroll through the photos I had of Donald Trump and noticed that he had a suit on in 95% of them, hence why the model was classifying men in suits as trump images.

With that knowledge I decided to beef up the training data for the trump class with photos of Donald with out his suit on, surprisingly not many of those exist unless he is golfing so now i’m worried it might classify golfers as well, but we will see!

The more I think about this type of one class classification, I realize is that it is difficult to predict what exactly the model will confuse for the “one class” - when put in production other things may come up that weren’t included in my validation set (maybe people with bad spray tans). If I do deploy this to production it would need lots of monitoring to see what it does classify as trump, it would be good to periodically retrain the model adding in new training data with confusing examples.

Sorry if this is a bit long winded I found the whole process pretty interesting and wanted to provide as much info as possible for someone who stumbles upon this in the future.

Lord_jayam · June 15, 2019, 4:49pm

Hey,
Are you sure your model is not falling into something like the Base Rate Fallacy?
Particularly it happens when you have too many examples of one class, and too little examples of another.
It is particulartly easy to fall into that fallacy here as there are a lot of non-trump images as compared to Trum images to choose from. So one possible reason your model could be behaving as such could be this.

Regards,

pravinvignesh · June 15, 2019, 5:05pm

Train the model with images containing face only.
Develop a model that can locate the face in an image and do the prediction based on that.It might improve the accuracy. Opencv lib will help u to do the face localization. Just give it a shot🤞

ouimet51 · June 17, 2019, 1:56am

Hey - I’m not 100% sure how that would apply here do you mind explaining a bit further?

ouimet51 · June 17, 2019, 1:56am

This is probably a wayyyyy better solve. I will give this a try in v2

Lord_jayam · June 17, 2019, 5:17am

Hey,
So I ran into this problem sometimes as well.
As I said it is really easy to find pictures of non-Trump people than Trump. So, if your data is skewed towards one example, then the model might find it more difficult to generalize to other data.
So, having Equal amount of data for both of these classes might help a little.

The other part was explained addressed by @pravinvignesh.
You might also have better success with training data on non-Trump images with suits, and make sure that you have enough examples of blond men in suits. But I guess you would have already figured that part out.

Regards,
Jayam