[Lesson 1] Beat Google Auto ML at B747 vs A380

Benoit_c · October 23, 2018, 9:02pm

Hello,

As an exercise, I built a notebook to classify Boeing 747 vs. Airbus A380:

trancept/deep_learning_tests/blob/master/011-Binary_Classification_747_vs_A380-full.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Binary Classification of 747 vs A380\n",
    "The goal is to classify a picture into one of two classes: a Boeing 747 or an Airbus A380.\n",
    "*Spoiler:* This source code is 3 times better than Google AutoML on the same dataset!\n",
    "\n",
    "Note: this notebook was written with an old version of the fastai library and updated for the new version.\n",
    "For now, the old version (https://github.com/trancept/deep_learning_tests/blob/master/012-Binary_Classification_747_vs_A380-essential.ipynb) performs much better (98% accuracy vs 95%); I have yet to figure out why.\n",
    "\n",
    "You may notice a strange line:\n",
    "\n",
    " !rm -r {PATH}tmp\n",
    "\n",
    "It's there to remove temporary files, to avoid mix-ups between different training runs.\n",
    "\n",
    "## Dataset\n",

This file has been truncated.

Here you will find the dataset: http://52.167.231.0/datasets/boeing_vs_airbus.zip (3 GB, built from © Google Images; please don’t share it outside of the course).

My goal is to beat Google AutoML, which achieves an accuracy of 94%: https://github.com/trancept/deep_learning_tests/blob/master/010-GoogleAutoML.ipynb

It’s based on a notebook I made for V2 of the course to test different ways of improving the model.
With the previous version I more than achieved this goal, reaching an accuracy of 98%: https://www.linkedin.com/pulse/how-beat-google-automl-image-classification-benoit-courty/

But V3 is different, so the training performs differently. For now it’s worse; I have to learn more about the new API.
For example, this plot is weird:
learn.recorder.plot()
[Image: 16-weird-plot]
(Edit from Jeremy - turns out the plot isn’t weird; see below for details).

Let me know if you find this useful and would like to work with me to improve it. Let’s fight Google together!

jeremy · October 23, 2018, 9:51pm

Cool project. I’ve never seen an lr finder plot that looks like that before…

cedric · October 24, 2018, 3:08pm

I have seen a similar lr finder plot. For a moment, I thought ‘Mars collided into the Earth’.

ecdrid · October 25, 2018, 12:18am

How can it have several values for one particular value on the x-axis?

danield · October 25, 2018, 12:41am

@ecdrid think of a scatter plot, but with the points connected in the order in which they occur (the index order). You can try this out for yourself with a NumPy array or a plain list,
e.g.
import matplotlib.pyplot as plt
plt.plot([0, 1, -1, 2, 3], [1, 2, 3, 4, 0])
vs.
plt.scatter([0, 1, -1, 2, 3], [1, 2, 3, 4, 0])

ecdrid · October 25, 2018, 12:59am

Yep !
Thanks
Really a weird plot

wyquek · October 25, 2018, 2:29am

That plot is indeed very weird, @ecdrid. You seem to be on to something. It’s pretty unlikely that the lrfinder() would give two logloss values for the same learning rate, unless the lrfinder() is also doing a one-cycle-policy thing, where the learning rate goes up and then comes down.

Benoit_c · October 25, 2018, 12:20pm

Thank you all for commenting. In fact, it is not the graph of the lr_finder().
It is the graph after a learn.fit_one_cycle(cyc_len=epoch, max_lr=lr).
So it seems legit considering what the 1cycle policy does: https://sgugger.github.io/the-1cycle-policy.html#the-1cycle-policy
Sorry for the false alarm!
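
To see why a loss-versus-learning-rate plot can double back on itself after a one-cycle run, here is a minimal sketch. It assumes a simplified triangular schedule (linear warm-up, then a mirrored linear cool-down); the function name `triangular_one_cycle` and its parameters are made up for illustration and are not fastai's implementation.

```python
def triangular_one_cycle(max_lr, n_steps, div_factor=10.0):
    """Return learning rates that ramp up to max_lr, then back down.

    Hypothetical, simplified schedule for illustration only; it is not
    the schedule fastai uses internally.
    """
    if n_steps < 3 or n_steps % 2 == 0:
        raise ValueError("n_steps must be an odd number >= 3")
    min_lr = max_lr / div_factor
    half = n_steps // 2
    # Linear ramp from min_lr up to max_lr (inclusive).
    up = [min_lr + (max_lr - min_lr) * i / half for i in range(half + 1)]
    # Mirror the ramp on the way down, skipping the duplicated peak.
    down = up[-2::-1]
    return up + down


lrs = triangular_one_cycle(max_lr=0.01, n_steps=9)

# Group the step indices by learning rate: every rate except the peak
# is visited twice, once on the way up and once on the way down.
visits = {}
for step, lr in enumerate(lrs):
    visits.setdefault(lr, []).append(step)

for lr, steps in sorted(visits.items()):
    print(f"lr={lr:.5f} used at steps {steps}")
```

Because the training loss keeps changing between the two visits to the same learning rate, plotting loss against learning rate traces a loop rather than a single-valued curve.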

wyquek · October 25, 2018, 12:58pm

oh I see

Benoit_c · October 25, 2018, 1:49pm

It’s my fault: there is no point in plotting this graph. learn.recorder.plot_losses() is useful in that context, not learn.recorder.plot().

wyquek · October 25, 2018, 2:17pm

It’s a great notebook with a great Google-beating result, and that plot is but a distracting side issue. I was just worried there was a bug, and that my future self might see such a plot and be stunned.

jeremy · October 25, 2018, 2:47pm

@Benoit_c I added a note to your top post to clarify - hope it’s OK. Feel free to edit/remove as you wish.

nok · October 27, 2018, 9:01am

I saw that both batch_size and image_size were redefined in a loop. How do the model weights react to the change of input size? I know there is an adaptive pooling layer at the end to guarantee the output size. Are the model weights simply upsampled/downsampled when we change the image input size?

for bs, sz, epoch in training_loop:
    data.batch_size = bs
    learn.fit(epochs=epoch, lr=lr)

nok · October 28, 2018, 7:57am

Actually, I may have missed it. When you said multiple image sizes, I was wondering how the model weights can adapt to different input sizes. Then I noticed that the sz parameter was actually never used. I think you only change the batch size.

training_loop = [
    [123, 64, 10],
    [150, 128, 10],
    [123, 224, 10],
]
for bs, sz, epoch in training_loop:
    data.batch_size = bs
    learn.fit(epochs=epoch, lr=lr)

Benoit_c · October 28, 2018, 11:11am

You’re totally right, I forgot to set the size!
For your question about weights: it works because in a convolution layer the weights belong to the convolution kernel, not to the individual pixels, and the kernel does not change with the input size.
It’s a good question, because with older networks we need to use fixed input sizes.
I will update my code like this:

for bs, image_size, epoch in training_loop:
    data.size = image_size
    data.batch_size = bs
    learn.fit(epochs=epoch, lr=learning_rate)
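
The point about convolution weights can be illustrated with a small pure-Python sketch. It assumes a single 3x3 kernel, stride 1, no padding, and global average pooling standing in for the adaptive pooling layer mentioned earlier in the thread; `conv2d_valid` and `global_avg_pool` are illustrative helpers, not fastai or PyTorch functions.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def global_avg_pool(feature_map):
    """Collapse a feature map of any size to a single number."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


# The same nine weights are reused for every input size.
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

shapes = {}
pooled = {}
for size in (8, 16, 32):
    # Hypothetical toy "image": a simple gradient of the given size.
    image = [[r + c for c in range(size)] for r in range(size)]
    fmap = conv2d_valid(image, kernel)
    shapes[size] = (len(fmap), len(fmap[0]))
    pooled[size] = global_avg_pool(fmap)
    print(size, shapes[size], pooled[size])
```

The feature map grows with the input, but the kernel stays at nine weights, and the pooling step reduces every feature map to one value, so whatever comes after it always sees the same shape.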

nok · October 28, 2018, 2:44pm

You are right, I was not thinking carefully about the conv layer. Did you get a better result after resizing?

Benoit_c · October 28, 2018, 6:12pm

Not at all, it’s worse : https://github.com/trancept/deep_learning_tests/blob/master/011-Binary_Classification_747_vs_A380-full.ipynb
Previous version : https://github.com/trancept/deep_learning_tests/blob/86e8e159cfbeae6ddb7c405f8158bd35a3c80226/011-Binary_Classification_747_vs_A380-full.ipynb

surag · November 2, 2018, 4:32am

I just came across this thread while working on a similar problem. In my case, however, I’m trying a 5-class classification (Boeing 747, Boeing 777, Airbus A340, Airbus A350, Airbus A380), with all the images downloaded from Google Images. The problem I’m facing: since the images get square-cropped, the main features of the airplane often get cropped off, because most images are lateral views (as opposed to dogs vs. cats or dog breeds, where the majority of the features lie in the center of the image). From the V2 class, I remember Jeremy mentioning that padding with white space does not improve the model significantly. Does anyone have an idea how to overcome this?

This image of a 747, for example

Thanks
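
To put a number on how much a centered square crop discards from a wide, side-on photo, here is a small sketch. The 1200x500 image size is a made-up example, and `center_square_crop` is a hypothetical helper rather than the fastai transform itself.

```python
def center_square_crop(width, height):
    """Return the (left, top, right, bottom) box of a centered square crop."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def kept_fraction(width, height):
    """Fraction of the original pixels that survive the square crop."""
    side = min(width, height)
    return (side * side) / (width * height)


# Hypothetical wide photo of an airliner seen from the side.
width, height = 1200, 500
box = center_square_crop(width, height)
fraction = kept_fraction(width, height)
print(box)                 # (350, 0, 850, 500)
print(f"{fraction:.1%}")   # 41.7%
```

In this made-up case the crop keeps only the middle 500 pixels of a 1200-pixel-wide frame, so a nose or tail near the edges would fall outside it.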

rbhatta · November 2, 2018, 5:34am

I would be curious to know the approximate error rate you are getting now. In my opinion, the main features of this class of aircraft would be the nacelles and the profile of the airframe; perhaps these features could still be picked up regardless of the cropping?

I’ve been tinkering with datasets of smaller planes (e.g. Cessna, Piper), and was surprised to see very low error rates (~2-4%) with almost no data cleaning.

surag · November 2, 2018, 6:06am

I’m currently getting an error rate of around 33%. I believe that the nose and the exterior of the cockpit area add good signal to the model. Cessnas and Pipers are relatively short, and have distinguishing features that separate them. For these bigger planes, the nacelles do provide good information, but I think the nose area in general makes them more distinguishable. To be fair, I have 50 images of each, so 40 go into training and 10 go into validation. I should probably find more images to add to my dataset, and add Cessnas and other smaller aircraft too. I’ll keep you updated on how I progress. How many images are you using?
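
As a rough check on how much a 33% error rate can be trusted with so few validation images, here is a back-of-the-envelope sketch. It assumes 5 classes with 10 validation images each (50 in total, as described above) and uses the standard binomial approximation for the standard error; the helper name is illustrative.

```python
import math


def error_rate_standard_error(error_rate, n_samples):
    """Binomial standard error of an error rate measured on n_samples."""
    return math.sqrt(error_rate * (1.0 - error_rate) / n_samples)


n_classes = 5
val_per_class = 10          # 10 of the 50 images per class
n_val = n_classes * val_per_class

error_rate = 0.33
se = error_rate_standard_error(error_rate, n_val)

print(f"validation images: {n_val}")
print(f"one image is worth {1 / n_val:.0%} of error rate")
print(f"standard error: about {se:.3f}")
```

With only 50 validation images, each misclassified image moves the error rate by 2 percentage points, so the measured 33% is a coarse estimate.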