[Lesson 1] Beat Google Auto ML at B747 vs A380


As an exercise I built a notebook to classify Boeing 747 vs Airbus A380:

Here you will find the dataset: (3 GB, built from © Google Images; please don’t share it outside of the course)

My goal is to beat Google AutoML, which achieves an accuracy of 94%: https://github.com/trancept/deep_learning_tests/blob/master/010-GoogleAutoML.ipynb

It’s based on a notebook I made for V2 of the course to test different improvement solutions.
I more than achieved it with the previous version, reaching an accuracy of 98%: https://www.linkedin.com/pulse/how-beat-google-automl-image-classification-benoit-courty/

But V3 is different, so training performs differently. For now it’s worse; I have to learn more about the new API.
For example, this plot is weird:
(Edit from Jeremy - turns out the plot isn’t weird; see below for details).

Let me know if you find this useful and would like to work with me to improve it. Let’s fight Google together :smile:


Cool project. I’ve never seen an lr finder plot that looks like that before…

I have seen a similar lr finder plot. For a moment, I thought ‘Mars collided with the Earth’ :open_mouth:


How can it have multiple values for a particular value on the x-axis?

@ecdrid think of a scatter plot, but with the points connected in the order that they occur (index order). You can try this out for yourself:

import matplotlib.pyplot as plt
plt.plot([0, 1, -1, 2, 3], [1, 2, 3, 4, 0])     # connects the points in index order
plt.scatter([0, 1, -1, 2, 3], [1, 2, 3, 4, 0])  # the same points, unconnected
plt.show()

Yep!
Really a weird plot.

That plot is indeed very weird, @ecdrid. You seem to be onto something. It’s pretty unlikely that lr_find() would produce two loss values for the same learning rate, unless lr_find() is also doing a one-cycle-policy thing, where the learning rate goes up and then comes down.


Thank you all for commenting. In fact it is not the graph from lr_find();
it is the graph after a learn.fit_one_cycle(cyc_len=epoch, max_lr=lr).
So it seems legit considering what the 1cycle policy does: https://sgugger.github.io/the-1cycle-policy.html#the-1cycle-policy
Sorry to have alarmed you for nothing!
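For intuition, here is a toy numpy sketch (the numbers are illustrative, not from the notebook) of why a loss-vs-learning-rate plot recorded during one-cycle training folds back on itself: every learning rate is visited twice, once on the way up and once on the way down.

```python
import numpy as np

# Toy one-cycle schedule: the learning rate ramps up over the first half
# of training, then back down over the second half.
n_iters = 100
up = np.linspace(1e-4, 1e-2, n_iters // 2)
lrs = np.concatenate([up, up[::-1]])

# Each LR value occurs twice, so plotting loss against LR
# traces a path that doubles back on itself.
print(len(lrs), len(np.unique(lrs)))  # 100 50
```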


oh I see :grinning::grinning::grinning:

It’s my fault, there is no point in plotting this graph; learn.recorder.plot_losses() is the useful one in this context, not learn.recorder.plot().

it’s a great notebook, with a great Google-beating result :+1::+1::+1:, and that plot is just a distracting side issue. I was only worried there was a bug, and that my future self might see such a plot and be stunned.

@Benoit_c I added a note to your top post to clarify - hope it’s OK. Feel free to edit/remove as you wish.

I saw that both batch_size and image_size were redefined in a loop. How do the model weights react to the change of input size? I know there is an adaptive pooling layer at the end to guarantee the output size. Do the model weights simply upsample/downsample when we change the image input size?

for bs, sz, epoch in training_loop:
    data.batch_size = bs
    learn.fit(epochs=epoch, lr=lr)
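For intuition on the adaptive-pooling point, here is a minimal PyTorch sketch (not the notebook’s actual model): the conv weights have a fixed shape, and adaptive pooling squeezes any spatial size down to 1×1, so the same network accepts several input sizes.

```python
import torch
import torch.nn as nn

# Tiny conv net ending in adaptive pooling: the conv weights never change
# shape, and AdaptiveAvgPool2d reduces any spatial size to 1x1.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),  # e.g. 747 vs A380
)

for size in (64, 128, 224):
    out = model(torch.randn(1, 3, size, size))
    assert out.shape == (1, 2)  # same head works for every input size
```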

Actually, I may have missed it. When you said multiple image sizes, I was thinking about how the model weights can adapt to take different input sizes. Then I noticed the sz parameter was actually never used? I think you only change the batch size.

training_loop = [
    [123, 64, 10],
    [150, 128, 10],
    [123, 224, 10],
]
for bs, sz, epoch in training_loop:
    data.batch_size = bs
    learn.fit(epochs=epoch, lr=lr)

You’re totally right, I forgot to assign the size!
For your question about weights, it works because in a convolutional layer the weights belong to the convolution kernel, not to individual pixels, and the kernel does not change with the input size.
It’s a good question, because with older networks we needed to use fixed input sizes.
I will update my code like this:

for bs, image_size, epoch in training_loop:
    data.size = image_size
    data.batch_size = bs
    learn.fit(epochs=epoch, lr=learning_rate)
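That point about conv kernels can be checked directly; a quick PyTorch illustration (the layer here is mine, not from the notebook):

```python
import torch.nn as nn

# A conv layer's weight tensor is (out_channels, in_channels, kH, kW);
# no image dimension appears, so the same weights apply at 64, 128, or 224 px.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
print(tuple(conv.weight.shape))  # (16, 3, 3, 3)
```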


You are right, I was not thinking carefully about the conv layer. Did you get better results after resizing? :slight_smile:

Not at all, it’s worse :frowning:: https://github.com/trancept/deep_learning_tests/blob/master/011-Binary_Classification_747_vs_A380-full.ipynb
Previous version: https://github.com/trancept/deep_learning_tests/blob/86e8e159cfbeae6ddb7c405f8158bd35a3c80226/011-Binary_Classification_747_vs_A380-full.ipynb

I just came across this thread since I was working on a similar problem. In my case, however, I’m trying 5-class classification (Boeing 747, Boeing 777, Airbus A340, Airbus A350, Airbus A380), after having downloaded all of these from Google Images. The problem I’m facing: since the images get square-cropped, a lot of the time the main features of the airplane get cropped off, because most images are lateral (as opposed to dogs vs. cats or dog breeds, where the majority of the features lie in the center of the image). From the V2 class, I remember Jeremy mentioning that padding with white space does not improve the model significantly. Does anyone have an idea as to how to overcome this?

This image of a 747, for example
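One workaround sometimes suggested is letterbox padding to a square instead of center-cropping, so the nose and tail of a lateral shot survive. A minimal Pillow sketch (Pillow and the synthetic image are assumptions; fastai has its own padding options in its transforms, not shown here):

```python
from PIL import Image, ImageOps

# Stand-in for a wide lateral aircraft photo (real file loading omitted).
img = Image.new("RGB", (400, 150), "gray")

# Pad to a square instead of cropping: the full fuselage is preserved,
# at the cost of some border pixels.
side = max(img.size)
padded = ImageOps.pad(img, (side, side), color="black")
print(padded.size)  # (400, 400)
```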


I would be curious to know the approximate error rate you are getting now. In my opinion, the main features of this class of aircraft would be the nacelles and the profile of the airframe; perhaps these features could still be picked up regardless of the cropping?

I’ve been tinkering with datasets of smaller planes (e.g. Cessna, Piper), and was surprised to see very low error rates (~2-4%) with almost no data cleaning.

I’m currently getting an error rate of around 33%. I believe that the nose and the exterior of the cockpit area add good signal to the model. Cessnas and Pipers are relatively shorter, and have distinguishing features that separate them. For these bigger planes, the nacelles do provide good information, but I think the nose area in general makes them more distinguishable. To be fair, I have only 50 images of each, so 40 go into training and 10 into validation. I should probably find more images to add to my dataset. I’ll keep you updated on how I progress. I should probably also add Cessnas and other smaller aircraft to my dataset. How many images are you using?