Lesson 1 homework: airliners vs fighter jets

ericlin · January 10, 2019, 10:10am

Hi, I’d like to share my results

I downloaded a few dozen images of airliners and fighter jets and updated PATH.

The data dir looks like this:

jets/
    train/
        airliners
        fighter-jets
    test/
        airliners
        fighter-jets
    valid/
        airliners
        fighter-jets
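
A layout like this can be sanity-checked by counting the images in each split and class before training. The sketch below is not taken from the notebook: `count_images` is a hypothetical helper, and it assumes the images use .jpg, .jpeg, or .png extensions.

```python
from pathlib import Path

# Assumption: these are the only image extensions present in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Return {split: {class_name: image_count}} for a root/split/class layout."""
    counts = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[split_dir.name] = {
            class_dir.name: sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir())
        }
    return counts
```

Calling `count_images("jets")` on the layout above would return a nested dict keyed by split (train, test, valid) and then by class, which makes an unbalanced or empty folder easy to spot.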

Here are my results:

[Image: training stats]

[Image: most incorrect airliners]

It seems that the most-correct airliners are all side views in which the elongated fuselage is clearly visible. Views from the front or back are more likely to be incorrect.

For fighter jets, the most-correct ones are all viewed slightly from above or below, i.e., you get to see either the belly or the back of the fighter jet. Images with a view from the side are more likely to be incorrect.

Most images taken of airliners are views from the side. There is a clear lack of front- or rear-view photos of airliners, and a lack of (horizontal) side views of fighter jets, in my data.

What do you think? Share your thoughts!

[Edit]: fixed most incorrect xyz labels.

oneironaut · January 10, 2019, 10:42am

Awesome work! How many pictures per class did you use? Also, did you download them by hand or automate it somehow?

ericlin · January 11, 2019, 2:32am

@oneironaut

This is the dataset breakdown:

train
    airliners: 51 JPG images of airliners produced by Boeing, Bombardier, and Embraer.
    fighter-jets: 51 JPG images of Western fighter jets such as the F-18 and F-35.
test
    airliners: 39 JPG images of airliners produced by Airbus, such as the A380 and A320.
    fighter-jets: 39 JPG images of Russian fighter jets produced by Sukhoi and Mikoyan, such as the Su-35 and MiG-29.
valid
    airliners: 20 JPG images of airliners produced by Tupolev, such as the Tu-154.
    fighter-jets: 20 JPG images of Korean and Taiwanese fighter jets, such as the KAI T-50 and AIDC Ching-Kuo.

The files are available here for download (61.7 MB).

The images were collected with google_image_download.

oneironaut · January 11, 2019, 6:34am

One thing I noticed in the example predictions you provided:

It seems that pictures of fighter jets mostly have sky in the background, while pictures of airliners are taken on the ground. Perhaps your model only recognizes those differences in the background and not the actual planes.
It would explain:

All of the most-correct airliners have large areas of ground in the image
All of the most-correct fighter jets have sky in the background
The most-incorrect airliners are actually fighter jets with a ground background
The most-incorrect fighter jet is actually an airliner with a sky background
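
One rough way to test this hypothesis would be to tag each validation image by hand with its background and then compare accuracy per (class, background) cell. If the model keys on the background, accuracy should drop for airliners against sky and for fighter jets on the ground. The sketch below assumes such tags exist; `accuracy_by_group` is a hypothetical helper, and the example records are made up for illustration, not results from this dataset.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Tally correct predictions per (true_label, background) pair.

    records: iterable of (true_label, background, predicted_label) tuples.
    Returns {(true_label, background): (n_correct, n_total)}.
    """
    tally = defaultdict(lambda: [0, 0])
    for true_label, background, predicted in records:
        cell = tally[(true_label, background)]
        cell[0] += int(predicted == true_label)
        cell[1] += 1
    return {key: tuple(value) for key, value in tally.items()}


# Hypothetical hand-annotated examples, NOT real predictions from the thread.
example = [
    ("airliner", "ground", "airliner"),
    ("airliner", "sky", "fighter-jet"),
    ("fighter-jet", "sky", "fighter-jet"),
    ("fighter-jet", "ground", "airliner"),
]

for (label, background), (correct, total) in sorted(accuracy_by_group(example).items()):
    print(f"{label:12s} {background:7s} {correct}/{total}")
```

With only a few dozen validation images per class, the per-cell counts will be small, so this would give a hint and not proof.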

This blog post describes the phenomenon. Maybe have a look at it?
https://towardsdatascience.com/how-to-spot-data-leakage-thanks-to-heat-maps-81a25f5331eb

ericlin · January 11, 2019, 6:41am

Very interesting! I hadn’t noticed that.

Will do, thanks for sharing!

oneironaut · January 18, 2019, 1:33pm

I experimented a bit with the CAM visualization. It’s a really nice tool for “debugging” models. Could you share your code and data? I’d love to verify whether my assumption about the backgrounds is right.

This is the code I used for testing CAM on the bees-vs-ants classifier from the PyTorch documentation:
https://colab.research.google.com/drive/1LLoqTpphQ-6cOL4LDVlYiiPOp-nWP1m_
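
For anyone unfamiliar with CAM, the core arithmetic is a weighted sum of the last convolutional feature maps, using the final linear layer's weights for the class of interest. The sketch below shows only that step in plain Python on nested lists. It is not the code from the linked notebook, and it assumes a network whose head is global average pooling followed by a single linear layer; models with other heads need a variant such as Grad-CAM. The function name `class_activation_map` is hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Compute a class activation map scaled to [0, 1].

    feature_maps: K feature maps, each an H x W nested list of floats.
    class_weights: K weights of the final linear layer for one class.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Weighted sum over the K channels at every spatial position.
    cam = [
        [
            sum(w * fmap[i][j] for w, fmap in zip(class_weights, feature_maps))
            for j in range(width)
        ]
        for i in range(height)
    ]
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    return [[(value - lo) / span for value in row] for row in cam]
```

In practice the resulting map is upsampled to the input image size and overlaid as a heat map, so bright regions show which parts of the picture pushed the prediction toward that class.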

ericlin · January 21, 2019, 2:33am

Sure thing. Though I haven’t implemented CAM to check where the model is ‘looking’ yet.

The notebook file is here (github).

The dataset is here (dropbox).

auggy · January 21, 2019, 6:17am

Did some of your image files get mixed up? The most incorrect fighter jet image is an airliner, and the most incorrect airliner is actually a fighter jet!

ericlin · January 21, 2019, 3:40pm

Oh wow, how did I miss that!

I’d better go fix it, haha

[Edit] It looks like I accidentally switched up the labels for most incorrect xyz.