Lesson 3 VGG-style CNN with batchnorm: 93.51%
Notes: (1 epoch) (2016-01-17) (see TODO) (no fine-tuning; i.e. random initial weights)
Similar to the CNNs in the lesson 3 MNIST notebook
Lesson 3 VGG-style CNN with batchnorm: 97.58%
Notes: (30 epochs) (2016-01-18) (same network as above)
Image format caveat
The images are in the PPM format, which Keras doesn't appear to support. I used the following command to convert them to PNG; it took about 45 seconds. The first wildcard matches the image folders and the second matches the images:
mogrify -format png */*.ppm
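If ImageMagick's mogrify isn't available, the same conversion can be done from Python with Pillow. A minimal sketch, assuming the class folders live under a data/ directory (that path is an assumption, not part of the dataset):

from pathlib import Path
from PIL import Image

# Convert every .ppm under data/ to a .png alongside the original.
# "data" is a hypothetical root folder; adjust as needed.
for ppm_path in Path("data").glob("*/*.ppm"):
    Image.open(ppm_path).save(ppm_path.with_suffix(".png"))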
Image “track” caveat
For each physical sign recorded, thirty images were taken. Each set of thirty images is called a “track”. It’s important to keep the images of a track together when splitting the data into a training set and a validation set: the images within a track are near-duplicates, so letting a track straddle the two sets leaks training data into validation, and the validation accuracy will greatly overestimate the test set accuracy.
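Here's one way to do that split (a sketch, not the notebook's code; split_by_track and track_ids are names I made up, with track_ids holding one track ID per image):

import numpy as np

def split_by_track(filenames, track_ids, val_fraction=0.2, seed=0):
    # Shuffle whole tracks, then assign each track entirely to one
    # set, so the thirty near-duplicate images of a sign never
    # straddle the training/validation boundary.
    rng = np.random.RandomState(seed)
    tracks = np.unique(track_ids)
    rng.shuffle(tracks)
    val_tracks = set(tracks[:int(len(tracks) * val_fraction)])
    train = [f for f, t in zip(filenames, track_ids) if t not in val_tracks]
    val = [f for f, t in zip(filenames, track_ids) if t in val_tracks]
    return train, val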
Jupyter collapsible headings
It’s convenient to be able to collapse sections of a Jupyter notebook. Here’s a way to enable this feature:
conda install -c conda-forge jupyter_contrib_nbextensions
jupyter nbextensions_configurator enable --user
Go to http://localhost:8888/nbextensions
Check the box next to “Collapsible Headings”
An example reading of the chart:
The model labeled the 50km/h sign as the 70km/h sign 87 times, but labeled the 70km/h sign as the 50km/h sign only 18 times.
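The chart being read here sounds like a confusion matrix. A minimal sketch of computing one with scikit-learn, using toy stand-in labels rather than the real data:

import numpy as np
from sklearn.metrics import confusion_matrix

# Toy stand-ins for the real validation labels and predictions.
y_val = np.array([0, 0, 0, 1, 1, 1])   # true classes (0 and 1 are arbitrary)
preds = np.array([0, 1, 0, 1, 0, 1])   # model predictions
cm = confusion_matrix(y_val, preds)
# cm[i, j] counts images of true class i that the model labeled as
# class j, so the 87-vs-18 reading above compares two entries
# mirrored across the diagonal.
print(cm)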
A takeaway from lesson 3: don’t just look at validation accuracy. Compare it to the training accuracy to see whether you’re overfitting or underfitting, and then act accordingly; lesson 3 covers how to deal with either situation.
Update: From 62.15% to 93.51% accuracy
Differences:
Instead of fine-tuning VGG, I used the VGG-style CNN with batchnorm from the MNIST notebook (a sketch of such a network follows this list).
I think this worked better because traffic signs are much simpler than cats, dogs, flowers, and so on, so a simpler network like the one Jeremy and Rachel used for MNIST is enough. The network's simplicity made it feasible to retrain all the filters instead of fine-tuning them: one epoch from random weights took 51 seconds.
Only did one epoch! (see TODO) (learning_rate == 0.001)
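For reference, here's a hedged sketch of what such a network can look like. This is my reconstruction in today's Keras API, not the notebook's exact architecture; the 48x48 input size and the 43-class output are assumptions on my part:

from tensorflow.keras import layers, models, optimizers

# Small VGG-style CNN with batchnorm, in the spirit of the lesson 3
# MNIST model: stacked 3x3 convolutions with max pooling between blocks.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', padding='same',
                  input_shape=(48, 48, 3)),
    layers.BatchNormalization(),
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.BatchNormalization(),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(43, activation='softmax'),
])

# learning_rate == 0.001, as in the note above
model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])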
TODO:
Apply more lesson 3 skills/ideas/tips/tricks:
Do more than one epoch!
Use the learning rate pattern that Jeremy used (0.001, 0.1, 0.01, 0.001, …)
Data augmentation (see the sketch after this list)
Dropout
Ensemble
Input normalization (I didn’t subtract the mean or divide by the standard deviation)
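A hedged sketch of two of these items together, using Keras' ImageDataGenerator for augmentation and for featurewise input normalization. The augmentation ranges are guesses, not tuned values, and X_train below is dummy data standing in for the real training array:

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Dummy stand-in for the real training images (48x48 RGB is an assumption).
X_train = np.random.rand(100, 48, 48, 3)

datagen = ImageDataGenerator(
    featurewise_center=True,             # subtract the training-set mean
    featurewise_std_normalization=True,  # divide by its standard deviation
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    # no horizontal_flip: mirroring can change a sign's meaning
)
datagen.fit(X_train)  # computes the featurewise mean and std

The generator would then feed training via datagen.flow(X_train, y_train) instead of passing the arrays to fit directly.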
Update: From 93.51% accuracy to 97.58% accuracy
Differences:
30 epochs instead of 1 (same learning rate, 0.001)
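Continuing the model sketch above, the longer run amounts to nothing more than a bigger epoch count (the variable names are assumptions; the Keras 1.x of the time spelled the keyword nb_epoch rather than epochs):

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          batch_size=64,   # batch size is a guess; the notes don't say
          epochs=30)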