[Project] Stanford-Cars with fastai v1

morgan · January 7, 2019, 8:13am

First notebook in a series on image classification for the Stanford-Cars data using the fastai v1 library.

Goal is 90%+ accuracy. I’m at 84.95% with this basic version, without any fancy tuning at all!

Amazing to see how small the differences are between the car models that the model confuses.

Would love to hear any thoughts, questions or feedback!

morganmcg1/Projects/blob/master/stanford-cars/1_stanford_cars_basic.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Stanford Cars - Benchmark Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook has the benchmark model for my Stanford Cars predictions.\n",
    "\n",
    "#### TL;DR\n",
    "- Accuracy of **84.95%** on the test set (96.31% Top5 Accuracy)\n",
    "- Did not use the image annotations in this benchmark\n",
    "- Used the standard Fastai v1 image transforms\n",
    "- Used `lr_find()` to find the optimal learning rate and `fit_one_cycle` for training\n",

This file has been truncated.

gabriel_chen · February 3, 2019, 5:35am

Hi Morgan!

I also made my first notebook on the Stanford-Cars dataset after watching lesson 1. I got an accuracy of around 85% too.

If you can share the updated link for your notebook, I would really appreciate it. Currently, the link you shared is broken and shows a 404 error.

morgan · February 15, 2019, 7:52pm

Thanks @gabriel_chen for pointing that out. I fixed the link in the original post; here it is again:

https://github.com/morganmcg1/Projects/blob/master/stanford-cars/1_stanford_cars_basic.ipynb

Would love to hear how you’re doing on it, I’m up to 88%, still a bit far from my goal of 90+%!

morgan · February 24, 2019, 10:31am

Latest results for Stanford Cars image classification: up to 89% using Mixup.

github.com

morganmcg1/Projects/blob/master/stanford-cars/4_stanford_cars_mixup.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Stanford Cars - NB4: Adding Mixup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A dive into whether using the [Mixup](https://arxiv.org/abs/1710.09412) data augmentation technique can improve model accuracy.\n",
    "\n",
    "### TL;DR\n",
    "- The highest test set accuracy of **89.4%** was found with Beta=0.4, the default parameter for fastai's Mixup and the recommended value from the original paper.\n",
    "- To achieve this I found that it was critical to train a final round **without mixup** in order to fully realise Mixup's potential. Test set accuracy after Stage 4 (no non-mixup training) was around **87%** for both Beta=0.4 and Beta=0.3. After the additional non-mixup training this accuracy **jumped to 89%**. This was not noted in the Mixup paper or in Fastai's documentation.\n",
    "\n",
    "**Beta=0.4**\n",

This file has been truncated.

Slowly getting to that 90%+ goal!
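
For anyone new to Mixup, here is a minimal sketch of the blending step described in the paper, using only the Python standard library. The function name `mixup_pair` and the flat-list inputs are made up for illustration; this is not fastai's implementation. The `max(lam, 1 - lam)` line is an optional convention that keeps the first example dominant (I believe fastai does something similar, but treat that as an assumption). It is not part of the paper's formula.

```python
import random

def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Blend two examples. x* are flat lists of pixel values, y* are one-hot lists.

    alpha is the Beta distribution parameter (the "Beta=0.4" mentioned above).
    """
    lam = rng.betavariate(alpha, alpha)
    # Optional convention (assumption): keep the first example dominant, lam >= 0.5
    lam = max(lam, 1.0 - lam)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Toy inputs: two "images" of two pixels each, with two classes
random.seed(0)
x, y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
print(lam, x, y)
```

With a small alpha most draws land near 0 or 1, so the majority of blended images stay close to one of the two originals. The final round without Mixup then trains on unblended images only.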

charming · February 25, 2019, 8:16am

nice work, I learned a lot here

umaidzz · July 9, 2019, 11:16am

Can anyone please tell me how you are able to make the databunch? I can’t get the classes right: class 1 is supposed to be a Hummer SUV, whereas in the images it’s actually some Audi sedan. Here are the images.
Here are the classes:

The 000001.jpg has the first class:

Whereas it’s actually this:

morgan · July 13, 2019, 11:48am

Yep, I had trouble with this too. @jianshen92 has some nice code here (cleaner than mine) which should help with your data import:

github.com

jianshen92/stanford-car-grab-challenge/blob/master/Stanford Car Model Training.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Stanford Car Model Training\n",
    "\n",
    "Model is built with fast.ai v1 and PyTorch v1, trained on Google Cloud Platform's Deep Learning VM with 16GB NVIDIA Tesla T4.\n",
    "\n",
    "Data consist of 8144 Training Images (80:20 Train:Validation Split) and 8041 Test Images. \n",
    "\n",
    "Architecture used is ResNet-152 with square images (299x299), pretrained on ImageNet. Data is augmented with several affine and perspective transformations. Mixup technique is used. \n",
    "\n",
    "Final Top-1 Accuracy is **92.53%** on Test Images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,

This file has been truncated.
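
One likely cause of the mismatch asked about above: the number in the filename is the image number, not the class id, so the first image does not have to belong to class 1. The label comes from the annotation file, and the class ids there start at 1 (my assumption, since the devkit is MATLAB-based), while a Python list of class names starts at 0. A rough sketch with made-up rows, not the real annotation values; `label_images` is a hypothetical helper:

```python
def label_images(annotations, class_names):
    """Map each filename to its class name.

    annotations: list of (filename, class_id), class_id assumed to start at 1
    class_names: list of names, where class_names[0] is class 1
    """
    labels = {}
    for fname, class_id in annotations:
        # Shift the 1-based class id to a 0-based list index
        labels[fname] = class_names[class_id - 1]
    return labels

# Made-up data for illustration only
class_names = ["Hummer SUV", "Audi sedan", "Acura coupe"]
annotations = [("00001.jpg", 2), ("00002.jpg", 1), ("00003.jpg", 3)]

labels = label_images(annotations, class_names)
print(labels["00001.jpg"])
```

Once you have a filename-to-label mapping like this, you can write it to a CSV or DataFrame and build the databunch from that instead of relying on file order.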

jianshen92 · July 13, 2019, 1:57pm

Thanks for quoting, @morgan. This project was an entry to an AI challenge by Grab, which is the Uber here in Southeast Asia. This submission made it to the top 50, so I’m very grateful to Jeremy and team for creating this library!

Happy to answer any questions regarding my notebook!

morgan · July 13, 2019, 6:23pm

Nice work! I’m going to run your code in the next day or two and see how it performs. I had been struggling to get over 90%!

morgan · July 15, 2019, 2:36pm

Hey @jianshen92, in your notebook above, was there a specific reason you trained your model in bursts of 5 epochs instead of, for example, just doing fit_one_cycle for 25 epochs? It seems to have worked well, so I’m just curious whether there is a little trick that I don’t know about.

jianshen92 · July 15, 2019, 5:08pm

@morgan It is something I found out empirically, and I have no strong mathematical explanation for it.

When I was experimenting I trained every cycle with 5 epochs because I wanted to try to tune the parameters after every checkpoint to see if the model would improve further without waiting for too long.

One thing is for sure, though: fitting one cycle for many epochs and fitting many one-cycle runs with fewer epochs are not exactly equivalent, because of the cyclic momentum and cyclic learning rate within a cycle. Perhaps fitting for many cycles enables the model to search deeper into the hyperspace’s minima because of the multiple cycles of increasing and decreasing learning rate.

Just a guess anyway!
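
To make the difference concrete, here is a rough sketch of the learning rate over time for one 25-epoch cycle versus five 5-epoch cycles. It only approximates a one-cycle schedule with cosine warm-up and cosine annealing. The defaults (`pct_start=0.3`, `div_factor=25`, `final_div`) are what I believe fastai v1 uses, so treat them as assumptions; `iters_per_epoch` and `lr_max` are made-up numbers, and momentum is left out.

```python
import math

def cos_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

def one_cycle_lrs(n_iter, lr_max, pct_start=0.3, div_factor=25.0, final_div=None):
    """Learning rate per iteration for a single warm-up then anneal cycle."""
    if final_div is None:
        final_div = div_factor * 1e4  # assumed default
    n_up = int(n_iter * pct_start)
    n_down = n_iter - n_up
    lr_low, lr_end = lr_max / div_factor, lr_max / final_div
    up = [cos_anneal(lr_low, lr_max, i / n_up) for i in range(n_up)]
    down = [cos_anneal(lr_max, lr_end, i / n_down) for i in range(n_down)]
    return up + down

def count_peaks(lrs, lr_max):
    """Number of iterations at which the schedule reaches lr_max."""
    return sum(1 for lr in lrs if math.isclose(lr, lr_max, rel_tol=1e-9))

iters_per_epoch = 100  # hypothetical
lr_max = 1e-3          # hypothetical

single = one_cycle_lrs(25 * iters_per_epoch, lr_max)
bursts = []
for _ in range(5):
    bursts += one_cycle_lrs(5 * iters_per_epoch, lr_max)

print(count_peaks(single, lr_max), count_peaks(bursts, lr_max))
```

Both schedules cover the same number of iterations, but the single cycle reaches the peak learning rate once, while the bursts return to it five times, each time restarting from a low learning rate. With bursts you can also pick a different peak for each cycle, which is not possible inside one long cycle.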

morgan · July 15, 2019, 6:21pm

Nice! I get the practicality of it, and I’ll be sure to try it in other projects! I’m trying to run your notebook now, starting with the weights from Facebook’s ResNeXt 101 WSL model (trained on 940M Instagram images and fine-tuned on ImageNet), but I’m hitting OOM errors at the moment. Will keep trying with a reduced batch size and/or a smaller model version.

jianshen92 · July 15, 2019, 6:38pm

I tried ResNeXt too, but just the one provided by torchvision. I ended up not using it because the improvement was minimal, around a 0.05% increase in accuracy, while training took much longer.

Let me know how it goes for you!

morgan · August 20, 2019, 9:00pm

Updating progress here: current accuracy is at 93.29% (with TTA), 93.19% without. Using the SQUISH resize transform as well as paying more attention to the LR at each stage really helped here!

github.com

morganmcg1/Projects/blob/master/stanford-cars/7_stanford_cars_mixup_resizeSQUISH.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Stanford Cars - NB7: Mixup + Resize=SQUISH"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TL;DR\n",
    "\n",
    "- Test set accuracy was **93.19%** (with TTA), my highest yet. Accuracy was 93.17% without TTA.\n",
    "- Top5 Accuracy was **98.93%** \n",
    "\n",
    "These results were in large part thanks to @jianshen92's excellent [implementation for a Grab competition](https://github.com/jianshen92/stanford-car-grab-challenge) (Grab is basically Uber for SE Asia) which by using **Squish resize** resulted in a big jump in performance.\n",
    "\n",

This file has been truncated.
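
A possible intuition for why squish helps on this dataset (my guess, not something measured in this thread): car photos tend to be wider than they are tall, so cropping to a square throws away part of the car, while squishing keeps every pixel at the cost of distorting the aspect ratio. The sketch below just does the arithmetic for a made-up 640x360 photo and a simplified center crop. The helper names are hypothetical, and fastai's own crop during training is not necessarily centered.

```python
def center_crop_box(width, height):
    """Largest centered square, as (left, top, right, bottom) in source pixels."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side

def kept_fraction_after_crop(width, height):
    """Share of the original pixels that survive a square crop."""
    side = min(width, height)
    return (side * side) / (width * height)

def squish_scale(width, height, size):
    """Per-axis scale factors when the whole image is resized to size x size."""
    return size / width, size / height

w, h = 640, 360  # hypothetical wide photo
print(center_crop_box(w, h))           # region a center crop would keep
print(kept_fraction_after_crop(w, h))  # share of pixels kept by the crop
print(squish_scale(w, h, 299))         # squish keeps all pixels, scales axes unevenly
```

For this example the crop discards the left and right 140 pixels, which is where the front and rear of a side-on car would usually be.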

jianshen92 · August 26, 2019, 12:02pm

Hey, congrats on pushing it up to 93%. As far as I know, the SOTA accuracy for this dataset is 94.8%, according to https://paperswithcode.com/sota/fine-grained-image-classification-on-stanford

morgan · August 26, 2019, 8:35pm

Thanks! Yep, I just started playing around with EfficientNet to see how close I can get to their SOTA on Stanford Cars. EfficientNets seem tricky to train though, have you tried them? Will keep playing around.

bluesky314 · August 27, 2019, 1:54pm

That’s awesome! Can you describe what you mean by “paying more attention to the LR at each stage really helped here!”? What did you observe and then change?

morgan · August 27, 2019, 8:04pm

Thanks @bluesky314. I just trained for a smaller number of epochs (5 or sometimes 10) and then ran lr_find() again to see whether it suggested changing the learning rate. I more or less copied how @jianshen92 had done it.

jianshen92 · August 27, 2019, 9:35pm

I haven’t tried EfficientNet; I haven’t worked on this project since the competition ended. I admire your perseverance on this!

Recently the RAdam and LookAhead optimizers came out, and the community is saying they almost guarantee an improvement in accuracy and allow us to use a higher learning rate. I think it is worth a try!

morgan · August 28, 2019, 7:00am

Thanks @jianshen92, I think I am just too unimaginative to try something else. I have been trying RAdam + LookAhead (Ranger) all right; RangerLars looks like it should be better again, going by the discussion here: https://forums.fast.ai/t/imagenette-woof-leaderboards-guidelines-for-proving-new-high-scores/52714/57

RangerLars: https://github.com/lessw2020/mish/blob/master/rangerlars.py

Have you seen this new Mish activation too? Going to swap it into EfficientNet and see how it goes: Meet Mish: New Activation function, possible successor to ReLU?

Crazy time for new techniques!!