Why did Jeremy use VGG for cats vs dogs, but not for State Farm?

I’m looking for some intuition: for cats vs dogs Jeremy chose to use VGG16, but for State Farm he didn’t, and built his own network instead.

One could argue that VGG16 was trained on cats and dogs (among other things), so it is appropriate for cats vs dogs, but it wasn’t trained on different kinds of human behavior (like driving, texting, or eating).

But I’m not sure that argument stands, because if I remember correctly, Jeremy stated that the convolutional layers of VGG16 are “generic image descriptors” and should apply to any kind of image.



According to the statefarm notebook, Jeremy did use VGG for the State Farm Kaggle competition.

From skimming the notebook, it seems Jeremy started with simple networks to see which changes had which effects on performance.

Some headings:

  • Single conv layer
  • Data augmentation
  • Four conv/pooling pairs + dropout
  • Imagenet conv features
  • Batchnorm dense layers on pretrained conv layers
  • Pre-computed data augmentation + dropout
  • Pseudo labeling
  • The “things that didn’t really work” section

You can see from the headings that he’s trying many things and many combinations of things. He even added a section about the things he tried that didn’t work and weren’t worth focusing on in the lesson.

Whenever you see “Imagenet conv” or “pretrained conv” in an image task, there’s a good chance the author is using VGG. (Although I wouldn’t be surprised if this statement became false within the next six months, given the speed of the community’s research.)
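For intuition, the “Imagenet conv features” idea in those headings boils down to: run every image through the frozen conv layers *once*, cache the outputs, then train only a small, cheap head on those cached features. Here’s a minimal NumPy sketch of that workflow — note the `conv_features` function is a stand-in I made up (a fixed random projection plus ReLU), not VGG16’s actual conv stack, and the dataset is fake:

```python
import numpy as np

def conv_features(images):
    # Stand-in for VGG16's frozen convolutional layers.  In the real
    # notebook this would be a predict() call on the conv-only part of
    # VGG16; here it's a fixed random projection plus ReLU, so it is
    # deterministic and "frozen" like the real thing.
    flat = images.reshape(len(images), -1)
    proj = np.random.default_rng(42).standard_normal((flat.shape[1], 16))
    return np.maximum(flat @ proj, 0.0)

rng = np.random.default_rng(0)

# Fake dataset: 100 tiny "images" and 10 classes
# (the State Farm competition has 10 driver-behavior classes).
images = rng.standard_normal((100, 8, 8, 3))
labels = rng.integers(0, 10, size=100)

# Step 1: run the (frozen) conv layers over the data ONCE and keep the
# result; in practice you'd save these arrays to disk so later
# experiments never re-run the expensive conv forward pass.
feats = conv_features(images)

# Step 2: train only a cheap classifier head on the cached features
# (plain softmax regression by gradient descent, as a minimal head).
W = np.zeros((feats.shape[1], 10))
onehot = np.eye(10)[labels]
for _ in range(200):
    logits = feats @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    W -= 0.01 * feats.T @ (p - onehot) / len(feats)

train_acc = ((feats @ W).argmax(axis=1) == labels).mean()
print(f"train accuracy of the head on cached features: {train_acc:.2f}")
```

This is also why experiments like “Batchnorm dense layers on pretrained conv layers” are so fast to iterate on: step 2 is the only part you rerun.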

Edit: Maybe you were looking at the statefarm sample notebook. There, Jeremy and/or Rachel even tried a linear model, likely to establish a benchmark and to test that the pipeline works end to end.

You are correct, my mistake.
I asked the question at the beginning of the lecture; later he actually used VGG16 :slight_smile: