I’m looking for some intuition: for cats vs. dogs, Jeremy chose to use VGG16, but for State Farm he didn’t, and created his own network instead.
One could argue that the reason is that VGG16 was trained on cats and dogs (among other things), so it is appropriate for cats vs. dogs, but it wasn’t trained on different kinds of human behavior (like driving, texting, or eating).
But I’m not sure that argument holds, because if I remember correctly Jeremy stated that the convolutional layers of VGG16 are “generic image descriptors” and should apply to any kind of image.
According to the statefarm notebook, Jeremy did use VGG for the State Farm Kaggle competition.
From skimming the notebook, it seems Jeremy started with simple networks to see which changes had what effect on performance:
- Single conv layer
- Data augmentation
- Four conv/pooling pairs + dropout
- Imagenet conv features
- Batchnorm dense layers on pretrained conv layers
- Pre-computed data augmentation + dropout
- Pseudo labeling
- The “things that didn’t really work” section
You can see from the headings that he’s trying many things and many combinations of them. He even added a section on the things he tried that didn’t work and weren’t worth focusing on in the lesson.
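Of those headings, pseudo labeling is probably the least self-explanatory. The idea is to fit on the labeled data, predict labels for the unlabeled (test) images, and then retrain on both sets together. A minimal sketch, where all the names (`model`, `x_train`, `x_unlabeled`, etc.) are placeholders rather than anything from the notebook:

```python
import numpy as np

def pseudo_label(model, x_train, y_train, x_unlabeled, epochs=3):
    # 1. Fit on the labeled data we actually have.
    model.fit(x_train, y_train, epochs=epochs)
    # 2. Predict labels for the unlabeled set and treat the predicted
    #    probabilities as if they were ground truth ("pseudo labels").
    pseudo_y = model.predict(x_unlabeled)
    # 3. Retrain on the combined set; the pseudo-labeled examples act
    #    as a form of regularization.
    x_all = np.concatenate([x_train, x_unlabeled])
    y_all = np.concatenate([y_train, pseudo_y])
    model.fit(x_all, y_all, epochs=epochs)
    return model
```

In practice you would also mix labeled and pseudo-labeled examples within each batch (rather than just concatenating), since too high a proportion of pseudo labels can swamp the real signal.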
Whenever you see “Imagenet conv” or “pretrained conv” in an image task, there’s a good chance the author is using VGG. (Although I wouldn’t be surprised if this statement becomes false within the next six months, given the pace of the community’s research.)
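For anyone wondering what “pretrained conv layers with new dense layers on top” looks like concretely, here is a minimal Keras sketch. The layer sizes and the 10-class output are assumptions for illustration (State Farm has 10 driver classes); in practice you would pass `weights='imagenet'` (set to `None` here only to avoid the download):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# include_top=False drops VGG16's dense layers, keeping only the conv
# blocks -- the "generic image descriptor" part.
conv_base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
conv_base.trainable = False  # freeze the pretrained conv layers

# New, trainable dense head for the task-specific classes.
model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),  # assumed 10 output classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
```

Since the conv base is frozen, you can also pre-compute its outputs once for the whole dataset and train the dense head on those features, which is much faster than pushing every image through VGG on every epoch.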
Edit: Maybe you were looking at the statefarm sample notebook. There, Jeremy and/or Rachel even tried a linear model, likely to get a benchmark and to test the plumbing of the system.
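A linear model on images is just a single dense layer over the flattened pixels, i.e. multinomial logistic regression. A minimal sketch (the 224×224×3 input shape and 10 classes are assumptions, not taken from the notebook):

```python
from tensorflow.keras import layers, models

# One weight per pixel per class -- no convolutions, no hidden layers.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```

It will never win the competition, but it trains in seconds and immediately tells you whether your data loading and label encoding are wired up correctly.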
You are correct, my mistake.
I asked the question at the beginning of the lecture; later on he actually used VGG16.