For Lesson 1 I uploaded my own images of lions and bears - 12 training images and 8 validation images per class. I used 3 epochs and kept the learning rate at 0.01. The predictions were only just above random guessing. Reviewing the results, I noticed that image attributes like color and teeth might have confused the model. For example, the dark fur on a lion's belly could have been mistaken for the dark fur of a bear. This was just a guess.
When I increased the learning rate to 0.1, the predictions improved to 87% accuracy. Increasing it further to 0.15 worked best, reaching 100%. So it seems larger steps work best for this smaller dataset.
I suspect the mechanism is different. Even with a batch size of 1, over 3 epochs the model gets very few opportunities to update its weights. Hence the more learning it can do at each opportunity, the better off it will be in the end.
This sounds like a fun project. Did you get 100% on the validation set or the test set? Would you be willing to share the pictures? I wonder how similar the images within each class are to each other.
To explore the relationship between the learning rate and how the model learns, moving to a bigger dataset might be a good idea. If you'd rather stick to smaller datasets (I'm certainly a big fan of smaller datasets / models), I'm getting a lot of mileage out of CIFAR-10. Jeremy outlines how it can be used with the fastai library in lec #7. Might be worth taking a look at when you have a sec.
FYI: Empty plots may be caused by a batch size that is too small.
I encountered this once when doing some image classification work on my laptop (a 960M card) with a small batch size forced by my GPU's memory. When I moved to AWS and ran again with a reasonable batch size, m.lr_find() worked again.
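For anyone curious what lr_find is doing under the hood, here is a rough sketch of the idea (an LR range test), not the fastai implementation itself: run one update per mini-batch with an exponentially growing learning rate, record the loss, and stop once the loss blows up. One possible reason for an empty plot - and this is my guess, not something I've confirmed in the fastai source - is that very noisy losses from tiny batches can trip the early-stop condition after only a couple of iterations, leaving almost nothing to plot.

```python
def lr_range_test(n_batches, lr_start=1e-5, lr_end=10.0):
    """Sketch of an LR range test on a toy quadratic loss:
    one gradient step per mini-batch, with the learning rate
    growing exponentially from lr_start to lr_end."""
    lrs, losses = [], []
    w, target = 0.0, 5.0
    if n_batches < 2:
        return lrs, losses  # not enough points to draw a curve
    mult = (lr_end / lr_start) ** (1 / (n_batches - 1))
    lr = lr_start
    for _ in range(n_batches):
        loss = (w - target) ** 2
        if losses and loss > 4 * min(losses):
            break  # loss has exploded - stop the sweep here
        lrs.append(lr)
        losses.append(loss)
        w -= lr * 2 * (w - target)  # one gradient-descent update
        lr *= mult
    return lrs, losses

lrs, losses = lr_range_test(50)
# Plotting losses against lrs (log scale) would give the usual lr_find-style curve.
```

You'd normally pick a learning rate a bit below the point where the loss starts climbing on that curve.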