I copied the Kaggle notebook from the course (cat vs dog), changed it a little bit (link to my version), and I get a 0% error rate when showing new images to the neural network. That’s perfect.
Then, I copied it and changed “cat” to “Tom Cruise” and “dog” to “Brad Pitt”:
There are only 2 small differences:
- I implemented a “top center crop” function to avoid cutting off Tom’s and Brad’s heads, which can happen when cropping from the center of the image
- I fine-tuned 5 epochs instead of 2
Except for these 2 small differences, it’s 100% the same code.
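For readers who have not opened the notebook, the idea of a top center crop can be sketched roughly as below. This is not the notebook's actual code: the function name `top_center_square_box` and the choice of a square crop with side `min(width, height)` are assumptions made for illustration.

```python
def top_center_square_box(width, height):
    """Return (left, top, right, bottom) for a square crop that is
    horizontally centered and anchored at the top edge of the image.

    Hypothetical helper: the name and the exact rule are assumptions,
    not taken from the original notebook.
    """
    side = min(width, height)      # largest square that fits in the image
    left = (width - side) // 2     # center the square horizontally
    return (left, 0, left + side, side)  # top stays at 0 so heads are kept


# Portrait image: the bottom is dropped, the top (the head) is kept.
print(top_center_square_box(300, 500))
# Landscape image: the left and right edges are trimmed equally.
print(top_center_square_box(500, 300))
```

With Pillow, such a box could be applied as `img.crop(top_center_square_box(*img.size))`, since `img.size` is `(width, height)` and `Image.crop` takes a `(left, upper, right, lower)` tuple.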
But now, instead of getting 0% error rate, I’m getting 40%.
I spent hours trying to understand what is going on but I don’t have a clue.
Do you have any clue?
I like your notebook. I am not 100% sure why error rate is very high, but I think the model is having a hard time trying to predict images that have longer height than width.
So, it may be due to your “top center crop” function on the test data.
Or maybe it is because train data looked very different from test data.
I think the crop function is fine, you can see what it generates just below the dls.show_batch code block. Images look fine.
I think the model is having a hard time trying to predict images that have longer height than width.
That was an interesting idea! So on my test set (the 10 images at the end), I also added the top_center_square_crop function. This way, all input images and the 10 test images are all squares.
This new crop on the 10 test images works well (see at the very bottom of the notebook, the 10 images are squares).
But unfortunately, that does not change the error rate… Still 40%…
Updated notebook: brad pitt vs tom cruise (custom item_tfms) | Kaggle
That is interesting.
I think the problem was that you did not have enough data for training and testing. Using only 10 images for testing is not really good. The more data, the better. Using data augmentation is a good way to compensate for a small amount of data. With more data, we can use a deeper network, which can provide more opportunities for our model to learn little details.
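To make the point about the 10 test images concrete, here is a small sketch that puts a rough 95% confidence interval around an error rate measured on a handful of images. It uses the Wilson score interval, which is a choice made here for illustration and not something from either notebook; the function name `wilson_interval` is likewise hypothetical.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Approximate 95% Wilson score interval for an error rate of errors/n.

    Illustrative helper (an assumption, not from the notebooks).
    """
    p = errors / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# 40% error on 10 test images means 4 mistakes out of 10.
low, high = wilson_interval(4, 10)
print(f"4 errors out of 10: plausible error rate from {low:.0%} to {high:.0%}")
```

For 4 mistakes out of 10 the interval spans roughly 17% to 69%, so a single extra mistake or lucky guess moves the reported number by 10 points. A larger test set narrows that range considerably.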
I played around with it a little bit, and I was able to reduce the error rate to 20% in this notebook. You can take a look at it now, or you can try it out on your own and come back.
Here are some changes I made:
- Instead of searching for Tom Cruise with suit, I decided to search for his face.
- So, I did not use the
- I added batch_tfms to the dataloaders.
- I increased the max images downloaded to 1000.
- I ran the lr_find and reduced the learning rate to 0.001.
- I used resnet50 instead of resnet34.
And it is down to 20%. There can be more improvements to the model.
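In fastai terms, the changes listed above might look something like the sketch below. It is not the code from either notebook: the folder layout (one subfolder per person under `path`), the image size, the validation split, and the number of epochs are all assumptions, and the function name `build_and_train` is hypothetical. Only `batch_tfms`, `lr_find`, the 0.001 learning rate, and `resnet50` come from the list above.

```python
def build_and_train(path, epochs=5):
    """Train a two-class face classifier with augmentation and resnet50.

    Assumes `path` contains one subfolder per person (for example
    tom_cruise/ and brad_pitt/); this layout is an assumption.
    """
    # Imported here so the sketch can be read without fastai installed.
    from fastai.vision.all import (
        DataBlock, ImageBlock, CategoryBlock, get_image_files,
        RandomSplitter, parent_label, Resize, aug_transforms,
        vision_learner, resnet50, error_rate,
    )

    dls = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        get_items=get_image_files,
        splitter=RandomSplitter(valid_pct=0.2, seed=42),  # assumed split
        get_y=parent_label,                 # label = name of parent folder
        item_tfms=Resize(192),              # assumed size, default crop
        batch_tfms=aug_transforms(),        # data augmentation on each batch
    ).dataloaders(path)

    learn = vision_learner(dls, resnet50, metrics=error_rate)
    learn.lr_find()                         # inspect the plot to pick a rate
    learn.fine_tune(epochs, base_lr=1e-3)   # 0.001, as mentioned in the list
    return learn
```

`aug_transforms()` applies random flips, rotations, zooms, and lighting changes, which is one way to stretch a small dataset. The epoch count defaults to 5 here only because that is what the first notebook used.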
Thanks, I would like to have a look at it.
I’m not sure if you clicked “Share” and then “Save” on your notebook because I get the Kaggle 404 error page when trying to open your link.
Sorry, it was private. You should be able to look at it now.
I was able to access it.
You seem to be more advanced in the course than I am. I don’t know about fastai data augmentation and learning rate functions yet (I have just read chapter 1 of the book and watched video number 1), but I have already heard about these concepts, so that makes sense. I suppose I’ll get to them quite soon in the course, to get the best out of my neural networks.
Your improvements are very interesting, thank you for taking the time to help.
I had fun with it as well. Keep working on the course. It is very fun. And if you need help, the community is here for you.