I have a question about training a model. If even experts can’t accurately predict a category from images alone (e.g. certain dog breeds that look very similar), is there any hope of ever building a highly accurate classifier for it? Should a different approach be used instead? Using the dog breed example again, maybe incorporating weight data or some other measurement that distinguishes two very similar-looking breeds?
From previous iterations of the course I know that there’s always what feels like a huge hump when we get to chapter 4 (or its equivalent in pre-book courses). My question would be a practical one: what’s the best way to practice with the concepts / contents of this lesson so that we really grasp what’s going on?
My own initial answer would be:

- read the notebook / run the cells
- recall the main high-level steps of what we did in that chapter 4 notebook, and why we did each one
- try to replicate the chapter 4 notebook from scratch without referring to the original
From previous experience I remember I found this hard, so I’m interested to learn what other things I might do in addition to help with understanding / learning.
I highly recommend looking at the Khan Academy material on derivatives and matrix multiplication if (like me) you left calculus back in high school. Getting a solid understanding of those concepts was important foundational knowledge for other concepts I encountered in this chapter.
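As a quick self-check on those two concepts, something like this helped me: a numerical derivative check and a small matrix multiply (my own toy example, not from the Khan Academy material):

```python
import numpy as np

# Analytic derivative of f(x) = x**2 is 2*x; verify it numerically
# with a finite difference -- the same idea autograd automates.
def f(x):
    return x ** 2

def numeric_grad(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

print(numeric_grad(f, 3.0))  # close to 2 * 3.0 = 6.0

# Matrix multiplication: a (2x3) matrix times a (3x2) matrix gives (2x2)
A = np.arange(6).reshape(2, 3)
B = np.arange(6).reshape(3, 2)
print(A @ B)
```

If you can predict the shapes and the numbers before running this, you probably know enough to move on.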
Thank you. I remember the instruction to take a look at those, but it wasn’t clear to me where to start and stop with those lessons. I felt like I could lose a week just working on those Khan Academy materials…
Do you mean like finding new classes in an unsupervised manner?
You could take a pretrained classifier, apply it to your new dataset to extract features, and then cluster those features. If you search for “unsupervised image classification”, a bunch of papers and resources on SOTA methods for doing this come up.
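As a rough sketch of that pipeline (assuming torchvision and scikit-learn are available; the image batch is a stand-in for real data, and in practice you’d pass pretrained weights rather than `None`):

```python
import torch
from torchvision.models import resnet18
from sklearn.cluster import KMeans

# Use weights="DEFAULT" in practice to get pretrained features;
# None here just keeps the sketch runnable without a download.
model = resnet18(weights=None)
model.fc = torch.nn.Identity()  # drop the classification head, keep 512-d features
model.eval()

images = torch.randn(32, 3, 224, 224)  # stand-in for a real, normalized image batch

with torch.no_grad():
    feats = model(images)  # shape (32, 512)

# Cluster the feature vectors into candidate classes
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(feats.numpy())
print(clusters[:10])  # cluster id per image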
I always recommend actually working on some real datasets, especially Kaggle competitions.
When I took the fast.ai course for the first time, I basically went through Kaggle to find an interesting dataset for each task covered in the course. Being someone with a biomedical background, I focused on biomedical datasets: for image classification, I worked on a diabetic retinopathy dataset; for segmentation, on ultrasound segmentation; etc. I also focused on ongoing competitions. I applied the tabular module to an earthquake prediction competition. I selected a bad model (just a public notebook) for the final submission, but the fastai baseline model would have gotten me a silver medal in the competition!
I especially focused on the diabetic retinopathy dataset, trying out different things that were discussed in the class (so of course the whole fine-tuning process, making a demo, then trying to improve the data, etc.). A couple months later, I was very lucky that a new diabetic retinopathy competition started and after teaming up with someone, got my first Kaggle silver medal.
My suggestion is to dig into the code. I learned a lot about what’s going on by digging into the fastai codebase. And many of my questions would be answered after walking through the code. Another useful thing is to see how the code corresponds to the math behind some technique. Jeremy will mention this frequently as well, but a lot of techniques with complicated math or terminology will turn out to be fairly simple code snippets.
I don’t know if I answered your question exactly, but I hope it helps nevertheless.
Thank you for taking the time to answer. I 100% find that useful for every class apart from chapter 4.
That chapter in particular wasn’t tied as much to the end-to-end process as it was to understanding what’s going on in the model-training part. So I’m specifically asking how to get better at things like broadcasting, working with tensors, SGD, and so on.
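For concreteness, the kind of from-scratch exercise I mean might look like this toy SGD loop (my own sketch in the chapter-4 style, not code from the book):

```python
import torch

torch.manual_seed(0)

# Toy linear regression trained with a hand-written SGD loop.
x = torch.linspace(0, 1, 100).unsqueeze(1)    # shape (100, 1)
y = 3 * x + 2 + 0.1 * torch.randn_like(x)     # noisy targets from y = 3x + 2

w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

for _ in range(500):
    pred = x * w + b              # broadcasting: (100,1) * (1,) -> (100,1)
    loss = ((pred - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        w -= lr * w.grad          # the SGD update step
        b -= lr * b.grad
        w.grad.zero_()
        b.grad.zero_()

print(w.item(), b.item())  # should approach 3 and 2
```

Rewriting a loop like this from memory, then swapping in a real dataset, is a good way to make broadcasting and the gradient steps stick.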
Have you tried doing the questionnaire and further research prompts (without looking at the book while doing it)? I personally found them quite helpful for solidifying my knowledge, even as someone experienced with much of this content.
That’s a great idea I’ll add to that list. I always forget about the questionnaire. I might try pairing it with writing some short blogs that tackle the answers to the questions, since I generally feel like I learn better when I try to write things down to explain them. Thank you!
Is this a problem? How can we prevent these types of ‘wrong’ classifications?
The solution I see is to use the prediction confidence score, and if it’s lower than a threshold, predict ‘unknown’. Any ideas? Any solutions from production? Thanks!
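A minimal sketch of the thresholding idea (the class names, logits, and the 0.8 cutoff are all made up; the threshold would need tuning on a validation set):

```python
import torch
import torch.nn.functional as F

# Map low-confidence predictions to "unknown".
classes = ["quartz", "calcite", "feldspar"]
logits = torch.tensor([[4.0, 0.5, 0.2],    # model is confident here
                       [1.0, 0.9, 0.8]])   # model is not confident here

probs = F.softmax(logits, dim=1)
conf, idx = probs.max(dim=1)               # top probability and its class index

threshold = 0.8
preds = [classes[i] if c >= threshold else "unknown"
         for c, i in zip(conf.tolist(), idx.tolist())]
print(preds)  # ['quartz', 'unknown']
```

One caveat: softmax confidence is known to be overconfident on out-of-distribution inputs, which is why the sigmoid and OOD-detection approaches mentioned elsewhere in this thread exist.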
For some reason, those two images are not loading for me. Do you mind sharing screenshots instead? I might also note that something seems a little off in your code: when you run dls.show_batch(), only quartz images show up, and the confusion matrix fails to render properly.
It is not impossible for a neural network to successfully learn things that even an expert cannot see. There are many examples of this that I am aware of in medical datasets, like predicting cardiovascular risk factors from images of your eyes.
Broadly, getting more data for the harder classes is a major strategy. When worrying about whether the model can tell apart things that you find hard to distinguish, keep in mind that it will pick up on details that are almost imperceptible to the human eye (an example would be how models can recognise tumor types). So yes, with enough data, there should be hope of training an accurate classifier.
It’s two very similar images but different minerals. Not necessarily like “predicting cardiovascular risk factors from images of your eyes.” Maybe like if there were 2 different types of cells that looked almost identical? Hard for an expert to tell from just one image.
One thing you can do is train a new model with a “none” class, using images that don’t belong to any of the other breeds… Alternatively, you can use a sigmoid activation at the end instead of softmax.
There are also various “out-of-distribution detection” methods that you could maybe look into…
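A rough sketch of the sigmoid idea (illustrative class names, logits, and cutoff, not production code): with a sigmoid, each class gets an independent probability, so “none of the above” is simply the case where every probability is low.

```python
import torch

classes = ["beagle", "basset", "bloodhound"]
logits = torch.tensor([[3.0, -2.0, -1.5],    # looks like a beagle
                       [-1.0, -0.8, -1.2]])  # looks like none of them

# Sigmoid scores each class independently, unlike softmax,
# which forces the probabilities to sum to 1.
probs = torch.sigmoid(logits)
for row in probs:
    above = [c for c, p in zip(classes, row.tolist()) if p > 0.5]
    print(above if above else "none")
```

Training such a head uses a binary cross-entropy loss per class (multi-label style) rather than the usual cross-entropy over all classes.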
Although these photos look similar to us, there are many small-scale details here that a model can work with. It isn’t necessarily different from telling cell types apart: it all comes back to learnable convolutional filters. I would think that, with enough data, a model could tell the difference between these rocks.