I am just getting started with the outstanding fast.ai library and feel that I need a little help with my new project. I would really appreciate it if anyone could help me with my questions (simple ones, I guess).
What else can I do to avoid overfitting if I am already using data augmentation?
Why does the learning rate plot look so strange?
Why is the confusion matrix not showing all the results?
Thanks in advance to anyone willing to help me.
It’s showing the results for your validation set; from what I can see, your test set was never used.
How are you splitting the data? Are the classes balanced? (i.e. is there an equal number of samples for each class?) If not, you should look into either oversampling techniques or weighted cross-entropy to deal with this.
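To sketch the weighting idea: you give rarer classes a larger weight so their mistakes cost more. The class counts below are made up for illustration, not your dataset's actual numbers; in PyTorch you would pass the resulting weights (as a tensor, in class-index order) to `nn.CrossEntropyLoss(weight=...)`.

```python
def class_weights(counts):
    """Inverse-frequency weights: rarer classes get larger weights.

    counts: dict mapping class name -> number of training samples.
    Normalized so a perfectly balanced dataset gives every class weight 1.0.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {c: total / (n_classes * k) for c, k in counts.items()}

# Hypothetical, imbalanced counts (NOT the real dataset numbers):
weights = class_weights({"normal": 100, "pneumonia": 300})
print(weights)  # the rarer class "normal" gets the larger weight
```

The exact weighting scheme (inverse frequency here) is a common choice, not the only one; effective-number-of-samples weighting is another option.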
Having a larger validation set to work from should help with this too, as yours is only 16 images.
Otherwise you could try TTA (though in practice test-time augmentation takes roughly 4x as long, since inference is run on 4x as many images).
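The idea behind TTA is just to average the model’s predictions over several augmented copies of each test image; fastai v1 wraps this up as `learn.TTA()`. A minimal, framework-free sketch of the averaging step (the probability values are made up):

```python
def tta_average(pred_runs):
    """Average per-class probability vectors from several augmented passes.

    pred_runs: list of prediction vectors, one per augmented version of the
    same image; each vector is a list of per-class probabilities.
    """
    n = len(pred_runs)
    return [sum(col) / n for col in zip(*pred_runs)]

# Two hypothetical passes over one image, classes = [normal, pneumonia]:
avg = tta_average([[0.2, 0.8], [0.4, 0.6]])
print(avg)
```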
This may help you take advantage of the data you have. It’s from Hiromi’s Lesson 3 notes at https://github.com/hiromis/notes/blob/master/Lesson3.md. You could also watch the Lesson 3 video at the timestamp given there to hear Jeremy talk about it.
So here’s the trick [51:01]. When I created my dataset, I put
size=128, and actually the images that Kaggle gave us are 256. I used the size of 128 partially because I wanted to experiment quickly. It’s much quicker and easier to use small images to experiment. But there’s a second reason. I now have a model that’s pretty good at recognizing the contents of 128 by 128 satellite images. So what am I going to do if I now want to create a model that’s pretty good at 256 by 256 satellite images? Why don’t I use transfer learning? Why don’t I start with the model that’s good at 128 by 128 images and fine-tune that? So don’t start again. That’s actually going to be really interesting because if I trained quite a lot and I’m on the verge of overfitting, then I’m basically creating a whole new dataset effectively, one where my images are twice the size on each axis, so four times bigger. So it’s really a totally different dataset as far as my convolutional neural network is concerned. So I get to lose all that overfitting. I get to start again. Let’s keep our same learner but use a new data bunch where the data bunch is 256 by 256. That’s why I actually stopped at that point, before I created my datasets.
I have done projects with X-Rays previously and found more lighting transforms helpful overall. I know you didn’t ask for data augmentation stuff, just thought I would throw that out there also
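To illustrate what a lighting transform does at the pixel level: it scales image brightness (and/or contrast) by a random factor so the model sees under- and over-exposed variants. In fastai v1 this is configured via `get_transforms(max_lighting=...)`; the toy function below is just a sketch on 8-bit pixel values, not the library’s implementation.

```python
def adjust_brightness(pixels, factor):
    """Scale pixel intensities by `factor`, clamping to the 0-255 range."""
    return [max(0, min(255, round(p * factor))) for p in pixels]

bright = adjust_brightness([40, 100, 200], 1.5)  # brighten; 300 clamps to 255
dark = adjust_brightness([40, 100, 200], 0.5)    # darken
print(bright, dark)
```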
Thank you @muellerzr for your kind answer.
The original Kaggle validation set is surprisingly small; I don’t know why. Having just 16 items, compared with a training set of 4500, seems worthless. Is this normal in Kaggle competitions? The original structure is:
Test --> normal (234)
After also trying what @Ezno suggested, I can’t get anything useful. I will try weighted cross-entropy, although I have never heard of it before. I am also trying to balance the dataset.
Why is the test set not being used? Shouldn’t the confusion matrix use the test set?
Thanks in advance.
Thanks @Ezno for your kind help. I am going to try size 128 and lighting transforms.