I participated in a scene classification hackathon and, using the fastai library, did pretty well, finishing in the top 5% or so. However, my model was getting confused a lot between certain pairs of classes (mountains vs. glaciers and buildings vs. streets). Although these classes are easy to mistake for one another, what can I do to improve my model? I’ve tried progressive image resizing, which improved the accuracy a bit, but the confusion still exists.
One idea that came to my mind was to train a model using only the classes it doesn’t get confused on, and then add the other ones later. But that would cause a problem at the output of the network: at first there might be 4 output neurons, and later we would require more.
Any suggestions or feedback would be handy.
I’ve been thinking about the same for some time now.
I thought about combining each group of those confusing classes into one class in the original model, then training a separate model on each group of classes that gets triggered after the original model.
This is still just a thought. I haven’t done any experiments yet to see how feasible this is and whether the individual models can really differentiate between these confusing classes any better.
Looking forward to hearing more ideas!
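The two-stage idea could be wired up with a simple routing step at inference time. This is only a sketch under assumed interfaces: `coarse_model` and the entries of `fine_models` are hypothetical callables that take an image and return a label string.

```python
# Sketch of a two-stage classifier: a coarse model predicts merged
# "super-classes", and a dedicated fine model is triggered for each
# merged group of confusable classes. All model objects here are
# hypothetical callables (image -> label string).

def classify_two_stage(image, coarse_model, fine_models):
    """Route an image through the coarse model, then through a
    fine model if the coarse prediction is a merged group."""
    coarse_label = coarse_model(image)
    fine = fine_models.get(coarse_label)
    if fine is not None:
        # The coarse prediction is a merged group such as
        # 'mountain_or_glacier'; let the specialist model decide.
        return fine(image)
    return coarse_label

# Toy usage with stand-in models:
coarse = lambda img: 'mountain_or_glacier' if img == 'snowy' else 'forest'
fine = {'mountain_or_glacier': lambda img: 'glacier'}

print(classify_two_stage('snowy', coarse, fine))  # -> glacier
print(classify_two_stage('leafy', coarse, fine))  # -> forest
```

The fine models only ever see the classes inside their own group, so each one solves a much smaller (often binary) problem than the original classifier.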
Have you tried penalizing those specific classes more? Once you get a good separation, you can turn the penalty off and train with a lower learning rate.
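For reference, one straightforward way to penalize specific classes more is a class-weighted cross-entropy loss; in fastai v1 you can assign such a loss to `learn.loss_func`. The class order and weight values below are illustrative, not tuned.

```python
import torch
import torch.nn as nn

# Suppose the class order is [buildings, streets, mountains, glaciers, sea]
# and the model keeps confusing the first two pairs. Up-weighting those
# classes makes their mistakes cost more during training (weights here
# are illustrative; tune them on a validation set).
class_weights = torch.tensor([2.0, 2.0, 2.0, 2.0, 1.0])
weighted_loss = nn.CrossEntropyLoss(weight=class_weights)

# In fastai v1 you would plug it in with:  learn.loss_func = weighted_loss
# Later, to "turn the penalty off", reassign a plain nn.CrossEntropyLoss()
# and keep training with a lower learning rate.

# Quick check: with reduction='none', the weight simply scales the
# per-sample loss according to that sample's target class.
logits = torch.tensor([[1.0, 0.2, 0.1, 0.0, -0.5],
                       [0.1, 2.0, 0.3, 0.0, -0.2]])
targets = torch.tensor([0, 4])
plain = nn.CrossEntropyLoss(reduction='none')(logits, targets)
scaled = nn.CrossEntropyLoss(weight=class_weights, reduction='none')(logits, targets)
print(scaled / plain)  # -> tensor([2., 1.])
```

Note that with the default `reduction='mean'`, PyTorch also divides by the sum of the sample weights, so the effect on the averaged loss is a re-balancing rather than a plain multiplication.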
Hey @Wesley, I haven’t tried penalizing the classes more. I’ll try to do that and get back. Thanks
Do you plan to give it a try? I will give it a try if I get the time. However, I feel that since the model is getting confused between them anyway, using a separate model for them won’t solve the issue; I believe something has to be done with the data. For example, in the competition I took part in, my model got confused between streets and buildings. This was because pictures of buildings had streets in them, and pictures of streets had buildings in the background. There must be some other way around this.
Have I understood correctly that you train a model to classify images, meaning each image gets exactly one class? If so, it is quite obvious that there will be confusion when an image contains content from several of the model’s classes.
Maybe you could do multi-label classification or segmentation and use the output to decide which class is most probable?
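To illustrate the multi-label suggestion: with one sigmoid per class (as in multi-label training with a binary cross-entropy loss), you can still report a single class by taking the most probable one, while the per-class probabilities tell you when two scene types are genuinely both present. The class names and logits below are made up.

```python
import math

# Hypothetical per-class logits from a multi-label head (one independent
# sigmoid per class, as used with a binary cross-entropy loss).
classes = ['buildings', 'streets', 'mountains', 'glaciers']
logits = [1.2, 0.8, -2.0, -3.0]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

probs = [sigmoid(z) for z in logits]

# Multi-label view: everything above a threshold counts as "present".
present = [c for c, p in zip(classes, probs) if p > 0.5]

# Single-label answer for the competition: just take the most probable class.
best = classes[max(range(len(probs)), key=probs.__getitem__)]

print(present)  # -> ['buildings', 'streets']  (both scene types detected)
print(best)     # -> buildings
```

An image of a street lined with buildings would light up both sigmoids, which is exactly the signal a single-softmax model has to throw away.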
Yes, but the challenge wasn’t a multi-label classification challenge, nor was the training data in that format. Check out the hackathon details here. I just checked that 2 of the top 50 participants have shared their approach. They’ve used the fastai library as well. Sharing their solutions here:
Luckily, I’ve got some interesting results using a kind of “Curriculum Learning”: feeding the model the high-frequency classes first and the low-frequency ones later (using frequency alone is a very naive approach, as @aamir7117 has told me many times).
Focusing on samples where the model is not “confused” is way better and actually pretty simple to implement if you pass your own class list in the labelling step.
Train first with all the data and measure the performance.
Suppose your model is good on classes A and B but confused on C, D and E. Then:
- Create the classes list
allClasses = ['A', 'B', 'C', 'D', 'E']
- Create a databunch for the unconfused classes only (dfAB here is the dataframe filtered down to rows labelled A or B)
dataAB = (ImageList.from_df(dfAB, path, cols='Image')
          .split_by_rand_pct()
          .label_from_df('LBL', classes=allClasses)  # THIS IS THE IMPORTANT STEP!
          .databunch())
- Create the learner with dataAB
learn = cnn_learner(dataAB, models.resnet50)
- Train the model (it should reach higher accuracy on these easier classes)
- Create a databunch for all the classes (dataAll) the same way, again passing classes=allClasses
- Swap the databunch in the learner
learn.data = dataAll
- Train with all classes
Because classes=allClasses is passed both times, the final layer has one output per class from the start, so the architecture never needs to change.
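Why `classes=allClasses` matters can be shown with a tiny label-encoding sketch (plain Python, no fastai needed): because even the subset’s labels are encoded against the full class list, the classifier head is sized for all five classes from the start, so swapping in dataAll later never changes the number of output neurons.

```python
# Encode labels against the FULL class list, even when training on a subset.
allClasses = ['A', 'B', 'C', 'D', 'E']
class_to_idx = {c: i for i, c in enumerate(allClasses)}

# Stage 1: only the "easy" classes appear in the training labels...
subset_labels = ['A', 'B', 'B', 'A']
encoded = [class_to_idx[lbl] for lbl in subset_labels]

# ...but the output layer is still sized for every class, so the same
# head works unchanged when the confusing classes C, D, E arrive later.
n_out = len(allClasses)

print(encoded)  # -> [0, 1, 1, 0]
print(n_out)    # -> 5
```

If the subset were labelled with its own two-class list instead, stage 2 would need a new 5-neuron head and the stage-1 weights for that layer would be thrown away.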