Improve my CNN network (image classification)

Hi all, I am working on an image classification challenge and have hit a dead end. My data consists of 800+ training images across 3 classes. The test dataset doesn’t have labels, so we only get a score after uploading predictions. The scoring metric is log loss (lower is better). Here are the architecture and process I followed, which got me a score of 0.34, whereas the top score is 0.14. I want to know what else I can try, and since this is my first challenge I want to learn more.

My architecture:

Using fastai with transfer learning and progressive resizing on a resnet50 model, I got a score of 0.37.

I ran the same process for densenet and efficientnetb4 and later ensembled the models' predictions to get 0.34.
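
For readers following along, here is a minimal sketch of what averaging model predictions and scoring them with log loss can look like. It is plain Python with made-up probabilities, not the poster's actual pipeline; the names `log_loss` and `average_ensemble` and the toy numbers are assumptions for illustration only.

```python
import math

def log_loss(y_true, probs, eps=1e-15):
    """Mean multiclass log loss.

    y_true: list of class indices; probs: one row of class probabilities per sample.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)

def average_ensemble(*model_probs):
    """Average per-sample class probabilities across models (equal weights)."""
    n_models = len(model_probs)
    return [
        [sum(vals) / n_models for vals in zip(*rows)]
        for rows in zip(*model_probs)
    ]

# Toy validation labels and predictions (hypothetical, 3 samples x 3 classes).
y_val = [0, 1, 2]
model_a_probs = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.2, 0.6]]
model_b_probs = [[0.6, 0.3, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]

ensemble_probs = average_ensemble(model_a_probs, model_b_probs)
for name, probs in [("a", model_a_probs), ("b", model_b_probs), ("ensemble", ensemble_probs)]:
    print(name, round(log_loss(y_val, probs), 4))
```

Scoring each model and the ensemble on a held-out validation set this way gives a local estimate of the leaderboard metric before uploading. Because log loss heavily penalizes confident wrong predictions, it is worth checking whether a few very confident mistakes dominate the total.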

For progressive resizing, I started at an image size of 32 and worked up to 224. I have also tried the albumentations package.

No matter what I do now, I can’t improve the score. Is there any other approach I can follow? I tried Keras with fine-tuning, but that only got me to 0.6 for a single model, and when I tried PyTorch with transfer learning the results were pretty bad.

Any help would be really appreciated.

One other approach I tried is cleaning the data: I used top losses from fastai to identify confusing images, and I removed some duplicate images.
I am currently out of ideas :(
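
For anyone wanting to automate the duplicate check, here is a rough sketch that groups byte-identical files by hash. The function name `find_exact_duplicates` and the file patterns are assumptions of mine, and it only catches exact copies; resized or re-encoded duplicates would need something like a perceptual hash instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_exact_duplicates(image_dir, patterns=("*.jpg", "*.jpeg", "*.png")):
    """Return groups of files under image_dir whose bytes are identical."""
    by_hash = defaultdict(list)
    for pattern in patterns:
        for path in sorted(Path(image_dir).rglob(pattern)):
            # Hash the raw file contents; identical files share a digest.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    # Keep only hashes that were seen more than once.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Calling it on the training folder returns a list of groups. It is worth checking whether the files in a group carry the same label, since duplicates with conflicting labels are a likely source of high-loss images.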

There are folks who got good scores like 0.2 or even 0.14.
I just want to know if there is any other approach I can try.

I think the key is figuring out what the problem is. Here are some questions I would ask.

  1. Is your training data set balanced? Or is it skewed towards one of the classes?
  2. What is your validation accuracy versus your training accuracy? If the validation accuracy is a lot lower, consider using regularization.
  3. Manually look at what images your architecture is getting wrong, and categorize them. For example, you already removed duplicates. Another error might be your NN can’t handle light images well. Afterwards, try to tackle the top “problems” and work your way down.
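
As a starting point for questions 1 and 3, a small sketch like the following can summarize class balance and tally which (true, predicted) pairs go wrong most often. The helper names and the toy labels are hypothetical; in practice the labels and predictions would come from your own validation set.

```python
from collections import Counter

def class_distribution(labels):
    """Fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

def confusion_pairs(y_true, y_pred):
    """Count (true, predicted) pairs for misclassified samples, most common first."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common()

# Toy example with three hypothetical classes.
y_true = ["a", "a", "b", "c", "c"]
y_pred = ["a", "b", "b", "a", "a"]
print(class_distribution(y_true))
print(confusion_pairs(y_true, y_pred))
```

The most frequent pair tells you which images to look at first when categorizing the errors by hand.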

Hi Daniel,

  1. It is not balanced, but it is not heavily skewed towards any one class either. (Is there an option to add class weights? I know how to do that in Keras but not in fastai.)
  2. I noticed that the validation accuracy is lower than the training accuracy, so I am currently running a model after changing a parameter (true_wd=False, so that it adds L2 regularization).
  3. I am not sure how to tackle such scenarios. For the misclassifications, I manually edited some classes, but how should I handle the data in such cases?
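
On the class-weight question, one common heuristic is inverse-frequency weighting, sketched below in plain Python. The function name is an assumption; the formula is the usual "balanced" one, where each class gets n_samples / (n_classes * class_count).

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """weight_c = n_samples / (n_classes * count_c).

    A perfectly balanced dataset gives a weight of 1.0 for every class;
    rarer classes get weights above 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

# Toy label list (hypothetical counts): class 0 is twice as common as 1 and 2.
labels = [0] * 6 + [1] * 3 + [2] * 3
print(inverse_frequency_weights(labels))
```

In PyTorch, weights like these can be passed as a tensor to the `weight` argument of `torch.nn.CrossEntropyLoss`. How to attach such a loss to a fastai learner depends on the fastai version, so treat that part as something to confirm in the docs. Also note that weighting changes the objective being trained, so it may not help an unweighted log loss metric when the imbalance is mild.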

Hi Akshat,

I’d recommend these three videos

  1. Train/Dev/Test Sets
  2. Bias/Variance
  3. Basic Recipe for Deep Learning

If you already understand mismatched training/validation/test sets, and high variance vs high bias, then skip to video 3. Focus on video 3.
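
The high bias vs high variance distinction can be turned into a rough triage rule. The sketch below is my own simplification, not something taken from the videos: the function name, the messages, and the example numbers are all assumptions, and it simply checks the training loss against the target before looking at validation.

```python
def diagnose(train_loss, val_loss, target_loss):
    """Very rough bias/variance triage based on losses (lower is better)."""
    if train_loss > target_loss:
        # The model does not even fit the training data well enough.
        return "high bias: try a bigger network or train longer"
    if val_loss > target_loss:
        # Training is fine but the model does not generalize.
        return "high variance: try more data, augmentation, or regularization"
    return "on target"

# Hypothetical numbers, using 0.14 (the top leaderboard score) as the target.
print(diagnose(train_loss=0.30, val_loss=0.35, target_loss=0.14))
print(diagnose(train_loss=0.05, val_loss=0.34, target_loss=0.14))
```

The order of the checks matters: there is little point in fighting overfitting while the training loss is still above the score you are aiming for.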

After reading your reply, some thoughts popped into my mind.

  1. In order for people to help more, I think you need to give more details. I think these are general things people post when looking for help. Is it possible to see a couple examples of your data set? Also, could you post your training + validation accuracy curves? What are the training image pixel sizes?
  2. Is your training loss at least lower than 0.2 or 0.14? If your training loss is not lower than these numbers, then your validation and test loss most likely won’t be lower either. A general method is to first get your training loss very low (with a bigger network), and then work on preventing overfitting (regularization or more data).
  3. Take out the L2 regularization for now, and use data augmentation to get more data. I think people tend to not use L1/L2 regularization for image classification, and go for more data to prevent overfitting. From my experience, data augmentation helps a lot when you don’t have many images. One thing to note, you should be a little careful when using data augmentation. Think about what you are trying to classify, and how the augmentation is affecting the image. For example, if you are classifying text, you do not want to flip images horizontally, because then your “b” looks like your “d”. Similarly, for MNIST numbers, you don’t want to flip numbers vertically because a “6” looks like a “9”.
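
To make the point about label-preserving augmentation concrete, here is a toy sketch in plain Python where each flip has to be explicitly allowed for the task. Images are represented as lists of pixel rows purely for illustration, and the function names and flags are assumptions; real pipelines would use a library such as albumentations, which provides `HorizontalFlip` and `VerticalFlip` transforms.

```python
import random

def hflip(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def vflip(image):
    """Mirror a 2-D image top to bottom."""
    return image[::-1]

def augment(image, allow_hflip, allow_vflip, rng=random):
    """Apply each permitted flip with probability 0.5.

    Flips that are not allowed for the task are never applied.
    """
    if allow_hflip and rng.random() < 0.5:
        image = hflip(image)
    if allow_vflip and rng.random() < 0.5:
        image = vflip(image)
    return image

# Tiny 2x2 "image"; for text or digits both flags would stay False.
img = [[1, 2], [3, 4]]
print(augment(img, allow_hflip=True, allow_vflip=False))
```

Deciding the flags per dataset, rather than turning on every transform by default, keeps augmented images consistent with their labels.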