Hi friends - I'm working on a deep learning project - image classification - just to learn the concepts and get hands-on experience. I cannot get my validation accuracy to improve; it lingers around 50% only, while training accuracy is good, around 99%.
Tried the following, in combination with data augmentation:
The VGG model as such, batch normalized.
Transfer learning with VGG: chopping off the final layers, adding my own layers, and retraining them, with 2 different options.
The VGG setup from step 1, but with larger image dimensions instead of 224x224.
A custom model from scratch (each epoch takes a long time, but validation accuracy is still poor).
VGG with the last layer removed and a new layer added to map to my set of classes, similar to the Lesson 2 experiments.
I must admit that my dataset is not huge: about 1500+ images for training and 1000+ for validation, across around 18 classes.
I tried a confusion matrix etc., but could not deduce much, since I'm pretty new to this field.
I tried to implement pseudo-labelling, but got stuck with the MixIterator class and am getting the error 'TypeError: 'MixIterator' object is not an iterator'.
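From what I've read, under Python 3 that error usually means a class defines a Python 2-style next() method but not __next__(). A minimal sketch of the idea, using a hypothetical simplified MixIterator (my own reconstruction, not the course's exact class):

```python
class MixIterator:
    """Hypothetical, simplified stand-in for the course's MixIterator:
    draws one batch from each underlying iterator per step."""

    def __init__(self, iters):
        self.iters = iters

    def __iter__(self):  # lets for-loops and iter() accept the object
        return self

    def __next__(self):  # Python 3 iterator protocol (was next() in Python 2)
        return [next(it) for it in self.iters]

    next = __next__      # optional: keep the Python 2 spelling working too


# usage sketch: interleave two batch streams
batches = MixIterator([iter([1, 2]), iter(['a', 'b'])])
print(next(batches))  # -> [1, 'a']
```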
Any help in making this a better model is very much appreciated. Please let me know if you need more details.
Well, it depends how you did it. If you just copied the bottom 40% of the data into a new folder, chances are you didn't split it with an even distribution: there might be a few classes that only appear in the training set and others that only appear in the validation set. I don't like to worry about this, so I usually load all my x and y data, shuffle it, and then split it into train, validation and test. Be sure to shuffle your y data in the same way you shuffle x:
import numpy as np

perm = np.random.permutation(len(x))  # one shared shuffle order
x = x[perm]
y = y[perm]  # same permutation keeps images and labels aligned
Although both sets have 18 classes, the distribution can still be very different. You also want to make sure the subfolders in train and validation are named exactly the same, so Keras can make the right associations when creating the labels.
Either way, I'd recommend doing some data exploration as the very first step: look at the data, display a few samples, check the labels, look at the distribution of the dataset, make sure it's well balanced, and never assume the data is already shuffled.
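As a sketch of what I mean (shuffled_split and val_frac are made-up names for illustration, assuming NumPy):

```python
import numpy as np
from collections import Counter


def shuffled_split(x, y, val_frac=0.4, seed=0):
    """Shuffle x and y with the same permutation, then split off a validation set."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    x, y = x[perm], y[perm]          # identical order for data and labels
    n_val = int(len(x) * val_frac)
    return (x[n_val:], y[n_val:]), (x[:n_val], y[:n_val])


# toy example: 100 samples, 10 classes, 10 samples per class
(x_trn, y_trn), (x_val, y_val) = shuffled_split(
    np.arange(100), np.repeat(np.arange(10), 10))

# eyeball the class balance of each split before training anything
print("train:", sorted(Counter(y_trn.tolist()).items()))
print("val:  ", sorted(Counter(y_val.tolist()).items()))
```

If the printed per-class counts look very lopsided, a stratified split (per class) is the next step.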
I just took a quick look at your code without reviewing the data carefully. Some quick thoughts.
You’ve tried a lot of different approaches. You can get really good training accuracy with some models, but you’re overfitting (your validation accuracy never gets to 70%, if I’m not mistaken). That’s a good start.
Given this, I’d throw more data at the problem. Data from different sources would be especially helpful: you’re hoping that all your photos of 3 Series BMWs (or whatever) are not always from the same position. YouTube could be your friend here. Download some car videos and split them into single-frame images.
To get a sense of what’s reasonable look at object classification papers and see what sort of accuracy they achieve on similar tasks. In your case you have a small number of very visually similar classes and not much training data.
Related to my last point, I’m not sure the feature detectors in a pretrained VGG model are going to be perfect for this task. I haven’t looked at your classes but there may be very subtle differences between makes of cars that later convolutional layers may not have been tuned for.
@telarson - thanks for your detailed response. After the suggestions from @pietz, my validation accuracy has improved. I will look at options to add more data. I am trying data augmentation, but that is somewhat decreasing both training and validation accuracy. I'm trying different learning rates too.
On to your last point, I tried a network from scratch as well. I will train that network on new data and see how it improves. Since there are very minor differences between the models, I'm not sure if there is any recommended approach for this. I checked a paper and it mentioned transfer learning from a pretrained network; I already tried that and need to do the same with my new data.
If there are any known ways to train these kinds of models, where there are subtle differences between classes, please let me know. I thought this was very similar to the Kaggle fisheries competition.
Since VGG can detect coupes, sedans and SUVs, is there any way I can use those trained layers alone and fine-tune them to identify models? Sorry, I'm new to ML and am just throwing ideas out. Please ignore if it doesn't make sense.
That sounds more like it. I agree with @telarson that there's also quite a bit of overfitting involved, but the way your results were stuck at 50% very early on suggested something was wrong with the data.
VGG16 is awesome. It's actually my favorite transfer learning model, because everything that followed seemed extremely specialized for ImageNet. That being said, you need to understand it's a hundred-million-parameter type of model, so overfitting can happen fast, especially if your dataset is small.
Last but not least, you're asking the network to solve a rather complex problem (if I understand your code correctly). Don't expect any magic; everything above 90% should make you suspicious.
The fisheries competition has twice as much data and half as many classes as you do.
With respect to VGG you’re going to activate different feature detectors for fish vs. cars. I think this makes it hard to draw an apples to apples comparison but I’m just speculating.
Last I looked at your code, you were already fine-tuning VGG, but I believe your fine-tuning failed, since you were getting accuracy that's the same as random chance. Most of my errors like this are related to shuffling batches when I shouldn't. @pietz already mentioned shuffling as a potential problem.
I don’t know if it makes sense to pop off more than the top layer of VGG and then fine-tune the layers below. I suppose it would help prevent overfitting.
I still feel like you should try to get more data while you wait for your models to train. You could also look into other CNN methods that train well on less data. Some are mentioned in part 2 of the course.
@MLNewbie, have you read any of the papers on “fine-grained” recognition? There’s some research out of Stanford that I saw when reading the MobileNets paper from Google. It seems like it might be helpful, but I haven’t looked at any of these resources yet.