Bicycles vs wheelchairs

Hello folks,

I’m trying to build a classifier to distinguish bicycles from wheelchairs.

I started by googling images and built a dataset of around 3400 images of bicycles and 2700 of wheelchairs.

I’m trying to get the best possible performance given the lack of training data. What I’ve done so far is as follows:
1- I ran YOLO on the dataset and treated each detection as a new image, so that I could increase the size of the dataset.

2- I assigned 2500 images per class to the training set and left the rest for validation.
Q1: Validation accuracy isn’t a good measure now, since the validation set is not balanced, right? What is the substitute?

Q2: Is no test set required? How can I measure the generalization performance?

3- I performed data augmentation on the training set only, and left the validation set alone.
Q3: The videos say that this step is for preventing overfitting, but for me it serves both to increase the data size and to prevent overfitting.

Currently, the lowest performance comes from AlexNet trained directly on the images without data augmentation, with a classification accuracy of 90%.

Then I fine-tuned VGG16 for bike vs wheelchair, and the current performance is around 94%.
Q4: Is fine-tuning an ImageNet classifier enough for this task? I think wheelchairs aren’t one of the classes in ImageNet, so I thought that after training the last layer I should fine-tune the last conv layers with a very small learning rate.

I need to achieve performance above 98%. What do you suggest?

Thanks in advance

For metrics, have a look through and see which approaches look most useful to you.
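
On Q1 in particular, one common substitute for plain accuracy on an unbalanced validation set is per-class recall and its mean (often called balanced accuracy). Here is a minimal sketch in plain Python. The function names and the toy labels are made up for illustration and are not results from this thread:

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that appears in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy, made-up labels: 9 bicycles and 2 wheelchairs.
y_true = ["bicycle"] * 9 + ["wheelchair"] * 2
# A useless model that always answers "bicycle".
y_pred = ["bicycle"] * 11

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                      # 9/11, about 0.82
print(balanced_accuracy(y_true, y_pred))   # 0.5
```

The always-"bicycle" model looks decent on plain accuracy but scores 0.5 on balanced accuracy, which is why the latter is usually more informative when one class dominates the validation set.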

You don’t necessarily need a test set, but certainly a validation set is a good idea.
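
If you do want a held-out test set to estimate generalization (Q2), one option is to divide whatever is left after the 2500 training images per class between validation and test. A rough sketch, assuming the approximate counts from the first post; the file names, the `split_per_class` helper, and the 50/50 validation/test split are all hypothetical:

```python
import random


def split_per_class(paths_by_class, n_train, val_fraction=0.5, seed=0):
    """Split each class into train / validation / test lists.

    n_train items per class go to training; the remainder is divided
    between validation and test according to val_fraction.
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, paths in paths_by_class.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        rest = shuffled[n_train:]
        n_val = int(len(rest) * val_fraction)
        splits["train"] += [(p, label) for p in shuffled[:n_train]]
        splits["val"] += [(p, label) for p in rest[:n_val]]
        splits["test"] += [(p, label) for p in rest[n_val:]]
    return splits


# Hypothetical file names matching the rough counts in the first post.
data = {
    "bicycle": [f"bicycle_{i}.jpg" for i in range(3400)],
    "wheelchair": [f"wheelchair_{i}.jpg" for i in range(2700)],
}
splits = split_per_class(data, n_train=2500)
print({name: len(items) for name, items in splits.items()})
# {'train': 5000, 'val': 550, 'test': 550}
```

One caveat with the YOLO crops: this sketch shuffles individual files, so crops taken from the same source image can end up in different splits. Grouping by source image before splitting would probably give a more honest estimate.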

For very small datasets, DenseNet is good. This is discussed in lesson 13.

I’d certainly suggest fine-tuning a pretrained net.


Do you suggest that, after retraining the last FC layer on my dataset, I should also fine-tune the last conv layer and the layers after it with a small learning rate for a few epochs?

I’ll have a look at lesson 13. Thank you so much for your efforts.

Last question: regarding the numbers I mentioned (3.5K vs 2.7K), is this the range you’re talking about in lesson 13?

Part 1 of the course shows lots of examples of when and how to choose layers to finetune, so just follow the process from there.


This is a good way to classify bicycles vs wheelchairs, and I suggest applying all of these steps for this test.