How to take advantage of additional data?


I am trying to train an object detection model on a 3-class problem. For these 3 classes I have around 9000 samples each. The model performs decently, but there is still confusion between classes.

I have additional data for these classes, but it is NOT annotated: I only have the images. I could probably annotate which class each image belongs to, but annotating bounding boxes will not be possible.

Is there something I can do to take advantage of this data? Maybe some pretraining or something; I'm not really sure.

Any suggestions will be helpful. Thanks in advance.

Maybe you could try something similar to pseudo-labeling: train the model on the annotated images, predict on the additional data, use the predictions as labels, and retrain the model on the old data plus the newly “annotated” data.
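The filtering step of that loop could look roughly like the sketch below. It is only an illustration under assumptions: the `Detection` container, the `select_pseudo_labels` helper, the 0.9 threshold, and the file names are all hypothetical, and the detector's outputs are hard-coded so the snippet runs on its own. The optional `image_labels` argument is one possible way to use image-level class annotations, if they become available, to discard pseudo-boxes whose class disagrees with the image.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Detection:
    """One predicted box (hypothetical container, not tied to any library)."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    label: int
    score: float


def select_pseudo_labels(
    predictions: Dict[str, List[Detection]],
    threshold: float = 0.9,
    image_labels: Optional[Dict[str, int]] = None,
) -> Dict[str, List[Detection]]:
    """Keep confident detections; drop images with nothing left."""
    selected: Dict[str, List[Detection]] = {}
    for image_id, detections in predictions.items():
        # Keep only boxes the model is confident about.
        kept = [d for d in detections if d.score >= threshold]
        # If an image-level class is known, drop boxes that contradict it.
        if image_labels is not None and image_id in image_labels:
            kept = [d for d in kept if d.label == image_labels[image_id]]
        if kept:
            selected[image_id] = kept
    return selected


# Hard-coded stand-in for the detector's output on unlabeled images.
predictions = {
    "img_001.jpg": [
        Detection((10, 20, 110, 220), label=0, score=0.97),
        Detection((5, 5, 50, 50), label=2, score=0.41),
    ],
    "img_002.jpg": [Detection((30, 40, 90, 160), label=1, score=0.55)],
    "img_003.jpg": [Detection((0, 0, 64, 64), label=2, score=0.93)],
}

pseudo = select_pseudo_labels(predictions, threshold=0.9)
print(sorted(pseudo))  # ['img_001.jpg', 'img_003.jpg']
```

The selected images and boxes would then be merged with the original annotated set before retraining. The threshold trades off label noise against how much new data is added, so it is worth tuning on a held-out set.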


Yes, I’m trying that, but my concern is that because I’ll only use the model’s good predictions, I wonder whether the model will learn anything new!

You can do active learning to select which new data to label. https://en.wikipedia.org/wiki/Active_learning_(machine_learning)
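One common flavor of active learning is uncertainty sampling: send the images the model is least sure about to a human annotator, rather than the ones it already gets right. A minimal sketch follows, assuming a per-image class-probability vector is available from the current model. The `entropy` and `select_for_labeling` helpers, the labeling budget, and the hard-coded probabilities are hypothetical, for illustration only.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(class_probs: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the `budget` images with the highest predictive entropy."""
    ranked = sorted(
        class_probs,
        key=lambda image_id: entropy(class_probs[image_id]),
        reverse=True,  # most uncertain first
    )
    return ranked[:budget]


# Hard-coded stand-in for the model's 3-class probabilities on unlabeled images.
class_probs = {
    "img_101.jpg": [0.98, 0.01, 0.01],  # confident
    "img_102.jpg": [0.40, 0.35, 0.25],  # confused
    "img_103.jpg": [0.70, 0.25, 0.05],
    "img_104.jpg": [0.34, 0.33, 0.33],  # almost uniform
}

to_label = select_for_labeling(class_probs, budget=2)
print(to_label)  # ['img_104.jpg', 'img_102.jpg']
```

Images picked this way are the ones where the classes are being confused, so spending a limited annotation budget on them tends to be more informative than labeling images the model already handles well.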

To use the data directly, the following paper describes some techniques: https://arxiv.org/abs/1412.6596
