I tried to recreate the SSD object detection work on the Pascal VOC dataset, using @rohitgeo's and @joseadolfo's excellent notebooks as references. Thank you to both; they were immensely useful in understanding the concepts.
However, when I train my models, the loss plateaus at a specific value. I've tried multiple training approaches:
- Training only the last layers
- Unfreezing the last 2 layer groups, then unfreezing everything
- Starting directly with the last 2 layer groups unfrozen
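For reference, the freeze/unfreeze schedule I mean is roughly the following, sketched in plain PyTorch (the toy `nn.Sequential` and group boundaries are illustrative; the real model is a fastai learner, but the freezing mechanics are the same):

```python
import torch.nn as nn

# Toy stand-in for a pretrained backbone + custom SSD head
model = nn.Sequential(
    nn.Linear(8, 8),  # "early" backbone group
    nn.Linear(8, 8),  # "late" backbone group
    nn.Linear(8, 4),  # custom head
)

def freeze_to(m, n):
    """Freeze all layer groups before index n (fastai-style freeze_to)."""
    for i, layer in enumerate(m):
        for p in layer.parameters():
            p.requires_grad = i >= n

freeze_to(model, 2)   # stage 1: train only the head
stage1 = sum(p.requires_grad for p in model.parameters())

freeze_to(model, 1)   # stage 2: unfreeze the last 2 groups
stage2 = sum(p.requires_grad for p in model.parameters())

freeze_to(model, 0)   # stage 3: unfreeze everything
stage3 = sum(p.requires_grad for p in model.parameters())
```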
When I follow the approach in @joseadolfo's notebook, I end up overfitting, with the model predicting only "person". My understanding is that there are a lot more samples of "person" in the dataset. I also notice that in the sample notebook both training and validation loss fall steadily, whereas mine drops steeply early on and then plateaus. I trained for 120 epochs before realizing that the overfitting could have affected the weights significantly.
One difference in my approach was that I did not divide by 224 as in @rohitgeo's notebook, which had worked well for me when I attempted SingleObjectDetection using a 4x4 grid alone.
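If I understand the divide-by-224 step correctly, it scales the ground-truth box coordinates into [0, 1] so they live in the same range as the model's scaled activations. A minimal sketch, assuming 224x224 inputs and pixel-space boxes (the column order here is illustrative):

```python
import torch

# Ground-truth boxes in pixel coordinates on a 224x224 image
boxes_px = torch.tensor([[ 30.,  40., 180., 200.],
                         [  0., 112., 224., 224.]])

IMG_SIZE = 224.0
boxes_scaled = boxes_px / IMG_SIZE  # now in [0, 1], matching the activation range
```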
I've also tried discriminative learning rates, but that ends up with similar overfitting.
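By discriminative learning rates I mean smaller rates for the early pretrained layers and larger ones for the head, roughly what fastai spreads across layer groups with `slice(lr/100, lr)`. In plain PyTorch terms (the three-group split and the specific rates are illustrative):

```python
import torch
import torch.nn as nn

# Toy three-group model; real code would split a ResNet learner into layer groups
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 4))

# One optimizer param group per layer group, each with its own learning rate
opt = torch.optim.SGD([
    {"params": model[0].parameters(), "lr": 1e-5},
    {"params": model[1].parameters(), "lr": 1e-4},
    {"params": model[2].parameters(), "lr": 1e-3},
])
lrs = [g["lr"] for g in opt.param_groups]
```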
Here is the notebook I’ve been working on.
I'm out of ideas about what to try next to get different results. I'd really appreciate any feedback and suggestions.
One thing I do notice is that some of the predictions include the right categories, but they get masked when I run `class_scores.sigmoid() > threshold`. Lowering the threshold also ends up adding a lot of other noise.
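To show what I mean by the masking, here is a small made-up example (the logits are invented, not from my model): a dominant "person"-like class passes a strict threshold while a weaker but correct class only survives a looser one.

```python
import torch

# Made-up logits for one anchor box over 4 classes: a dominant class
# plus a weaker-but-correct second class
logits = torch.tensor([2.0, 0.3, -1.0, -2.0])
probs = logits.sigmoid()

strict = probs > 0.7  # strict threshold: only the dominant class survives
loose = probs > 0.4   # looser threshold: the weaker correct class passes too
```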