RetinaNet for Sorghum head detection

Hi,
I'm using the Pascal notebook for detecting sorghum heads.

So far, I'm able to run the notebook.

I'm facing a few challenges:

  1. Too many bounding boxes are detected. I was able to reduce the number by raising the detection threshold to 0.7.
  2. The bounding boxes are too big: small heads are detected with large boxes.
  3. Input images vary in size: width ranges from 250 to 400 px and height from 1000 to 1500 px. My current implementation just crops the images randomly and trains on the crops. I'm thinking of resizing all images to 128, 256, and 512 and training accordingly.
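For point 1, besides raising the score threshold, non-maximum suppression (NMS) is the standard way to drop duplicate detections of the same head. A minimal NumPy sketch (the threshold values here are just illustrative, not tuned for this dataset):

```python
import numpy as np

def nms(boxes, scores, score_thresh=0.7, iou_thresh=0.5):
    """Filter boxes by score, then greedily suppress overlapping boxes.

    boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) confidences.
    """
    keep_mask = scores >= score_thresh
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]  # highest-scoring boxes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the current top box against the remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Keep only boxes that overlap the kept box below the IoU threshold
        order = order[1:][iou < iou_thresh]
    return boxes[keep], scores[keep]

# Two near-duplicate detections of one head plus one separate head:
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.75])
kept_boxes, kept_scores = nms(boxes, scores)  # the duplicate is suppressed
```

Most frameworks ship a batched version of this (e.g. `torchvision.ops.nms`), so in practice you'd call that rather than roll your own.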

See the original vs. the validation output with a detection threshold of 0.7.

Any ideas/help on all three points is appreciated.

If anyone is willing to join the competition, ping me.

Hey there, I’ve actually just started on this competition as well, and I’m using the same notebook as a starting point.

I have a small exploratory data analysis up at: https://github.com/JoshVarty/SorghumHeadDetection/blob/master/00_EDA.ipynb

Some things I think will help:

  1. Smaller anchor sizes. (The average box is about 28x28 px.)
  2. Test-time augmentation: scan each image from top to bottom instead of cropping randomly.
  3. Maybe creative data augmentation. CutMix might work for object detection if you keep track of the bounding boxes.
  4. Ensembling
  5. There is also an additional unlabelled dataset that might be useful to incorporate somehow, though I’m not familiar with many unsupervised techniques.
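On point 1, a quick way to pick a base anchor size is to measure the label statistics directly. A sketch with made-up boxes in `(x_min, y_min, x_max, y_max)` format (in practice you'd load the competition's annotation files instead):

```python
import numpy as np

# Hypothetical annotations; real ones come from the competition's label files.
boxes = np.array([
    [10, 20, 35, 50],
    [60, 15, 92, 41],
    [ 5, 70, 30, 99],
])

widths = boxes[:, 2] - boxes[:, 0]
heights = boxes[:, 3] - boxes[:, 1]

# A RetinaNet-style base anchor size close to the mean head size:
# geometric mean of the average width and height.
base_size = int(round(np.sqrt(widths.mean() * heights.mean())))
print("mean w, mean h, base anchor:", widths.mean(), heights.mean(), base_size)
```

With the real annotations this should land near the 28 px average, which you can then feed into the anchor-generation parameters of the notebook.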

I’m interested in working on this competition with you if you’re still looking for a partner.

Hi @JoshVarty, great. Let's collaborate.

  1. Smaller anchor sizes. (Avg height and width is 28x28) ==> Not sure how to get exactly 28x28 anchors. I'm playing with scale & ratio to figure out the right anchor boxes.
  2. There is also an additional unlabelled dataset that might be useful to incorporate somehow, though I’m not familiar with many unsupervised techniques. ==> I think this is a case where we predict boxes on the unlabelled dataset and then use those predictions as extra training data (pseudo-labelling).
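The self-training idea in point 2 boils down to: run the trained detector over the unlabelled images, keep only confident boxes, and treat them as ground truth for another training round. A minimal sketch (the `predictions` structure and confidence threshold are assumptions, not the notebook's API):

```python
def pseudo_label(predictions, conf_thresh=0.9):
    """Keep only high-confidence predicted boxes as pseudo-ground-truth.

    predictions: dict mapping image_id -> list of (box, score) pairs.
    Returns a dict mapping image_id -> list of boxes, ready to merge
    into the labelled training set.
    """
    labelled = {}
    for image_id, preds in predictions.items():
        boxes = [box for box, score in preds if score >= conf_thresh]
        if boxes:  # skip images with no confident detections
            labelled[image_id] = boxes
    return labelled

# Toy predictions on three unlabelled images:
preds = {
    "img_001": [((10, 20, 40, 55), 0.95), ((100, 5, 130, 30), 0.42)],
    "img_002": [((3, 3, 28, 31), 0.91)],
    "img_003": [((0, 0, 12, 12), 0.30)],
}
pseudo = pseudo_label(preds)  # only confident boxes survive
```

A high threshold matters here: low-confidence pseudo-labels feed the model's own mistakes back into training.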

With the default notebook values I’m getting mAP ~= 0.02, which is not good. I see that the predicted boxes are too large; maybe tightening the anchor boxes will improve it.
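Oversized boxes hurt mAP directly, because a prediction only counts as a true positive if its IoU with the ground truth clears the evaluation threshold (commonly 0.5). A quick illustration with a hypothetical 28x28 head and a loose prediction that fully contains it:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

gt = (100, 100, 128, 128)        # a 28x28 head
pred_loose = (80, 80, 150, 150)  # oversized prediction covering the head
print(iou(gt, pred_loose))       # well below a 0.5 matching threshold
```

So even predictions centred on the right head can all be scored as misses if the boxes are a few times too large, which would explain an mAP near 0.02.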