Fisheries competition: about setting bounding boxes of NO_FISH to [0,0,0,0]

Hi everyone.
I’m trying an approach similar to the one used in Lesson 7 for the fisheries competition (predict a bounding box and classify the image at the same time).
However, I see that on images where the net is quite uncertain, the predicted bounding boxes move towards the origin of the image. In my opinion, this is because the net has learned that the No_Object class sits at [0,0,0,0].

I did play around a bit with changing the bbox values for the “none” class, but I don’t want to bias you with my results: what do you think the best approach is?

Intuitively: if a Bbox is stored as x,y,w,h with an image of size r,c

  • Setting it at 0 0 0 0 would shift uncertain predictions towards the origin
  • Setting it at r/2, c/2, 0 0 would shift them towards the center
  • Setting it at 0 0 r c would increase the size of the uncertain bboxes
    etc.
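
To make this intuition concrete, here is a minimal sketch, assuming the bounding box output is trained with a mean-squared-error loss (an assumption about the setup, not something confirmed in this thread). Under MSE, the best output for an image the net is unsure about is the probability-weighted average of the possible targets, so whatever placeholder is chosen pulls the box towards itself. The function name `expected_mse_optimum`, the image size and the fish box are all made up for illustration.

```python
import numpy as np

def expected_mse_optimum(fish_box, placeholder, p_fish):
    """Box that minimizes expected squared error when the net believes
    a fish is present with probability p_fish (hypothetical helper)."""
    fish_box = np.asarray(fish_box, dtype=float)
    placeholder = np.asarray(placeholder, dtype=float)
    # The minimizer of an expected squared error is the mean of the targets.
    return p_fish * fish_box + (1.0 - p_fish) * placeholder

r, c = 360, 640                 # hypothetical image size
fish_box = [400, 200, 120, 60]  # hypothetical true box, stored as x, y, w, h

# The three placeholder choices from the list above, written as in the post.
placeholders = {
    "origin":     [0, 0, 0, 0],
    "center":     [r / 2, c / 2, 0, 0],
    "full image": [0, 0, r, c],
}

for name, placeholder in placeholders.items():
    print(name, expected_mse_optimum(fish_box, placeholder, p_fish=0.5))
```

With `p_fish=1.0` the placeholder has no effect. The lower the confidence, the closer the output sits to the placeholder, which matches the drift towards the origin described above.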

I thought about this a bit and I am not sure the approach taken makes much difference here.

I can see how the choice of placeholder can affect predictions on positive examples, but that effect is probably just an unfortunate ‘bias’ of this architecture. I’m not sure a good choice can be made here, nor that it would have a significant impact.

In general (and please take this with a grain of salt, since I am redoing part 1 of the course myself and am only on the 2nd lecture), I think that outputting bounding box coordinates as in lecture 7 is a fun example of using a NN that shows how flexible they can be, but I don’t think it is a genuinely good approach to getting good bb predictions, especially since IIRC we only use those coordinates to give the network more information to help with predictions across the categories. I have not worked through the 2nd part of the course yet, but I would guess that better architectures for image segmentation are discussed there, as I saw people here on the forums mention RNNs, etc.


First of all, thanks for the time spent thinking on this!
Yeah, I’m quite unsure how to proceed with this bounding box estimation problem. If anyone else wants to share their opinion, that would be great!