Fixing code for bounding boxes


(Erik Gaasedelen) #1

I have been trying to implement a system for training bounding boxes for my own projects, but was having trouble because the code breaks whenever a bounding box isn’t present or has been pushed off the image by a transformation.

I have fixed this and can now show batches, and the tests pass as well. But I’m getting lost in all the processors and ItemLists, and I haven’t had time to reason through why I hit some of the errors I did. I’d like some code review of what I’m doing to make sure the logic makes sense.

The first problem was with ImageBBox’s create method. When there is just one bounding box, bboxes arrive as an np.array with dtype object, holding a list as its sole element. This caused problems later during collation. The fix was to add an intermediate step that enforces the bounding boxes are a floating-point numpy array. It is still unclear to me why the tiny Pascal example worked while a separate dataset with the same formatting did not.
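A minimal sketch of that coercion step, assuming a hypothetical helper `coerce_bboxes` (not fastai's actual code) that normalizes whatever list/array shape comes in to an `(n, 4)` float numpy array:

```python
import numpy as np

def coerce_bboxes(bboxes):
    """Force bounding boxes into an (n, 4) float64 numpy array.

    With a single box, np.array(...) over nested Python lists can end
    up with dtype=object, which breaks collation later; converting each
    coordinate to float explicitly sidesteps that.
    """
    arr = np.array([[float(c) for c in box] for box in bboxes],
                   dtype=np.float64)
    return arr.reshape(-1, 4)
```

With this in place, `coerce_bboxes([[10, 20, 30, 40]])` yields a `(1, 4)` float array rather than an object array.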

Next there was a problem with ImageBBox’s data property. When there is no valid bounding box left, the labels become empty. In my case I forced them to take on a value of tensor([0]). This feels wrong, but it works fine for me because categories are labeled starting at 1, so 0 serves as an acceptable dummy value.
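The workaround can be sketched like this (a hypothetical `pad_empty_labels` helper, not the actual patch to the data property):

```python
import torch

def pad_empty_labels(labels):
    """If every box was pushed off the image by a transform, the label
    tensor ends up empty. Substitute a single dummy label 0; since real
    categories start at 1, the value 0 is free to act as 'background'.
    """
    if labels.numel() == 0:
        return torch.tensor([0])
    return labels
```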

Last was a problem with bb_pad_collate, which fails if there are missing bounding boxes. My solution was just to populate the bounding box with zeros when no data was provided.
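Here is a rough, simplified sketch of that collate behavior (a standalone `bb_pad_collate_sketch` over plain tensors, not fastai's actual `bb_pad_collate`), where samples with no boxes just keep their zero-filled padding row:

```python
import torch

def bb_pad_collate_sketch(samples, pad_idx=0):
    """Collate (image, (boxes, labels)) samples into a batch,
    zero-padding entries whose boxes are missing or empty."""
    max_len = max(len(s[1][1]) for s in samples) or 1  # at least one slot
    bboxes = torch.zeros(len(samples), max_len, 4)
    labels = torch.full((len(samples), max_len), pad_idx, dtype=torch.long)
    imgs = []
    for i, (img, (bbs, lbls)) in enumerate(samples):
        imgs.append(img[None])
        if len(lbls) > 0:  # empty samples keep the all-zeros padding
            bboxes[i, -len(lbls):] = bbs
            labels[i, -len(lbls):] = lbls
    return torch.cat(imgs, 0), (bboxes, labels)
```

A sample with zero boxes then collates to an all-zeros box row and the `pad_idx` label, instead of crashing.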

Ideally these dummy values shouldn’t be needed, but for now this is working for me. Does anyone have suggestions on how to improve this? I’ve spent some time exploring the different ItemLists and trying to follow the logic. Maybe someone can point me to a forum post that would make this clearer for me.


#2

Erik, do you have an example of the format of the labels for a bounding box problem?

As for your dummy-value problem: couldn’t you just not include the sample if the bbox size is 0?

# assumes imgs (list) plus bboxes and labels (pre-allocated padded tensors)
# from earlier in bb_pad_collate
remove_rows = []
for i, s in enumerate(samples):
  bbs, lbls = s[1].data
  if bbs.size(0) == 0:
    remove_rows.append(i)
  else:
    imgs.append(s[0].data[None])
    bboxes[i, -len(lbls):] = bbs
    labels[i, -len(lbls):] = lbls
# drop the rows of bboxes and labels that had no boxes
good_rows = torch.LongTensor([r for r in range(bboxes.size(0)) if r not in set(remove_rows)])
bboxes, labels = bboxes.index_select(0, good_rows), labels.index_select(0, good_rows)

return torch.cat(imgs, 0), (bboxes, labels)