Hey there!
I’m trying to use a DataBunch for Object Detection but I have a weird bug going on. I’ve spent quite a bit of time trying to debug it, but to no avail. Hopefully someone here could explain it to me.
I’m creating an `ObjectItemList` from the Pascal VOC 2007 dataset, using the following code:
```python
get_y_func = lambda o: img2bbox[o.name]
data = (ObjectItemList.from_folder(JPEGS_PATH/'JPEGImages')
        # Where are the images?
        .random_split_by_pct(seed=0)
        # How to split in train/valid? -> randomly with the default 20% in valid
        .label_from_func(get_y_func)
        # How to find the labels? -> use get_y_func
        .transform(get_transforms(), tfm_y=True, size=size)
        # Data augmentation? -> standard transforms with tfm_y=True
        .databunch(bs=bs, collate_fn=bb_pad_collate))
        # Finally we convert to a DataBunch and use bb_pad_collate
```
Where I think the bug is:
I’ve located the bug in the `bb_pad_collate` function, in `data.py`. More precisely, in the following code (part of the `bb_pad_collate` function):
```python
for i,s in enumerate(samples):
    imgs.append(s[0].data[None])
    bbs, lbls = s[1].data
    bboxes[i,-len(lbls):] = bbs
    labels[i,-len(lbls):] = tensor(lbls)
```
`s[1].data` will sometimes be an empty tensor, and thus the line `bboxes[i, -len(lbls):] = bbs` will throw the error:
The expanded size of the tensor (4) must match the existing size (0) at non-singleton dimension 0. Target sizes: [4, 4]. Tensor sizes: [0, 4]
which isn’t the expected behaviour, I think.
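If it helps explain the error message: I believe the root cause is a Python slicing quirk, which I can reproduce with plain lists (no fastai or PyTorch needed, just a stand-in for one row of `bboxes`):

```python
# In Python, -0 == 0, so a slice meant to select "the last len(lbls)
# elements" selects *everything* when len(lbls) == 0, instead of nothing.
row = [0.0, 0.0, 0.0, 0.0]   # stands in for one row of bboxes (room for 4 objects)
lbls = []                     # a sample with no labels

assert row[-len(lbls):] == row   # [-0:] is the same as [0:] -> the whole row
```

So `bboxes[i, -len(lbls):] = bbs` tries to fill all 4 slots of the row from a 0-row tensor, which matches the "Target sizes: [4, 4]. Tensor sizes: [0, 4]" in the message.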
Why `s[1].data` is sometimes empty, I don’t know.
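As a possible workaround (a hypothetical patch, not tested against the fastai codebase), guarding against the empty case in the collate loop would at least avoid the crash. A minimal, framework-free sketch of the idea, using nested lists instead of tensors:

```python
# Hypothetical sketch: skip the assignment when a sample has no boxes,
# so its padded row simply stays all zeros.
def pad_collate_rows(samples, max_len=4):
    """samples: list of per-image box lists; returns right-aligned, zero-padded rows."""
    rows = []
    for bbs in samples:
        row = [[0.0, 0.0, 0.0, 0.0] for _ in range(max_len)]
        if len(bbs) > 0:                      # the guard bb_pad_collate seems to be missing
            row[max_len - len(bbs):] = bbs    # right-align, like bboxes[i, -len(lbls):]
        rows.append(row)
    return rows
```

In the real `bb_pad_collate`, this would presumably just mean wrapping the two assignment lines in an `if len(lbls) > 0:` block.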
Where the bug originated from (in case my diagnosis above is completely wrong):
See the end of my notebook in this gist, which is based on Sylvain’s dev notebook on Object Detection. The error is thrown during the training in the last cell.
Thank you!