First of all, thank you very much for the wonderful course, @jeremy!
I have studied 12 lessons so far. The course really helped me understand deep learning to some degree, where other videos and articles had not.
I found some notebooks that were not covered in course-v3, for example the object detection notebook pascal.ipynb.
I already found the topic where @muellerzr says that object detection will be a separate course.
But I thought I could learn it by myself (maybe with a little help from the community). I really want to build an object detection project that detects cards from the game “Set”, and I have a few more ideas for deep learning projects in mind.
So I tried to understand pascal.ipynb (I have not got to the end yet) and to run it.
It fails at the first lr_find():
    in _unpad(self, bbox_tgt, clas_tgt)
         21         print("clas_tgt: ", clas_tgt)
         22         print("self.pad_idx: ", self.pad_idx)
    ---> 23         i = torch.min(torch.nonzero(clas_tgt-self.pad_idx))
         24         return tlbr2cthw(bbox_tgt[i:]), clas_tgt[i:]-1+self.pad_idx
         25

    RuntimeError: invalid argument 1: cannot perform reduction function min on tensor with no elements because the operation does not have an identity at /pytorch/aten/src/THC/generic/THCTensorMathReduce.cu:64
As you can see above, I added debug output for clas_tgt and pad_idx.
I found that it crashes when clas_tgt contains only zeros.
    clas_tgt: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
    self.pad_idx: 0
The reason for the failure is clear: torch.nonzero returns an empty tensor, and torch.min can’t handle it.
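One possible way to avoid the crash is to check for the empty case before taking the minimum. The sketch below shows that guard in plain Python, with lists standing in for tensors. `first_non_pad` is a hypothetical helper, not part of fastai, and what `_unpad` should return for an all-padding target is left open here.

```python
def first_non_pad(clas_tgt, pad_idx=0):
    """Return the index of the first non-padding label, or None if all are padding."""
    non_pad = [i for i, c in enumerate(clas_tgt) if c != pad_idx]
    if not non_pad:  # the all-padding case that makes the reduction fail
        return None
    return min(non_pad)

print(first_non_pad([0, 0, 0, 3, 7]))  # labels start at index 3
print(first_non_pad([0] * 11))         # only padding, as in the debug output
```

The caller then has to decide what an image with no targets should contribute to the loss, which is a separate question from avoiding the exception.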
I started wondering how clas_tgt can sometimes contain only zeros. I don’t fully understand the notebook yet, but I guessed that some images without bboxes end up here. So I checked the databunch and found that there are indeed some images with no bboxes.
Here is my dirty way to check that:
    data = get_data(1,128)
    i = 0
    for smth in data.train_dl.dl:
        #print(smth[1][0].shape)
        num_objs = smth[1][0].shape[1]
        i += 1
        print(i, num_objs)
        if num_objs == 0:
            print(smth[0].shape)
            print(smth[0].squeeze(0).shape)
            show_image(smth[0].squeeze(0))
        assert(num_objs > 0)
I spent some more time and found that all original images have bboxes, but some images can end up with no bboxes in the visible area after this line:
src = src.transform(get_transforms(), size=size, tfm_y=True)
This happens even if I remove the augmentation transforms: src = src.transform(size=size, tfm_y=True)
For example, one image is simply cropped to a square and loses its bbox.
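To illustrate how a crop alone can remove every box, here is a small sketch. It assumes a plain center crop to a square and boxes given as (top, left, bottom, right) in pixels. The helper name and the numbers are made up for illustration and do not reproduce what fastai does internally.

```python
def clip_box_to_center_crop(box, width, height):
    """Clip a (top, left, bottom, right) box to the centered square crop.

    Returns the clipped box in crop coordinates, or None if nothing is left.
    """
    side = min(width, height)
    off_x = (width - side) // 2   # columns cut from the left
    off_y = (height - side) // 2  # rows cut from the top
    top, left, bottom, right = box
    top, bottom = max(top - off_y, 0), min(bottom - off_y, side)
    left, right = max(left - off_x, 0), min(right - off_x, side)
    if bottom <= top or right <= left:
        return None  # the box lies entirely outside the crop
    return (top, left, bottom, right)

# A 500x300 (width x height) image: the crop keeps columns 100..400.
print(clip_box_to_center_crop((50, 10, 250, 90), 500, 300))    # box in the left margin
print(clip_box_to_center_crop((50, 150, 250, 350), 500, 300))  # box near the center
```

The first box sits entirely in the strip that the crop discards, so the image would be left with no targets even though the original annotation was valid.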
I decided to filter out the images that lose their bboxes after the transform, so I added this line after the transform:
src = src.filter_by_func(lambda x, y: len(y[0]) == 0)
And I unexpectedly found a bug in fastai v1. get_data failed on the next line (creating the databunch) with this error: AttributeError: 'ObjectCategoryList' object has no attribute 'pad_idx'
Even print(src) after filtering caused the same error.
After a few hours of debugging, I found that this line of code in data_block.py loses y.pad_idx.
Proof
Monkey patching LabelList.filter_by_func:
    def filter_by_func(self, func:Callable):
        filt = array([func(x,y) for x,y in zip(self.x.items, self.y.items)])
        self.x = self.x[~filt]
        print('before: ', 'pad_idx' in vars(self.y))
        self.y = self.y[~filt]
        print('after: ', 'pad_idx' in vars(self.y))
        return self

    LabelList.filter_by_func = filter_by_func
Results:
    before: True
    after: False
    before: True
    after: False
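One plausible mechanism for this, shown as a toy model rather than the actual fastai code: if indexing builds a fresh object through the constructor, any attribute attached to the old object afterwards is not carried over. `ToyList` and everything in it is hypothetical.

```python
class ToyList:
    """Toy container whose indexing returns a newly constructed instance."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, mask):
        # Boolean-mask indexing: only `items` is passed on to the new object.
        kept = [item for item, keep in zip(self.items, mask) if keep]
        return ToyList(kept)

y = ToyList(["a", "b", "c"])
y.pad_idx = 0  # attribute attached after construction
print("before:", "pad_idx" in vars(y))
y = y[[True, False, True]]
print("after:", "pad_idx" in vars(y))
```

If this is what happens in the library, a proper fix would need the indexing path to copy such attributes, instead of each caller restoring them by hand.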
I do not know how to fix this properly, so I wrote a dirty monkey patch as a temporary workaround:
Temporary fix
Monkey patch, only for object detection datasets.
    def filter_by_func(self, func:Callable):
        filt = array([func(x,y) for x,y in zip(self.x.items, self.y.items)])
        self.x = self.x[~filt]
        pad_idx = self.y.pad_idx  # save pad_idx
        self.y = self.y[~filt]
        self.y.pad_idx = pad_idx  # set pad_idx
        return self

    LabelList.filter_by_func = filter_by_func
This let me run get_data without crashes.
But the filtering didn’t help: it did not change the sizes of the training and validation datasets. It seems that bboxes that end up outside the image after the transforms are removed only while the databunch is being created.
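One reading of the filter_by_func code above is that the predicate receives the stored items (self.x.items, self.y.items), not the transformed ones. If transforms are applied lazily when an item is fetched (an assumption here, not verified against the fastai source), a filter over the stored labels would never see an empty box list. The toy sketch below shows the difference; `ToyLabels` and `drop_left_boxes` are hypothetical stand-ins.

```python
class ToyLabels:
    """Toy label list: boxes are stored raw and transformed only on access."""

    def __init__(self, items, tfm):
        self.items, self.tfm = items, tfm

    def __getitem__(self, i):
        return self.tfm(self.items[i])

def drop_left_boxes(boxes):
    # Stand-in for a crop: boxes whose right edge is <= 100 fall outside it.
    return [b for b in boxes if b[3] > 100]

# Boxes are (top, left, bottom, right); each image has one box.
y = ToyLabels([[(50, 10, 250, 90)], [(50, 150, 250, 350)]], drop_left_boxes)

empty_raw = [len(boxes) == 0 for boxes in y.items]          # what a filter on items sees
empty_tfm = [len(y[i]) == 0 for i in range(len(y.items))]   # what the model would get
print(empty_raw)
print(empty_tfm)
```

Under that assumption, the check would have to look at the transformed targets, and with random augmentation the result could differ from one epoch to the next.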
Right now I’m too tired of debugging, so I decided to share my experience, report the bug in fastai v1, and ask for some help.
Thank you!