I'm working on a detection model based on the Pascal.ipynb notebook in the course material. My data consists of small objects in a mostly empty field, so the RetinaNet model seems like a good fit. Right now I'm starting with a simple case with only one class in my images, and I'll add more classes as the data becomes available.
I'm trying to find the learning rate, but the RetinaNetFocalLoss function keeps crashing. The problem is that one of the clas_tgt tensors passed into the function is all zeros, so torch.min raises an error:
" in _unpad(failed resolving arguments)
23 print("bbox_tgts: “, bbox_tgt)
24 print(“self.pad_idx”, self.pad_idx)
—> 25 i = torch.min(torch.nonzero(clas_tgt-self.pad_idx))
26 return tlbr2cthw(bbox_tgt[i:]), clas_tgt[i:]-1+self.pad_idx
27
RuntimeError: invalid argument 1: cannot perform reduction function min on tensor with no elements because the operation does not have an identity at /opt/conda/conda-bld/pytorch_1565272271120/work/aten/src/THC/generic/THCTensorMathReduce.cu:64”
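As far as I can tell, the crash happens because torch.min is being called on an empty tensor: when clas_tgt contains nothing but the pad value, `torch.nonzero(clas_tgt - self.pad_idx)` has no elements, so there is nothing to reduce. A minimal standalone repro:

```python
import torch

pad_idx = 0
clas_tgt = torch.tensor([0, 0, 0])       # an all-padding class target, like the failing batch below
nz = torch.nonzero(clas_tgt - pad_idx)   # empty tensor, shape (0, 1)
torch.min(nz)                            # RuntimeError: cannot perform reduction function min ...
```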
For debugging, I added some print statements for the bounding box targets and the class targets. They show 'good' data on the first pass through, and then a class target tensor that is all zeros:
"output:
[tensor([[[-4., -4.],
[-4., -4.],
[-4., -4.],
...,
[-4., -4.],
[-4., -4.],
[-4., -4.]],
[[-4., -4.],
[-4., -4.],
[-4., -4.],
...,
[-4., -4.],
[-4., -4.],
[-4., -4.]],
[[-4., -4.],
[-4., -4.],
[-4., -4.],
...,
[-4., -4.],
[-4., -4.],
[-4., -4.]],
[[-4., -4.],
[-4., -4.],
[-4., -4.],
...,
[-4., -4.],
[-4., -4.],
[-4., -4.]]], device='cuda:0', grad_fn=<CatBackward>),
tensor([[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
...,
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
...,
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
...,
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
...,
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]], device='cuda:0', grad_fn=<CatBackward>),
[[32, 32], [64, 64], [8, 8], [4, 4], [2, 2]]]
clas_tgts: tensor([1, 1, 1, 1], device='cuda:0')
bbox_tgts: tensor([[-0.7116, 0.2994, -0.4689, 0.5388],
[ 0.1855, 0.7000, 0.4048, 0.9218],
[ 0.5571, -0.7715, 0.7984, -0.4500],
[-0.8330, 0.2960, -0.6692, 0.4765]], device='cuda:0')
self.pad_idx 0
clas_tgts: tensor([0, 0, 1, 1], device='cuda:0')
bbox_tgts: tensor([[0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000],
[0.5469, 0.7584, 0.6863, 1.0000],
[0.4406, 0.7306, 0.5344, 0.8733]], device='cuda:0')
self.pad_idx 0
clas_tgts: tensor([1, 1, 1, 1], device='cuda:0')
bbox_tgts: tensor([[-0.0200, -0.6015, 0.6444, -0.4047],
[-0.1858, -0.6790, 0.0828, -0.5157],
[ 0.5301, 0.5670, 0.7313, 0.7774],
[ 0.1561, 0.3476, 0.2968, 0.4721]], device='cuda:0')
self.pad_idx 0
clas_tgts: tensor([0, 0, 1, 1], device='cuda:0')
bbox_tgts: tensor([[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.4396, 0.2111, 0.7538, 0.4614],
[-0.1625, -0.1494, 0.1980, 0.1113]], device='cuda:0')
self.pad_idx 0
output:
[tensor([[[-4.0000, -4.0000],
[-4.0000, -4.0000],
[-4.0000, -4.0000],
...,
[-4.0000, -4.0000],
[-4.0000, -4.0000],
[-4.0000, -4.0000]],
[[-3.9999, -4.0000],
[-3.9999, -4.0000],
[-3.9999, -4.0000],
...,
[-4.0000, -4.0000],
[-4.0000, -4.0000],
[-4.0000, -4.0000]],
[[-3.9999, -4.0000],
[-3.9999, -4.0000],
[-3.9999, -4.0000],
...,
[-4.0000, -4.0000],
[-4.0000, -4.0000],
[-4.0000, -4.0000]],
[[-3.9999, -4.0000],
[-3.9999, -4.0000],
[-3.9999, -4.0000],
...,
[-4.0000, -4.0000],
[-4.0000, -4.0000],
[-4.0000, -4.0000]]], device='cuda:0', grad_fn=<CatBackward>),
tensor([[[ 5.8823e-05, 3.8971e-05, 5.0638e-05, -6.1263e-05],
[ 4.3368e-05, 4.2452e-05, 5.9110e-05, -5.6631e-05],
[ 2.3543e-05, -1.8120e-06, 6.0415e-05, -5.9552e-05],
...,
[ 9.0789e-07, 6.5124e-06, -9.2335e-06, 8.7830e-06],
[ 5.7327e-07, -3.7689e-06, -6.4236e-06, 8.1218e-06],
[-3.6391e-06, 6.1791e-06, -7.5774e-06, 7.5879e-06]],
[[ 6.3462e-05, 3.8431e-05, 5.2794e-05, -6.6446e-05],
[ 4.4774e-05, 4.5052e-05, 6.3032e-05, -5.9442e-05],
[ 1.7675e-05, -1.4633e-06, 6.6049e-05, -6.6170e-05],
...,
[ 8.7837e-07, 6.7707e-06, -9.4765e-06, 9.1426e-06],
[ 3.7155e-07, -3.9938e-06, -6.7399e-06, 8.4144e-06],
[-4.1812e-06, 6.7351e-06, -7.7293e-06, 7.7419e-06]],
[[ 6.3581e-05, 3.9795e-05, 5.4961e-05, -6.7096e-05],
[ 4.5167e-05, 4.3474e-05, 6.3230e-05, -6.1343e-05],
[ 1.9188e-05, -3.1147e-06, 6.7213e-05, -6.6160e-05],
...,
[ 8.4529e-07, 8.3856e-06, -1.1132e-05, 1.0754e-05],
[-2.4030e-07, -4.5468e-06, -8.1089e-06, 1.0164e-05],
[-4.8756e-06, 7.5789e-06, -9.4542e-06, 9.0894e-06]],
[[ 6.5023e-05, 4.1934e-05, 5.4422e-05, -6.7103e-05],
[ 4.9524e-05, 4.6245e-05, 6.4738e-05, -6.2913e-05],
[ 2.1542e-05, -4.6008e-06, 6.6489e-05, -6.6718e-05],
...,
[ 6.3114e-07, 6.1879e-06, -8.6072e-06, 8.1754e-06],
[ 5.6170e-07, -3.4561e-06, -6.0266e-06, 7.6720e-06],
[-3.6853e-06, 5.9568e-06, -7.2529e-06, 6.8260e-06]]],
device='cuda:0', grad_fn=<CatBackward>),
[[32, 32], [64, 64], [8, 8], [4, 4], [2, 2]]]
clas_tgts: tensor([0, 1, 1], device='cuda:0')
bbox_tgts: tensor([[ 0.0000, 0.0000, 0.0000, 0.0000],
[-0.6563, 0.3660, 0.0242, 0.8674],
[ 0.0959, 0.4811, 0.3723, 0.8862]], device='cuda:0')
self.pad_idx 0
clas_tgts: tensor([0, 0, 0], device='cuda:0')
bbox_tgts: tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]], device='cuda:0')
self.pad_idx 0
```
I haven't verified that the rest of the loss handles empty targets gracefully, though, so this may just push the failure downstream.
I've spent the last few days going through my dataset to make sure it doesn't contain any images with no objects in them, and I've tried adjusting the RetinaNet model and loss function parameters a bit to see if anything changes. Specifically, I changed alpha to 0.1, since many of my images contain only a single small object, and gamma to 5.0, the largest value explored in the Lin/Goyal/Girshick/He/Dollár focal loss paper.
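For the dataset check, this is roughly what I've been running. It assumes annotations are kept like the Pascal notebook's img2bbox dict, mapping each filename to a ([bboxes], [classes]) pair:

```python
# Assumes the Pascal-notebook-style annotation dict:
# img2bbox = dict(zip(images, lbl_bbox)), mapping filename -> ([bboxes], [classes]).
empty = [fn for fn, (boxes, classes) in img2bbox.items() if len(boxes) == 0]
print(f'{len(empty)} images with no objects:', empty[:10])
```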
Does anyone have suggestions on where I might look next? I'm pretty sure the issue is with my input data, but I don't know what to check from here.