Inference Tutorial error

I am having some trouble getting inference to work on my multi-bounding-box image data, on which I have trained RetinaNet.

I am using the inference tutorial as my guide. I have a test set of unlabelled data. To allow me to use the same DataBunch structure, I give the unlabelled data some dummy bounding boxes.

  1. Load the data:
size = 256    
data = (ObjectItemList.from_df(path=PATH, folder='ground_truth_images', df=grouped_inf)
            .split_by_rand_pct()                          
            .label_from_df()
            .transform(get_transforms(max_rotate=5, max_zoom=1.05), size=size, resize_method=ResizeMethod.SQUISH)
            .databunch(bs=64, collate_fn=bb_pad_collate)
            .normalize(imagenet_stats)
            )
  2. I have previously trained my network, saved and exported it, so I load it:
learn = load_learner('/rds/user/trpb2/hpc-work/data/ground_truth')
  3. and then try to predict for an image:
img = data.train_ds[0][0]
learn.predict(img)

and I get the error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-29-ef47b9ae205b> in <module>
      1 img = data.train_ds[0][0]
----> 2 learn.predict(img)
      3 #preds = learn.pred_batch(DatasetType.Train)

~/.conda/envs/fastai_v1_2/lib/python3.7/site-packages/fastai/basic_train.py in predict(self, item, return_x, batch_first, with_dropout, **kwargs)
    365         batch = self.data.one_item(item)
    366         res = self.pred_batch(batch=batch, with_dropout=with_dropout)
--> 367         raw_pred,x = grab_idx(res,0,batch_first=batch_first),batch[0]
    368         norm = getattr(self.data,'norm',False)
    369         if norm:

~/.conda/envs/fastai_v1_2/lib/python3.7/site-packages/fastai/torch_core.py in grab_idx(x, i, batch_first)
    325 def grab_idx(x,i,batch_first:bool=True):
    326     "Grab the `i`-th batch in `x`, `batch_first` stating the batch dimension."
--> 327     if batch_first: return ([o[i].cpu() for o in x]   if is_listy(x) else x[i].cpu())
    328     else:           return ([o[:,i].cpu() for o in x] if is_listy(x) else x[:,i].cpu())
    329 

~/.conda/envs/fastai_v1_2/lib/python3.7/site-packages/fastai/torch_core.py in <listcomp>(.0)
    325 def grab_idx(x,i,batch_first:bool=True):
    326     "Grab the `i`-th batch in `x`, `batch_first` stating the batch dimension."
--> 327     if batch_first: return ([o[i].cpu() for o in x]   if is_listy(x) else x[i].cpu())
    328     else:           return ([o[:,i].cpu() for o in x] if is_listy(x) else x[:,i].cpu())
    329 

AttributeError: 'list' object has no attribute 'cpu'

I’d be really grateful for any help!

I have been trying to do the same thing and am running into the same issue. I can’t seem to find anything that describes how to get predictions on a single image!

Hi, are you also using RetinaNet for object detection? I have been using code based on the work of @Bronzi88.

I think the problem is caused by a plain Python list of anchor-grid sizes being returned alongside the tensors of results; the .cpu() call then fails on that list. The output looks like this, and you can see the list at the end:

[tensor([[[ -7.5396,  -4.3380,  -6.7723,  -7.0988],
          [ -7.0996,  -4.3502,  -6.5561,  -7.0988],
          [ -7.1233,  -4.3622,  -6.0103,  -7.0988],
          ...,
          [ -6.3839,  -4.8364,  -6.4581, -11.5488],
          [ -6.2477,  -5.0400,  -7.6089, -11.5488],
          [ -6.1851,  -5.1004,  -7.1916, -11.5488]],
 
         [[ -8.9492,  -4.4553,  -8.0759,  -8.4949],
          [ -9.0397,  -4.5483,  -8.3623,  -8.4949],
          [ -9.4299,  -4.5873,  -7.4108,  -8.4949],
          ...,
          [ -6.5616,  -5.0501,  -6.6564, -11.7468],
          [ -6.3625,  -5.3129,  -7.8476, -11.7468],
          [ -6.3518,  -5.4153,  -7.3417, -11.7468]],
 
         [[ -9.8071,  -4.4826,  -8.1463,  -9.0904],
          [ -8.0049,  -4.4909,  -7.8769,  -9.0904],
          [ -7.7718,  -4.5404,  -7.4979,  -9.0904],
          ...,
          [ -5.4136,  -4.3080,  -6.4343, -22.6544],
          [ -6.1541,  -5.2840,  -7.8605, -22.6544],
          [ -6.8810,  -6.3347,  -8.3685, -22.6543]],
 
         ...,
 
         [[ -9.9924,  -4.5532,  -8.4740,  -9.4285],
          [ -8.0902,  -4.4867,  -8.1060,  -9.4285],
          [ -7.9398,  -4.5128,  -7.7147,  -9.4285],
          ...,
          [ -6.9336,  -5.1295,  -7.1565, -13.8359],
          [ -6.6565,  -5.4083,  -8.6254, -13.8359],
          [ -6.6014,  -5.4912,  -8.0543, -13.8358]],
 
         [[-10.2256,  -4.4160,  -8.8093,  -9.9710],
          [ -9.1475,  -4.5032,  -9.0893,  -9.9710],
          [ -9.3177,  -4.5537,  -8.3028,  -9.9710],
          ...,
          [ -6.6915,  -5.0426,  -6.7209, -12.1579],
          [ -6.5324,  -5.2063,  -8.0080, -12.1579],
          [ -6.3788,  -5.2105,  -7.4859, -12.1579]],
 
         [[-10.5497,  -4.5839,  -8.8586,  -9.9287],
          [ -8.3847,  -4.5479,  -8.6110,  -9.9287],
          [ -8.2185,  -4.6025,  -8.1405,  -9.9287],
          ...,
          [ -7.1050,  -5.0200,  -7.3463, -14.1946],
          [ -6.8236,  -5.3346,  -9.0371, -14.1946],
          [ -6.7305,  -5.3823,  -8.3779, -14.1946]]], grad_fn=<CatBackward>),
 tensor([[[-0.2722,  0.1307,  0.2233, -0.4304],
          [-0.3768,  0.0959,  0.0812, -0.5624],
          [-0.5212,  0.0669, -0.0477, -0.6232],
          ...,
          [-0.1585,  0.1601, -0.7502, -0.3160],
          [-0.2419,  0.1282, -0.6435,  0.0774],
          [-0.3947, -0.2535, -0.6461,  0.0532]],
 
         [[-0.1951, -0.0568, -0.1130, -0.3774],
          [-0.1101, -0.0579, -0.0886, -0.3734],
          [-0.2134,  0.0166, -0.1101, -0.3274],
          ...,
          [-0.2245,  0.4208, -0.6255, -0.3865],
          [-0.2679,  0.1819, -0.6122, -0.0309],
          [-0.3431, -0.2476, -0.7174, -0.0358]],
 
         [[-0.2055,  0.0193,  0.1294, -0.6977],
          [-0.2467, -0.0057,  0.0315, -0.7113],
          [-0.3814, -0.0375, -0.0420, -0.6652],
          ...,
          [ 0.1730,  0.6576, -2.3269, -1.0847],
          [ 0.1596,  0.1691, -2.0170, -0.2714],
          [-0.0173, -0.4099, -1.7699, -0.2557]],
 
         ...,
 
         [[-0.4692,  0.1189,  0.0629, -0.5746],
          [-0.4754,  0.1137,  0.0417, -0.6018],
          [-0.5233,  0.0807,  0.0040, -0.5531],
          ...,
          [-0.1619,  0.4475, -0.8763, -0.4809],
          [-0.2378,  0.3084, -0.7060,  0.0618],
          [-0.3560, -0.2792, -0.7296,  0.0984]],
 
         [[ 0.0150,  0.0798,  0.1737, -0.5955],
          [-0.0415, -0.0106,  0.0674, -0.6122],
          [-0.1825, -0.0797,  0.0054, -0.5983],
          ...,
          [-0.2121,  0.2160, -0.7853, -0.4600],
          [-0.3484,  0.2896, -0.6143, -0.0401],
          [-0.5152, -0.0658, -0.5366, -0.0447]],
 
         [[-0.4829,  0.1223,  0.0229, -0.6683],
          [-0.4379,  0.0735, -0.0154, -0.6769],
          [-0.4789,  0.0103, -0.0690, -0.6212],
          ...,
          [-0.3785,  0.3118, -0.9263, -0.3789],
          [-0.4258,  0.1677, -0.7093, -0.0613],
          [-0.5188, -0.2564, -0.6806, -0.0508]]], grad_fn=<CatBackward>),
 [[32, 32], [16, 16], [8, 8], [4, 4]]]
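You can reproduce the mechanism in isolation, without torch or fastai. Here `FakeTensor` is just a stand-in I made up for a real tensor, and `grab_idx` mirrors the shape of fastai's version with `batch_first=True`:

```python
# Minimal reproduction of the grab_idx failure, no torch/fastai needed.
class FakeTensor:
    """Stand-in for torch.Tensor: indexable and has .cpu()."""
    def __init__(self, data): self.data = data
    def __getitem__(self, i): return FakeTensor(self.data[i])
    def cpu(self): return self

# RetinaNet-style output: class scores, box activations, then a plain
# Python list of grid sizes -- the element that has no .cpu() method.
res = [FakeTensor([[1, 2]]), FakeTensor([[3, 4]]), [[32, 32], [16, 16]]]

def grab_idx(x, i):
    # Same pattern as fastai's grab_idx with batch_first=True:
    return [o[i].cpu() for o in x]

try:
    grab_idx(res, 0)
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'cpu'
```

So `learn.predict` works fine for models whose output is a tensor (or a list of tensors), but trips over the extra sizes list this model appends.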

Yes, I am using RetinaNet. I am going off the pascal notebook with my own dataset. The original notebook can be found here: https://github.com/fastai/course-v3/blob/master/nbs/dl2/pascal.ipynb

I have been able to view predictions on images trained this way by using the `show_preds` function (together with the helpers it builds on) from the pascal notebook. So instead of `learn.predict(img)`, you call `show_preds(img, output, idx)`.

This works and I get pretty decent results. The only issue is that, for some reason, I only get predictions for 16 images. I am still trying to figure out how to get predictions on new images, but can't seem to find anything.
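The closest I have got is redoing the batch-grabbing step of `learn.predict` by hand while skipping the entry that has no `.cpu()` method. This is a hypothetical helper I sketched myself, not a fastai function, and `FakeTensor` below is only a mock so it can be tried without torch:

```python
# Hypothetical helper (not part of fastai): mimic the grab_idx step of
# learn.predict, but drop output entries that are plain Python lists.
def grab_one(outputs, i=0):
    """Take item i from each tensor-like element of a RetinaNet-style
    output [clas_preds, bbox_preds, sizes], skipping anything without
    a .cpu() method (e.g. the trailing grid-sizes list)."""
    return [o[i].cpu() for o in outputs if hasattr(o, "cpu")]

# Tiny stand-in for torch.Tensor so the helper can be exercised here:
class FakeTensor:
    def __init__(self, data): self.data = data
    def __getitem__(self, i): return FakeTensor(self.data[i])
    def cpu(self): return self

clas  = FakeTensor([[0.1, 0.9]])              # per-image class scores
bbox  = FakeTensor([[0.0, 0.0, 1.0, 1.0]])    # per-image box activations
sizes = [[32, 32], [16, 16], [8, 8], [4, 4]]  # the list that breaks .cpu()

preds = grab_one([clas, bbox, sizes], i=0)
# preds keeps only the two tensor entries; the sizes list is dropped.
```

With the real model you would apply it to the output of `learn.model` on a one-item batch, then decode the surviving class and box tensors with the notebook's own helpers.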

I actually have the same problem and I don't know what's going on there.

Has anyone figured out how to reuse it yet?

In case someone gets here: there is a Kaggle notebook that bypasses this issue - https://www.kaggle.com/ianmoone0617/fastai-v1-global-wheat-detection-tutorial

I've used that approach to predict on evaluation images.
