Dynamic SSD implementation for fastai v1

Hi all,

@divyansh and I have implemented a dynamic Single Shot Detector (SSD) for fastai v1, based on Part 2, Lesson 9 (pascal-multi).

The dev notebook at https://github.com/rohitgeo/singleshotdetector/blob/master/SingleShotDetector%20on%20Pascal.ipynb walks through this implementation.

A simple 4x4 grid with one anchor box per grid cell can be created using:
simple_ssd = SingleShotDetector(data, grids=[4], zooms=[1.0], ratios=[[1.0, 1.0]])

A full SSD can be created using:
ssd = SingleShotDetector(data, grids=[4, 2, 1], zooms=[0.7, 1., 1.3], ratios=[[1., 1.], [1., 0.5], [0.5, 1.]])

The constructor accepts any number of grid sizes, zoom levels, and aspect ratios for the anchor boxes, and creates the appropriate network architecture.
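
As a rough sketch, here is how many anchor boxes the full configuration above implies, assuming (as in the Lesson 9 notebook) that every zoom/ratio pair is applied to every cell of every grid:

grids, zooms, ratios = [4, 2, 1], [0.7, 1., 1.3], [[1., 1.], [1., 0.5], [0.5, 1.]]
k = len(zooms) * len(ratios)               # anchor shapes per grid cell -> 9
n_anchors = sum(g * g * k for g in grids)  # (16 + 4 + 1) * 9 = 189
print(k, n_anchors)                        # 9 189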

Let us know if you’d like to see this in a PR.

Thanks,
Rohit


Hello @rohitgeo, wonderful stuff you've started here. I've been looking for a v1 implementation of object detection.

I am running through your notebook 1:1 but receive the following error after ssd.lr_find():

  File "/opt/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/anaconda3/lib/python3.7/site-packages/fastai/vision/data.py", line 50, in bb_pad_collate
    bboxes[i,-len(lbls):] = bbs
RuntimeError: The expanded size of the tensor (13) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [13, 4].  Tensor sizes: [0, 4]

Have you encountered this problem?

Hi @hud, this could happen if there is an image that does not have any bboxes in it. A fix for this was recently added to fastai - see https://github.com/fastai/fastai/pull/1526/files

Otherwise, you could use our patched collate function instead:

import torch

def _bb_pad_collate(samples, pad_idx=0):
    "Function that collects `samples` of labelled bboxes and adds padding with `pad_idx`."
    # Some images have no bboxes at all; count their labels as 0 instead of failing.
    arr = []
    for s in samples:
        try:
            arr.append(len(s[1].data[1]))
        except Exception:
            arr.append(0)
    max_len = max(arr)
    bboxes = torch.zeros(len(samples), max_len, 4)
    labels = torch.zeros(len(samples), max_len).long() + pad_idx
    imgs = []
    for i, s in enumerate(samples):
        imgs.append(s[0].data[None])
        bbs, lbls = s[1].data
        try:
            bboxes[i, -len(lbls):] = bbs
            labels[i, -len(lbls):] = lbls
        except Exception:
            # Leave all-zero padding for images without valid bboxes/labels.
            pass
    return torch.cat(imgs, 0), (bboxes, labels)
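
To use it, pass the patched function as the collate_fn when building your DataBunch. A sketch, assuming `src` is an ItemLists already labelled with bounding boxes as in the notebook:

data = (src.transform(get_transforms(), tfm_y=True, size=224)
           .databunch(bs=16, collate_fn=_bb_pad_collate)
           .normalize(imagenet_stats))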

Hello @rohitgeo, sorry, I just got back and saw your message.

With your _bb_pad_collate code, the bounding boxes were not displayed.

That fastai fix should be included in the latest fastai library, right? I still get the same error with the update. But adding the fastai fix directly in the notebook works:

# Requires fastai v1's vision imports, e.g. `from fastai.vision import *`.
def bb_pad_collate(samples:BatchSamples, pad_idx:int=0) -> Tuple[FloatTensor, Tuple[LongTensor, LongTensor]]:
    "Function that collect `samples` of labelled bboxes and adds padding with `pad_idx`."
    if isinstance(samples[0][1], int): return data_collate(samples)
    max_len = max([len(s[1].data[1]) for s in samples])
    bboxes = torch.zeros(len(samples), max_len, 4)
    labels = torch.zeros(len(samples), max_len).long() + pad_idx
    imgs = []
    for i,s in enumerate(samples):
        imgs.append(s[0].data[None])
        bbs, lbls = s[1].data
        if not (bbs.nelement() == 0):  # guard: skip images with no bboxes
            bboxes[i,-len(lbls):] = bbs
            labels[i,-len(lbls):] = tensor(lbls)
    return torch.cat(imgs,0), (bboxes,labels)

@hud thanks for confirming that it works in the notebook. The fastai code is undergoing many changes - it's a moving target… and some recent change is causing the bounding boxes to not show up. Our code was written against fastai 1.0.39 and should work with that version.

Nice implementation @rohitgeo! Are you planning to add common metrics such as mAP as well? It’d be super helpful to compute them and compare to results in the literature to gain confidence that the implementation is correct.

Yes, we are working on mAP.


Why do you normalize the bbox during loss computation: bbox = (bbox + 1.)/2.?

This converts the bbox from the [-1, 1] range to the [0, 1] range that the code expects.
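
For a concrete example of what that shift and scale does to fastai's [-1, 1] coordinates:

import torch

bbox = torch.tensor([[-1., -1., 0., 0.]])  # box covering the top-left quadrant
print((bbox + 1.) / 2.)                    # tensor([[0.0000, 0.0000, 0.5000, 0.5000]])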

Is it then also necessary to normalize the output from actn_to_bb (a_ic) in calculating the L1 loss?

I put in assertions for the normalized bbox being non-negative, which passed, and for the non-normalized a_ic being non-negative, which failed.

def _ssd_1_loss(self, b_c, b_bb, bbox, clas, print_it=False):
    bbox,clas = self._get_y(bbox,clas)
    bbox = self._normalize_bbox(bbox)

    a_ic = self._actn_to_bb(b_bb, self._anchors, self._grid_sizes)
    overlaps = self._jaccard(bbox.data, self._anchor_cnr.data)
    try:
        gt_overlap,gt_idx = self._map_to_ground_truth(overlaps,print_it)
    except Exception as e:
        return 0.,0.
    gt_clas = clas[gt_idx]
    pos = gt_overlap > 0.4
    pos_idx = torch.nonzero(pos)[:,0]
    gt_clas[1-pos] = 0 #data.c - 1 # CHANGE
    gt_bbox = bbox[gt_idx]
    loc_loss = ((a_ic[pos_idx] - gt_bbox[pos_idx]).abs()).mean()

Has anyone tried other backbones? I’ve tried Resnet101 and got the following:

RuntimeError: Given groups=1, weight of size [256, 512, 3, 3], expected input[32, 2048, 7, 7] to have 512 channels, but got 2048 channels instead

Any advice on how to troubleshoot these error messages? Thanks!

When you switch backbones, the shapes of the tensors across the forward pass change. I'd recommend looking closely at the interface between ResNet101's last layers and the SSDHead's first layers.
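
A quick way to see the mismatch, as a sketch using torchvision directly (the error above says the head's first conv expects 512 input channels, while ResNet101's backbone emits 2048):

import torch
from torchvision import models

# Cut off the average-pool and FC layers to get each backbone's feature extractor.
body34 = torch.nn.Sequential(*list(models.resnet34().children())[:-2])
body101 = torch.nn.Sequential(*list(models.resnet101().children())[:-2])

x = torch.randn(1, 3, 224, 224)
print(body34(x).shape)   # torch.Size([1, 512, 7, 7])  -> matches a head built for 512
print(body101(x).shape)  # torch.Size([1, 2048, 7, 7]) -> head's first conv needs 2048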

No, it is not necessary, because a_ic is computed from the model's output. As long as we normalize the bboxes the model is trained on, it is fine.

a_ic can be negative initially, when the model is not trained. Even after the model is trained, some elements of a_ic can be negative, or greater than the size of the image. This simply means the predicted bboxes can extend outside the image, which does no harm. If you plot them, you'll see some bboxes get clipped at the edges.
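
If you do want the plotted boxes to stay inside the image, clamping is a purely cosmetic step (assuming the [0, 1] range discussed above; it is not needed for training):

a_ic_clipped = a_ic.clamp(0., 1.)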


Thanks very much Rohit,

I successfully implemented your SSD with a different backbone and a custom dataset. However, I'm having problems exporting and then importing the Learner object. If I call ssd.learn.export() I get this error even though I'm using the latest pull that supposedly corrected it. Also, this learner doesn't have the add-test-from-folder function. How would you go about loading your trained model and running inference on a test folder? Many thanks,

JC


Thanks for the notebook.
I am able to implement SSD on a different training dataset, but I'm unable to make a prediction on unlabelled images.


Me too. I've trained my model for quite a while now on different data, which works great, but I'm having trouble using predict() on images that are not in the training, validation, or test sets. I got this error when attempting to predict:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-53-3dd4dc20a925> in <module>
----> 1 test_prediction = ssd_model.learn.predict(test_images[0])

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in predict(self, item, **kwargs)
    363             if norm.keywords.get('do_y',False): pred = self.data.denorm(pred)
    364         ds = self.data.single_ds
--> 365         pred = ds.y.analyze_pred(pred, **kwargs)
    366         out = ds.y.reconstruct(pred, ds.x.reconstruct(x[0])) if has_arg(ds.y.reconstruct, 'x') else ds.y.reconstruct(pred)
    367         return out, pred, res[0]

~/dev/.../ssdoil.py in analyze_pred(self, pred, thresh, nms_overlap, ssd)
     60     def analyze_pred(self, pred, thresh=0.5, nms_overlap=0.1, ssd=None):
     61         # def analyze_pred(pred, anchors, grid_sizes, thresh=0.5, nms_overlap=0.1, ssd=None):
---> 62         b_clas, b_bb = pred
     63         a_ic = ssd._actn_to_bb(b_bb, ssd._anchors.cpu(), ssd._grid_sizes.cpu())
     64         conf_scores, clas_ids = b_clas[:, 1:].max(1)

ValueError: not enough values to unpack (expected 2, got 1)

The problem seems to be in the SSDObjectCategoryList analyze_pred() method.


predict is currently calling pred_batch, which does not support object detection yet. You need to use the learner's model directly.

Hi @vha14,

This is what I'm currently using: ssd_model.learn.predict(test_images[0]). Can you clarify?

Because even if I bypass the analyze_pred() method, I get a similar error from the default fastai analyze_pred(), and when I debugged the returned preds, I found they only have one dimension, which is the class probability.

Were you able to predict with the fastai 0.7 Lesson 9 notebook? If yes, please tell me how.

You can bypass the Learner object and work directly with its model, for example in loss_batch:

    out = model(*xb)

The output out should then contain activations for both class probabilities and predicted bounding boxes; these can then be passed to analyze_pred as implemented in this notebook.
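
Putting that together, a minimal sketch (assuming `ssd_model` from earlier in this thread and `img` as a fastai Image; `one_item` builds a normalized single-image batch the same way Learner.predict does):

import torch

learn = ssd_model.learn
learn.model.eval()

xb, _ = learn.data.one_item(img)    # normalized single-image batch
with torch.no_grad():
    b_clas, b_bb = learn.model(xb)  # class activations and box activations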
