Dynamic SSD implementation for fastai v1


(Rohit Singh) #1

Hi all,

@divyansh and I have implemented a dynamic Single Shot Detector for fastai v1, based on Part 2, Lesson 9 (pascal-multi).

The dev notebook at https://github.com/rohitgeo/singleshotdetector/blob/master/SingleShotDetector%20on%20Pascal.ipynb walks through this implementation.

A simple 4x4 grid with one anchor box per grid cell can be created using
simple_ssd = SingleShotDetector(data, grids=[4], zooms=[1.0], ratios=[[1.0, 1.0]])

A full SSD can be created using
ssd = SingleShotDetector(data, grids=[4, 2, 1], zooms=[0.7, 1., 1.3], ratios=[[1., 1.], [1., 0.5], [0.5, 1.]])

The constructor allows specifying any number of grid sizes, zoom levels and aspect ratios for the anchor boxes and creates the appropriate network architecture.
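
As a rough illustration (not taken from the notebook), assuming the anchors follow the Lesson 9 pascal-multi scheme where every grid cell gets one anchor per (zoom, ratio) pair, you can count the anchor boxes implied by the arguments like this:

grids, zooms, ratios = [4, 2, 1], [0.7, 1., 1.3], [[1., 1.], [1., 0.5], [0.5, 1.]]
k = len(zooms) * len(ratios)                   # anchors per grid cell -> 9
total_anchors = sum(g * g for g in grids) * k  # (16 + 4 + 1) * 9 = 189
print(k, total_anchors)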

Let us know if you’d like to see this in a PR.

Thanks,
Rohit


#2

Hello @rohitgeo, wonderful stuff you have started here. I’ve been looking for a v1 implementation of object detection.

I am running through your notebook 1:1 but get the following error after ssd.lr_find():

  File "/opt/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/anaconda3/lib/python3.7/site-packages/fastai/vision/data.py", line 50, in bb_pad_collate
    bboxes[i,-len(lbls):] = bbs
RuntimeError: The expanded size of the tensor (13) must match the existing size (0) at non-singleton dimension 0.  Target sizes: [13, 4].  Tensor sizes: [0, 4]

Have you encountered this problem?


(Rohit Singh) #3

Hi @hud, this could happen if there is an image that does not have any bboxes in it. A fix for this was recently added to fastai - see https://github.com/fastai/fastai/pull/1526/files

Otherwise, you could use our patched collate function instead:

import torch

def _bb_pad_collate(samples, pad_idx=0):
    "Function that collects `samples` of labelled bboxes and adds padding with `pad_idx`."
    # Number of boxes per image; images without any bboxes count as 0.
    lens = []
    for s in samples:
        try:
            lens.append(len(s[1].data[1]))
        except Exception:
            lens.append(0)
    max_len = max(lens)
    bboxes = torch.zeros(len(samples), max_len, 4)
    labels = torch.zeros(len(samples), max_len).long() + pad_idx
    imgs = []
    for i, s in enumerate(samples):
        imgs.append(s[0].data[None])
        bbs, lbls = s[1].data
        try:
            bboxes[i, -len(lbls):] = bbs
            labels[i, -len(lbls):] = lbls
        except Exception:
            # Images with no boxes keep their all-zero padding.
            pass
    return torch.cat(imgs, 0), (bboxes, labels)
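
As a quick, self-contained sanity check of the padding behaviour, you could feed it stand-in samples that mimic fastai's (image, bbox-target) pairs, including one image with no boxes at all. SimpleNamespace and fake_sample below are just illustrative scaffolding, not part of the notebook:

import torch
from types import SimpleNamespace

def fake_sample(n_boxes):
    # Mimics a fastai sample: image .data is a tensor, target .data is (bboxes, labels).
    img = SimpleNamespace(data=torch.zeros(3, 8, 8))
    bbs = torch.rand(n_boxes, 4) if n_boxes else torch.zeros(0, 4)
    lbls = torch.ones(n_boxes).long() if n_boxes else torch.zeros(0).long()
    return (img, SimpleNamespace(data=(bbs, lbls)))

imgs, (bboxes, labels) = _bb_pad_collate([fake_sample(2), fake_sample(0)])
print(imgs.shape, bboxes.shape, labels.shape)  # [2, 3, 8, 8], [2, 2, 4], [2, 2]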

#4

Hello @rohitgeo sorry I just got back and saw your message.

With your _bb_pad_collate code, the bounding boxes were not displayed.

That fastai fix should be included in the latest fastai library, right? I still get the same error after updating. But adding the fastai fix directly in the notebook works:

# requires the usual fastai v1 imports (e.g. from fastai.vision import *)
def bb_pad_collate(samples:BatchSamples, pad_idx:int=0) -> Tuple[FloatTensor, Tuple[LongTensor, LongTensor]]:
    "Function that collects `samples` of labelled bboxes and adds padding with `pad_idx`."
    if isinstance(samples[0][1], int): return data_collate(samples)
    max_len = max([len(s[1].data[1]) for s in samples])
    bboxes = torch.zeros(len(samples), max_len, 4)
    labels = torch.zeros(len(samples), max_len).long() + pad_idx
    imgs = []
    for i,s in enumerate(samples):
        imgs.append(s[0].data[None])
        bbs, lbls = s[1].data
        if not (bbs.nelement() == 0):  # only pad when the image actually has bboxes
            bboxes[i,-len(lbls):] = bbs
            labels[i,-len(lbls):] = tensor(lbls)
    return torch.cat(imgs,0), (bboxes,labels)

(Rohit Singh) #5

@hud thanks for confirming that it works in the notebook. The fastai code is undergoing many changes - it’s a moving target… and some recent change is causing the bounding boxes to not show up. Our code was written with fastai 1.0.39 and should work with that version.


(Vu Ha) #6

Nice implementation @rohitgeo! Are you planning to add common metrics such as mAP as well? It’d be super helpful to compute them and compare to results in the literature to gain confidence that the implementation is correct.


(Rohit Singh) #7

Yes, we are working on mAP.


(Vu Ha) #8

Why do you normalize the bbox during loss computation: bbox = (bbox + 1.)/2.?


(Rohit Singh) #9

This converts the bbox from the [-1, 1] range to the [0, 1] range that the rest of the code expects.
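
For illustration (made-up values), the remapping is just an affine shift of the corner coordinates:

import torch

bbox = torch.tensor([[-1., -0.5, 0.5, 1.]])  # hypothetical target in [-1, 1]
print((bbox + 1.) / 2.)                      # tensor([[0.0000, 0.2500, 0.7500, 1.0000]])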


(Vu Ha) #10

Is it then also necessary to normalize the output from actn_to_bb (a_ic) in calculating the L1 loss?

I put in assertions for normalized bbox being non-negative, which passed, and non-normalized a_ic being non-negative, which failed.

def _ssd_1_loss(self, b_c, b_bb, bbox, clas, print_it=False):
    bbox,clas = self._get_y(bbox,clas)
    bbox = self._normalize_bbox(bbox)

    a_ic = self._actn_to_bb(b_bb, self._anchors, self._grid_sizes)
    overlaps = self._jaccard(bbox.data, self._anchor_cnr.data)
    try:
        gt_overlap,gt_idx = self._map_to_ground_truth(overlaps,print_it)
    except Exception as e:
        return 0.,0.
    gt_clas = clas[gt_idx]
    pos = gt_overlap > 0.4
    pos_idx = torch.nonzero(pos)[:,0]
    gt_clas[1-pos] = 0 #data.c - 1 # CHANGE
    gt_bbox = bbox[gt_idx]
    loc_loss = ((a_ic[pos_idx] - gt_bbox[pos_idx]).abs()).mean()

(xnet) #11

Has anyone tried other backbones? I’ve tried Resnet101 and got the following:

RuntimeError: Given groups=1, weight of size [256, 512, 3, 3], expected input[32, 2048, 7, 7] to have 512 channels, but got 2048 channels instead

Any advice on how to troubleshoot these error messages? Thanks!


(Vu Ha) #12

When you switch backbones, the shapes of the tensors through the forward pass change. I’d recommend looking closely at the interface between ResNet101’s last layers and the first layers of the SSDHead.
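
For example (a rough sketch, not the actual SSDHead code): the error above says the head’s first conv has weight [256, 512, 3, 3], i.e. it expects 512 input channels, which matches ResNet34’s final feature map, while ResNet101 ends with 2048 channels. Matching the head’s first conv to the backbone’s output channels resolves that particular mismatch; the names below are illustrative only.

import torch
import torch.nn as nn

backbone_out_channels = 2048  # ResNet101's final feature map depth (ResNet34 has 512)
first_head_conv = nn.Conv2d(backbone_out_channels, 256, kernel_size=3, padding=1)

features = torch.randn(32, backbone_out_channels, 7, 7)  # shape from the error message
print(first_head_conv(features).shape)                    # torch.Size([32, 256, 7, 7])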


(Divyansh Jha) #13

No, it is not necessary, because a_ic is computed from the model’s output. As long as we normalize the bboxes the model is trained against, it is fine.

a_ic can be negative initially, while the model is untrained. Even after the model is trained, some elements of a_ic can be negative or greater than the size of the image. This simply means the bboxes can extend outside the image, which does no harm. If you plot them, you’ll see some bboxes get clipped at the edges.
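
For illustration (made-up values, not code from the notebook), out-of-range corners are simply clipped for display, e.g. with clamp, while the loss is computed on the raw values:

import torch

a_ic = torch.tensor([[-0.10, 0.20, 0.95, 1.30]])  # hypothetical predicted corners
print(a_ic.clamp(0., 1.))                          # tensor([[0.0000, 0.2000, 0.9500, 1.0000]])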