Object detection in fast.ai v1

OK, in the Pascal notebook we don't have any metric defined?

What do you mean by the Pascal notebook? I found this in his Cocotiny notebook:


voc = PascalVOCMetric(anchors, size, [i for i in data.train_ds.y.classes[1:]])
learn = Learner(data, model, loss_func=crit, callback_fns=[ShowGraph, BBMetrics],
                metrics=[voc])

I was talking about this fastai notebook, which is still in the dev phase.

In that notebook, are they using the metric above or some other metrics?

I haven’t run this notebook yet, but at the end I found mAP. I don’t know whether they show any other metrics during training.

Hi,

the model returns three things:

  1. [1, 24480, 3] -> [batch size, number of boxes, classes + background] -> the classification result for each anchor
  2. [1, 24480, 4] -> [batch size, number of boxes, box-coordinate refinements] -> how the anchors should be resized to match the objects in the image
  3. The feature map shapes. You can ignore these for a start; they are mostly for debugging purposes.
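For example, a minimal sketch of unpacking these outputs (assuming a model and a batch xb; the variable names are mine, not the repo's):

class_preds, bbox_preds, fmap_sizes = model(xb)  # xb: one batch of images
class_preds.shape  # e.g. [1, 24480, 3] -> per-anchor class scores, incl. background
bbox_preds.shape   # e.g. [1, 24480, 4] -> per-anchor box refinements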

For inference please take a look at the following colab notebook:
https://colab.research.google.com/drive/16un8u65D4oTWiWzpdjMU_cpnVasaQ72n

or this function in the repo:


Great!! Thank you so much.

Thanks, Christian.
I wanted to ask a few more things related to metrics and bbox coordinates:

  1. I am using the RSNA Kaggle competition dataset, where the bboxes are provided in the order x, y, width, height.
    Do I have to swap x/y and width/height here? That is why I'm doing the following (see also the sketch after this list):
    train_df['bbox'] = train_df.loc[:, ['y', 'x', 'height', 'width']].values.tolist()  # original order: x, y, width, height
    train_df['bbox'] = train_df['bbox'].apply(lambda x: [[x[0], x[1], x[2]-x[0]+1, x[3]-x[1]+1]])
  2. I have a doubt whether the bboxes are drawn correctly when I look at data.show_batch, so I wanted to check that I am passing the coordinates to the API correctly.
  3. In this problem we are doing binary classification (0 and 1), so will this metric work for it?
    What does this metric try to evaluate?
    PascalVOCMetric(anchors, size, [i for i in data.train_ds.y.classes[1:]])
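For reference, this is the conversion I think is needed, assuming fastai v1 wants boxes as [y1, x1, y2, x2] (top, left, bottom, right) and the columns are named x, y, width, height (a sketch, not verified):

def xywh_to_tlbr(x, y, w, h):
    # RSNA gives the top-left corner plus width/height;
    # fastai v1 expects [top, left, bottom, right]
    return [y, x, y + h, x + w]

train_df['bbox'] = train_df.apply(
    lambda r: [xywh_to_tlbr(r['x'], r['y'], r['width'], r['height'])], axis=1)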

Thanks in advance for your time

Moin,

sorry, I have problems understanding your questions.

1-2) Are the boxes correctly drawn in show_batch?

3) Yes, that would work to evaluate the mAP at train time on the validation dataset for the one class you have.

With kind regards,
Christian

By looking at the picture I had a doubt; for that reason I wanted to take one image for which I can draw the bbox using a rectangle patch and compare that to show_batch.
But the problem is that show_batch picks images at random every time.
Any alternative suggestion for cross-verifying?
I tried the approach below, but I get an error message that show has no param y.
Before calling ImageBBox.create, I read the image using pydicom (since the image format is .dcm and it stores a numpy array in a field) and converted the numpy array to a PIL Image object before passing it to the method.

img = open_image(path/'train'/train_images[1])
bbox = ImageBBox.create(*img.size, train_lbl_bbox[1][0], [0, 1], classes=['person', 'horse'])
img.show(figsize=(6,4), y=bbox)

Thanks for your notebook @Bronzi88! I tried to recreate your success with the SVHN dataset; however, I am stuck when it comes to the training part. As soon as I run learn.recorder.plot() or learn.fit_one_cycle(), it throws the following error, which I don’t know how to handle:

RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 3

Are my images the wrong size? I resized them to be 48x48. Are the anchors set wrong?


Hi (Moin),

the minimal supported image size is 256x256, at least for now.
But that would lead to a different error message :frowning:

Can you post the notebook or some code? Then I could try to replicate the error.

With kind regards,
Christian

Hi (Moin),

the following code should work.

img = open_image(coco / 'train_sample' / images[0])
image_boxes = img2bbox[images[0]][0]  # [[x1, y1, x2, y2], [x1, y1, x2, y2]]
bbox = ImageBBox.create(*img.size, bboxes=image_boxes, labels=[0, 0, 0], classes=['TV'])
img.show(figsize=(6, 4), y=bbox)
plt.show()

With kind regards,
Christian

I’m trying to adapt the RetinaNet notebook to a dataset that contains empty images with no objects. Does anyone know how I might go about adapting the current notebook to accommodate this?

Right now the notebook as-is works fine when I use the subset of the data that contains objects, so everything is good there. When I try to add images without labels, I get errors. Specifically, the create method of the ImageBBox class tries to index into an empty list.

Does anyone know a good format for representing empty images in bounding box format? I’ve considered adding some dummy coordinates like [0, 0, 0, 0] but I’m concerned that will have weird effects on training.

Hi (Moin),

if you use [0, 0, 0, 0] with the background class as the label, that should work just fine.

With kind regards,
Christian

Gave this a try. It causes some sort of numerical issue. When I create a dataloader, I get a ton of UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input

When I try to call data.show_batch(rows=3), the kernel dies. The issue repeats.

Maybe extending the bounding box for the background to the entire picture area helps to avoid this problem?

Hi,

in case you don’t have any labels

def get_y_func(o):
    if str(o.id) in img2bbox:  # image has labels
        return img2bbox[str(o.id)]
    else:  # no labels: dummy box plus the background class
        return [[[0,0,0,0]], ['background']]

works just fine for me.

This does work for dealing with empty images. You do end up with an extra class ('background' is listed twice in data.classes and data.c is 3), but monkey-patching these seems to work.

The dead-kernel thing is still an issue, which is weird. It seems like when the dataloader gets a batch of empty images, something goes wrong. I’m doing this on Windows 10, so this could be an OS-specific issue. My first thought was that maybe it’s a multiprocessing thing, so I set num_workers=1, but that didn’t solve it. I’m not really sure how to troubleshoot because things go from fine to dead kernel without any kind of error.

Might spin up an Ubuntu server and see if the issue persists there.

Moin,

to deactivate multiprocessing, you have to set num_workers to zero, at least if I remember correctly.
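For example (a sketch, assuming a typical fastai v1 data block pipeline; get_y_func is the labelling function from earlier in this thread):

# num_workers=0 disables multiprocessing entirely
data = (ObjectItemList.from_folder(path)
        .split_by_rand_pct()
        .label_from_func(get_y_func)
        .transform(get_transforms(), tfm_y=True, size=256)
        .databunch(bs=16, num_workers=0, collate_fn=bb_pad_collate))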

'background' being listed twice is not what I intended. But the fastai method generate_classes adds ['background'] anyway and does not check whether it is already in the list of classes. Sorry about that.

But you can easily fix that by overriding the generate_classes method of the ObjectCategoryProcessor class, for example as sketched below.
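A possible override (a sketch; it just de-duplicates the class list while keeping the order):

from fastai.vision import ObjectCategoryProcessor

class DedupObjectCategoryProcessor(ObjectCategoryProcessor):
    "Hypothetical processor that drops the duplicate 'background' entry."
    def generate_classes(self, items):
        classes = super().generate_classes(items)  # this is the call that prepends 'background'
        unique = []
        for c in classes:
            if c not in unique:
                unique.append(c)
        return unique

You would then pass an instance of it via the processor argument when labelling, though I have not verified the exact wiring.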

With kind regards,
Christian


Yes, thanks. It worked. The problem was that I had also imported PIL's Image, so it was not working with the fastai Image object.
Some more asks:

  1. My loss doesn't improve past a certain value, around 1.70.
    I have a doubt about the matches function and whether it assigns the ground-truth box correctly.
    In my case there is only a single object to be detected and a single class.
    The final output of the matches function (which assigns the GT bbox to anchors) yields anchors with dim greater than 1, so n x 4, while the dim of the target is 1 x 4 (it expands to n x 4) before going to the regression loss function.

I thought we should get only one anchor, the one that holds the GT box, by further using max.
Here is my understanding of matches, if I understand it correctly:

bbox_tgt, clas_tgt = self._unpad(bbox_tgt, clas_tgt)
matches = match_anchors(self.anchors, bbox_tgt)
bbox_mask = matches >= 0
if bbox_mask.sum() != 0:
    bbox_pred = bbox_pred[bbox_mask]
    bbox_tgt = bbox_tgt[matches[bbox_mask]]
    bb_loss = self.reg_loss(bbox_pred, bbox_to_activ(bbox_tgt, self.anchors[bbox_mask]))

a) Calculate the IoU between the object's GT bbox and all possible anchor boxes.
b) For each anchor, get the best-matching GT box (out of multiple, if any) using iou.max.
c) Apply a threshold to separate background from foreground, and assign the class id of the matched GT box to every anchor above it.
Now, shouldn't there be one more filter to pick the single anchor that has the best IoU match with the ground truth it is confident about? (See the sketch below.)
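For reference, here is a sketch of steps a) to c) as I understand them (my own reimplementation for illustration, not the notebook's code; the 0.5/0.4 thresholds are assumed):

import torch

def iou_values(anchors, targets):
    "Pairwise IoU for boxes given as [y1, x1, y2, x2]."
    tl = torch.max(anchors[:, None, :2], targets[None, :, :2])  # intersection top-left
    br = torch.min(anchors[:, None, 2:], targets[None, :, 2:])  # intersection bottom-right
    inter = (br - tl).clamp(min=0).prod(dim=2)
    area_a = (anchors[:, 2:] - anchors[:, :2]).prod(dim=1)
    area_t = (targets[:, 2:] - targets[:, :2]).prod(dim=1)
    return inter / (area_a[:, None] + area_t[None, :] - inter)

def match_anchors_sketch(anchors, targets, match_thr=0.5, bkg_thr=0.4):
    ious = iou_values(anchors, targets)          # a) IoU of every anchor vs every GT box
    vals, idxs = ious.max(dim=1)                 # b) best-matching GT box per anchor
    matches = torch.full((anchors.size(0),), -2, dtype=torch.long)  # -2 = ignore
    matches[vals < bkg_thr] = -1                 # c) low overlap -> background
    matches[vals > match_thr] = idxs[vals > match_thr]  # high overlap -> GT index
    return matches

The extra filter you mention does exist in some implementations: each GT box is additionally forced onto its single best-IoU anchor so that no object goes unmatched. With only one object and one class, every anchor above the threshold is still matched to that one box, which would explain the n x 4 shape.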

  2. What is the purpose of activ_to_bbox and its inverse, bbox_to_activ?
    Why is that needed?
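For context, these names usually refer to the standard SSD/RetinaNet-style box parameterization: the network does not predict coordinates directly but offsets relative to each anchor, so the targets must be encoded into that offset space for the regression loss (bbox_to_activ), and the predictions decoded back into boxes for inference (activ_to_bbox). A hypothetical sketch, with anchors and boxes in [center_y, center_x, height, width] form:

import torch

def activ_to_bbox_sketch(acts, anchors):
    "Decode predicted offsets: centers shift by a fraction of the anchor size, sizes scale exponentially."
    centers = anchors[..., :2] + acts[..., :2] * anchors[..., 2:]
    sizes = anchors[..., 2:] * torch.exp(acts[..., 2:])
    return torch.cat([centers, sizes], dim=-1)

def bbox_to_activ_sketch(bboxes, anchors):
    "Encode target boxes as offsets relative to anchors (the exact inverse)."
    t_centers = (bboxes[..., :2] - anchors[..., :2]) / anchors[..., 2:]
    t_sizes = torch.log(bboxes[..., 2:] / anchors[..., 2:])
    return torch.cat([t_centers, t_sizes], dim=-1)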