[SOLVED] Bounding Boxes Not Properly Transformed by aug_transforms

I am trying to apply the fastai2 framework to the Kaggle Global Wheat Detection Challenge (an object detection problem).

I build the DataBlock and DataLoaders with the following:

def build_dblock(data_path, resize_sz, norm, rand_seed = 144, test_mode = False):
    json_path = data_path / 'train_mini.json' if test_mode else data_path / 'train.json'
    _, _, img2bbox = decode_coco_json(json_path)
    
    blks = (ImageBlock, BBoxBlock, BBoxLblBlock)
    
    get_ids_func = get_img_ids(json_path)  # get_items function: lists image ids from the json
    getters_func = [lambda o: data_path / 'train' / o,  # image file path
                    lambda o: img2bbox[o][0],           # bboxes for this image
                    lambda o: img2bbox[o][1]]           # bbox labels
    
    rand_splitter = RandomSplitter(valid_pct = 0.2, seed = rand_seed)
    batch_tfms = aug_transforms(size = resize_sz, min_scale = 0.85, do_flip = True)
    if norm: 
        batch_tfms += [Normalize.from_stats(*imagenet_stats)]
    
    dblock = DataBlock(
        blocks = blks, splitter = rand_splitter,
        get_items = get_ids_func, getters = getters_func,
        batch_tfms = batch_tfms, n_inp = 1
        )
    return dblock

def build_dataloaders(
    data_path, bs, resize_sz = 256, 
    norm = False, rand_seed = 144, test_mode = False
):
    """
    :param:
        data_path : str/ Path, path to wheat datasets
        resize_sz : int, length after resized (assume square)
        rand_seed : int, andom seed id
    """
    if isinstance(data_path, str):
        data_path = Path(data_path)
        
    dblk = build_dblock(data_path, resize_sz, norm = norm, 
                        rand_seed = rand_seed, test_mode = test_mode)
    dls = dblk.dataloaders(data_path / 'train', bs = bs)
    dls.c = 2
    return dls

But when I tried show_batch on the DataLoaders built by the above functions, the bounding boxes were not properly aligned with the objects. I think it has something to do with the batch resizing in aug_transforms (everything is fine if I use Resize in item_tfms instead), but so far I can’t pinpoint which lines of code are causing the problem. Does anyone have an idea how to fix this?
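For reference, this is roughly how I call it (the Kaggle path here is just an example):

dls = build_dataloaders('/kaggle/input/global-wheat-detection', bs = 16, resize_sz = 256)
dls.show_batch(max_n = 4, figsize = (10, 10))  # the drawn bboxes don't line up with the objects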

@riven314 I’m working on this too. I’m down to trying to get Faster RCNN to work. It’d be fun to team up if you are interested.

Hi @charliec, glad to hear you are also playing the Wheat Challenge!

I tried torchvision’s FasterRCNN last week, but I found it isn’t quite compatible with the fastai2 Learner framework (i.e. the model’s forward behaves differently in train mode vs. eval mode, returning a dict of losses in train mode but predicted bboxes in eval mode, and the two modes expect different inputs). I made a few tweaks to the callback and the Learner to make it work. torchvision’s FasterRCNN is also not friendly for tweaking hyperparameters. On top of that, I couldn’t get a decent baseline performance with it in fastai2 (it really had me scratching my head: I got 0.59 mAP on the LB, while some kernels got 0.66 mAP on the LB simply using plain PyTorch).
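For anyone curious, here is a minimal sketch of that train/eval discrepancy, using only torchvision’s API (the image and target below are dummies):

import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained = False)

imgs = [torch.rand(3, 256, 256)]
targets = [{'boxes': torch.tensor([[10., 10., 50., 50.]]),
            'labels': torch.tensor([1])}]

model.train()
losses = model(imgs, targets)  # dict of losses: 'loss_classifier', 'loss_box_reg', ...

model.eval()
preds = model(imgs)  # list of dicts with 'boxes', 'labels', 'scores'; no targets needed

Since fastai2’s default Learner assumes a single loss function applied to the model output, this two-mode forward is what needed the callback/Learner tweaks.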

Therefore, I switched to fastai’s RetinaNet and am building a training loop for it.

I was working through that exact kernel you linked to.

Well, you probably saved me hours of messing around with FasterRCNN. Just curious if you still have that kernel, just for the heck of it. I keep running into an error on my learn.summary:
AttributeError: 'ImageList' object has no attribute 'shape'

It does seem like most of the high scores are using FasterRCNN or YOLO, so it’s a quandary.

Previously, I got the same error. I ended up ignoring it and kept training the models (it’s okay; learn.summary doesn’t work sometimes). The predictions from the trained model make sense, at least on the validation set.

I am not working in a kernel; it’s not convenient. But you can message me about the issues you encounter (to keep them separate from this thread), and I can help if I’ve run into them before.

I’ll be building a RetinaNet shortly too :wink: Good luck! The main fun is inference. I’m going to step away from fastai for that and just use raw PyTorch, to avoid headaches. Also, I made a starter kernel myself; I can link it here if people are interested. It’s fairly long though :slight_smile:

1 Like

Same here. I simply export the model weights and use an inference kernel from others. That seems easier for me.

Right now, I am struggling to interface the RetinaNet output with the mAP metric function: the bboxes RetinaNet outputs need to be properly transformed before they can be fed into it (a rough sketch of what I mean is below).
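Concretely, fastai2 keeps TensorBBox coordinates scaled to [-1, 1] at the batch level, so I need something like this hypothetical helper to get back to pixel coordinates before computing mAP:

def bbox_scaled2pixel(bboxes, img_sz):
    "Hypothetical helper: map [-1, 1] (x_min, y_min, x_max, y_max) coords to pixels"
    return (bboxes + 1) * img_sz / 2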

And I actually built my script mainly based on your tutorial. Thanks a lot, it’s great! @muellerzr

1 Like

I’d check out the work done here: Object detection using fastai v2. He got inference working, so it should help some :slight_smile: (I need to spend time looking at it; just a busy schedule currently)

Also @riven314, check out the tail end of this notebook: Jeremy updated the object detection notebook to fastai v1 and it shows inference! :slight_smile: https://github.com/fastai/course-v3/blob/master/nbs/dl2/pascal.ipynb (This is also what we’re using)

1 Like

@muellerzr
thanks for the pointers! They both fill in my missing pieces! I will take a look at them.

Also, back to your original issue here @riven314: make sure your Resize parameters don’t use cropping. I’ve found this can lead to issues like this with points. See my post here: Useful Tip: Transforms supported for point regression, `TensorPoint`

They all apply to Bounding Boxes as well :slight_smile:
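For example, something like this avoids the crop (just a sketch, not the original code):

item_tfms = [Resize(256, method = ResizeMethod.Squish)]  # squish instead of the default crop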

1 Like

The problem is solved now. The cause is that RandomResizedCropGPU in aug_transforms doesn’t support transforming TensorBBox/TensorPoint (i.e. RandomResizedCropGPU has no type-dispatched encodes method for them). Disabling RandomResizedCropGPU in aug_transforms (by setting min_scale = 1.) solves the problem.
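You can check this yourself: a transform’s encodes is a TypeDispatch object, and printing it lists the types the transform knows how to encode:

tfm = RandomResizedCropGPU(size = 256, min_scale = 0.85)
print(tfm.encodes)  # type-dispatch table; no entry for TensorBBox or TensorPoint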

Below are the corrected code snippets:

def build_dblock(data_path, resize_sz, norm, rand_seed = 144, test_mode = False):
    json_path = data_path / 'train_mini.json' if test_mode else data_path / 'train.json'
    _, _, img2bbox = decode_coco_json(json_path)
    
    blks = (ImageBlock, BBoxBlock, BBoxLblBlock)
    
    get_ids_func = get_img_ids(json_path)  # get_items function: lists image ids from the json
    getters_func = [lambda o: data_path / 'train' / o,  # image file path
                    lambda o: img2bbox[o][0],           # bboxes for this image
                    lambda o: img2bbox[o][1]]           # bbox labels
    
    rand_splitter = RandomSplitter(valid_pct = 0.2, seed = rand_seed)
    batch_tfms = aug_transforms(size = resize_sz, 
                                min_scale = 1.,  # <-- disable RandomResizedCropGPU
                                do_flip = True)
    if norm: 
        batch_tfms += [Normalize.from_stats(*imagenet_stats)]
    
    dblock = DataBlock(
        blocks = blks, splitter = rand_splitter,
        get_items = get_ids_func, getters = getters_func,
        batch_tfms = batch_tfms, n_inp = 1
        )
    return dblock

def build_dataloaders(
    data_path, bs, resize_sz = 256, 
    norm = False, rand_seed = 144, test_mode = False
):
    """
    :param:
        data_path : str/ Path, path to wheat datasets
        resize_sz : int, length after resized (assume square)
        rand_seed : int, andom seed id
    """
    if isinstance(data_path, str):
        data_path = Path(data_path)
        
    dblk = build_dblock(data_path, resize_sz, norm = norm, 
                        rand_seed = rand_seed, test_mode = test_mode)
    dls = dblk.dataloaders(data_path / 'train', bs = bs)
    dls.c = 2
    return dls

With the above code, I now get the correct image/bbox pairs.

Lastly, thanks @muellerzr for the reminder about cropping.

3 Likes

Great solution @riven314!

1 Like

Has anyone figured out how to decode bounding box transforms? I find that some of the bounding boxes have negative values after the transforms are applied, and simply calling .decode on the dls doesn’t change them back to their actual values.

@tendo you need to call decode_batch() and pass in your input and the input coordinates.
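For example, a minimal sketch (assuming a dls built as in the code above):

b = dls.one_batch()  # transformed batch: images, scaled bboxes, labels
dec = dls.decode_batch(b, max_n = 4)  # samples with bboxes decoded back to pixel coordinates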

This works, thanks!

Hi! Do you have a workbook you can share?

Hello,

I’m trying to do object detection and predict on new images (not included in the train, val, or test sets).

I tried 1) your object detection notebook, 2) Object detection using fastai v2, and 3) fastai v1, but they all just predict on their validation set, not on an unlabeled “new” image, so all three methods failed for me.

Is there any other method? Is IceVision the only option?

Sorry for my bad English!