How to Get Annotations from a CSV

charliec · May 13, 2020, 4:16pm

There is a new Kaggle competition with bbox and I’m following:

muellerzr/Practical-Deep-Learning-for-Coders-2.0/blob/master/Computer Vision/06_Object_Detection.ipynb

I’m using it to try to get the bboxes, but the competition’s annotations are not in JSON and have no labels.
The label is essentially always the same, “wheat”, but I’m wondering if anyone has a suggestion on how to do the get_annotations step, and whether you would add the labels to the dataset?

This is the competition data:

I’ve looked at a few ways to do this but I’m stuck bouncing around various ideas.

muellerzr · May 13, 2020, 4:21pm

You’ll have better luck with this in the fastai2 subforum, so I’ve moved it there. I’d recommend looking at the get_y function for keypoints that was shown in the DataBlock tutorial notebook (notebook 50). You’d want a get_y that reads your DataFrame and grabs the coordinates in the same way. There are a few challenges you’ll have to think about, such as how to deal with an image that has no bounding box, no label, or both! (We presume all labels are wheat in this competition, though.) My best advice would be to simply ignore those and train on what’s fully labelled.
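For anyone reading along, here is a rough sketch of the lookup such a get_y could sit on top of. It is not the starter kernel's code. It uses only the standard library, and the column names `image_id` and `bbox`, the sample rows, and the helper names `build_lookup`, `get_bboxes` and `get_labels` are all assumptions made up for illustration. It assumes one CSV row per box, with the box stored as text.

```python
import ast
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for the competition CSV; the column names
# `image_id` and `bbox` are assumptions, not confirmed in this thread.
SAMPLE_CSV = '''image_id,bbox
img_a,"[834.0, 222.0, 56.0, 36.0]"
img_a,"[226.0, 548.0, 130.0, 58.0]"
img_b,"[10.0, 20.0, 30.0, 40.0]"
'''

def build_lookup(csv_file):
    """Map each image id to the list of its boxes, one CSV row per box."""
    lookup = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # The bbox cell is text such as "[834.0, 222.0, 56.0, 36.0]"
        lookup[row["image_id"]].append(ast.literal_eval(row["bbox"]))
    return dict(lookup)

lookup = build_lookup(io.StringIO(SAMPLE_CSV))

def get_bboxes(image_id):
    """Return every box recorded for one image (raw values from the CSV)."""
    return lookup[image_id]

def get_labels(image_id):
    """Every box gets the same label, since the only class is wheat."""
    return ["wheat"] * len(lookup[image_id])

# Only images that appear in the CSV have boxes; train on those.
labelled_ids = sorted(lookup)
```

With a real file you would pass an open file handle to `build_lookup` instead of the `io.StringIO` sample, and images missing from `labelled_ids` are the unlabelled ones you would skip.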

Edit: I’m making a starter kernel for this currently, will update when it’s done

charliec · May 13, 2020, 4:35pm

Awesome, thank you. And thanks for sharing on GitHub too. That’s great stuff.

muellerzr · May 13, 2020, 5:16pm

Here it is @charliec https://www.kaggle.com/muellerzr/fastai2-starter-kernel

charliec · May 15, 2020, 11:05am

This is great. Thank you. I’m going to work through it this morning!

charliec · May 15, 2020, 12:10pm

@muellerzr In lines 26 through 31, can you help me understand how you looked at the original bbox DataFrame, saw:
[834.0, 222.0, 56.0, 36.0]
and realized it needed to be different?

How did you know what the outcome for this bbox needed to be? I guess what I’m asking is: you go through a bunch of steps to change that array, but I wouldn’t have known to change it because I’m not sure what the end result should be. What documentation says it needs to be:

x,y,w,h and we want x1,y1,x2,y2
and
we need to add our width and height to the respective x and y

I think I get how to get to x1,y1 and x2,y2, but I’m not sure I would have known I needed to do that. I hope this makes sense.
Thanks for the kernel and sorry for the questions.

muellerzr · May 15, 2020, 12:24pm

That’s simply how fastai works. You can tell there’s an issue because if we don’t do this, show_batch gives us very weird boxes and it doesn’t look right! (Also, I originally made this mistake myself without realizing it, then reread the data format and saw it was so.) To make the adjustment we add the width to x and the height to y.

I’m also very familiar with the source notebooks. Check out here where the BBox is worked with: https://dev.fast.ai/vision.core#TensorBBox. You’ll see the correct format mentioned.
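As a quick illustration of that adjustment, a minimal sketch applied to the box quoted earlier in the thread. The function name `xywh_to_xyxy` is made up for this example and is not part of fastai.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x1, y1, x2, y2]."""
    x, y, w, h = box
    # The second corner is the first corner shifted by the width and height
    return [x, y, x + w, y + h]

print(xywh_to_xyxy([834.0, 222.0, 56.0, 36.0]))  # [834.0, 222.0, 890.0, 258.0]
```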

charliec · May 16, 2020, 12:55pm

Thank you for the link. I learned a ton from your notebook. I’m not going to claim my Python skills allowed me to skim through it. You did some serious data wrangling.
It does make me wonder if there might be a function for future bbox CSV files. I’ve only looked at a few like this, so I’m not sure if there is a standard data source file format for bboxes.

muellerzr · May 16, 2020, 1:52pm

We have one for the JSON wrangling, but not for CSV wrangling. In terms of weaknesses, fastai isn’t very strong for object detection, and I think we need people comfortable with object detection to help improve it. One of its strengths, though, is that we have so much more data augmentation available for these problems.

Tendo · May 17, 2020, 6:04am

Have you tried porting the Faster R-CNN PyTorch implementation to fastai2? I’m not sure how to properly port the loss function calculation. Also, I agree that object detection with fastai is still in its infancy, but hopefully that will change soon.

charliec · May 17, 2020, 1:03pm

Tendo, you read my mind. I was going to ask the same. If anyone wants to team up on Kaggle using fastai, I’m up for it!