Part 2 Lesson 8 wiki

This is my current understanding. I am still trying to understand the code and how it works, but what he said makes sense: we verify the predicted bounding-box coordinates against the original ones on the given dataset (so we need to train the model accordingly).

And probably use area, or Euclidean distance, or something like that as a metric to calculate the error involved (a rough sketch of both ideas is below).

Drawing a bbox is easy, as we just need to pass the coordinates, just as Jeremy showed in the notebooks.
The main thing is to detect the objects in the given image, as there can be multiples of the same type…
Hope someone will correct me if my understanding is off.
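For concreteness, here is a minimal sketch (my own illustration, not code from the lesson) of two such error metrics between a predicted box and a ground-truth box, both given as [top-left row, top-left col, bottom-right row, bottom-right col]: the mean absolute coordinate distance and the area-based intersection over union (IoU).

    import numpy as np

    def coord_error(pred, truth):
        # mean absolute distance between the four corner coordinates
        return np.abs(np.array(pred) - np.array(truth)).mean()

    def iou(pred, truth):
        # area-based metric: intersection over union of the two boxes
        top, left = max(pred[0], truth[0]), max(pred[1], truth[1])
        bottom, right = min(pred[2], truth[2]), min(pred[3], truth[3])
        inter = max(0, bottom - top) * max(0, right - left)
        area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area(pred) + area(truth) - inter)

    print(coord_error([90, 150, 260, 340], [96, 155, 269, 350]))  # 7.5
    print(iou([90, 150, 260, 340], [96, 155, 269, 350]))          # ~0.85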


I’ll have to rewatch the lectures later today / look at the notebook to give an exact answer.

But I think you might be on the right track there (probably more like some loss function on each feature/class). Perfect time to open the notebook and read some code. :wink:

Can someone tell me how to use open_image()? I am getting this TypeError from pathlib.py (TypeError: expected str, bytes or os.PathLike object, not dict). I’ve tried typecasting but it isn’t working.

What are you passing? It looks like a dictionary.

Yes, thank you.

I tried with open_image(IMG_PATH/im0_d[FILE_NAME]). I tried typecasting as well; it’s not working.

Just paste IMG_PATH/im0_d[FILE_NAME] in a cell and see what it is. Then paste type(IMG_PATH/im0_d[FILE_NAME]) in a cell and see what that is. I haven’t run the code yet, so I don’t know off the top of my head.

For PyCharm and Mac users: PyCharm equivalents of the shortcuts Jeremy provided for Visual Studio Code.

Action (PyCharm + Mac shortcut)
Command palette: Shift + Command + A
Select interpreter (for the fastai env): Shift + Command + A, then search for “interpreter”
Select terminal shell: Shift + Command + A, then search for “terminal shell”
Go to symbol: Option + Command + O
Find references: Command + G (next reference), Command + Shift + G (previous reference), Command + Function + F7 (find all)
Go to definition: Command + Down Arrow
Go back: Command + [
View documentation: Option + Space (view source), Function + F1 (view documentation)
Hide sidebar: Command + 1 (press again to bring it back)
Zen mode: Control + Command + F (same shortcut to exit)

Find them all with the Shift + Command + A palette option for reference.

Probably not the best list (I’d love suggestions), and perhaps I should create a new thread for it too; I just wanted to leave myself a note. I didn’t use symbols/shorthand for the keys because they gave me trouble when I was a new Mac user.


I tried with IMG_PATH/im0_d[FILE_NAME]; it gives the same error.

You will have to give more information than this (exact input and exact output), otherwise it’s not possible to help. Read http://wiki.fast.ai/index.php/How_to_ask_for_Help

This is awesome, thanks! I’ve been using PyCharm and thought I should find the equivalents (especially “Go back”). Thanks for taking the time to write this up.


If you need to typeset some pretty math:

  • Markdown cells accept LaTeX math between dollar signs: $\alpha$ renders as α (this now works in Discourse too); see the example after this list.
  • There is an awesome interactive online service for converting drawings into LaTeX math symbols.
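For instance, a markdown cell like the following (using the L1 loss that comes up later in this thread as the example formula) renders both inline and display math:

    The bounding-box head is trained with the L1 loss $\ell_1$:

    $$ \ell_1(\hat{y}, y) = \frac{1}{n} \sum_{i=1}^{n} \lvert \hat{y}_i - y_i \rvert $$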

What is the output of im0_d[FILE_NAME]?

This is probably nothing but I’m wondering if there’s a reason for the -1 in the definition of the train annotations dictionary:
bb = np.array([bb[1], bb[0], bb[3]+bb[1]-1, bb[2]+bb[0]-1])

If we look at the first annotation on image 12, the initial bbox is [155, 96, 196, 174] and the segmentation is [155, 96, 155, 270, 351, 270, 351, 96], which I interpret as the rectangle with (96,155) top left and (270,351) bottom right. So we would want [96, 155, 270, 351] as our new bbox, while we have [96, 155, 269, 350] in the notebook.

Furthermore, the bb_hw(a) function doesn’t do the inverse operation, since bb_hw([96, 155, 269, 350]) returns [155, 96, 195, 173]; we would need
return np.array([a[1], a[0], a[3]-a[1]+1, a[2]-a[0]+1])
to be consistent.
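Here’s a quick sketch of that round trip using the conversion as defined in the notebook and the inverse proposed above (the example numbers are the ones from image 12):

    import numpy as np

    def hw_bb(bb):
        # [x, y, width, height] -> [top-left row, col, bottom-right row, col]
        # the -1 makes the bottom-right corner point at the last pixel inside the box
        return np.array([bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1])

    def bb_hw(a):
        # proposed inverse: the +1 restores the original width and height
        return np.array([a[1], a[0], a[3] - a[1] + 1, a[2] - a[0] + 1])

    bb = np.array([155, 96, 196, 174])        # x, y, w, h from the annotation
    print(hw_bb(bb))                          # [ 96 155 269 350]
    assert (bb_hw(hw_bb(bb)) == bb).all()     # round trip is exact with matching -1/+1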


Our training set (nearly) always has labels for us. Our goal is to train a model that adds those labels to data that doesn’t have them.

For instance, a trained object detection model could be used in a self-driving car to identify the location of pedestrians and other cars.


The loss function in the single bounding box model we trained is L1 loss. You can see where I set learn.crit (and this is discussed in the video).
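As a minimal illustration of what that criterion computes (the tensor values here are made up):

    import torch
    import torch.nn as nn

    # a made-up batch of predicted and ground-truth boxes: [row, col, row, col]
    pred  = torch.tensor([[ 90., 150., 260., 340.]])
    truth = torch.tensor([[ 96., 155., 269., 350.]])

    crit = nn.L1Loss()            # in the notebook, this is what learn.crit is set to
    print(crit(pred, truth))      # tensor(7.5000) - mean absolute coordinate error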

I suggest you re-watch the video today and hopefully it’ll all make sense! :slight_smile: (and if not, don’t hesitate to ask…)


I’ve added the lesson video to the top post now.


I normally just go back to the debugging prompt and hit ‘q’.
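In other words (a tiny sketch of that pdb workflow; the function here is just an example):

    import pdb

    def buggy(x):
        pdb.set_trace()        # execution pauses here at a (Pdb) prompt
        return x / 0

    # at the (Pdb) prompt, type `q` (quit) to abandon the run and get the notebook back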


Does anyone know of any Bounding Box annotation tools, or specific crowdsource services, that we could use for our own datasets?


That’s because OpenCV doesn’t support pathlib. The correct way to call it is:

open_image(str(PATH/'something'))
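A minimal sketch of the underlying issue (the path and filename here are hypothetical): cv2.imread expects a string path, not a pathlib.Path, so convert explicitly.

    from pathlib import Path
    import cv2

    IMG_PATH = Path('data/pascal/VOC2007/JPEGImages')   # hypothetical image directory
    fname = '000012.jpg'                                 # hypothetical filename

    im = cv2.imread(str(IMG_PATH / fname))               # convert the Path to str first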