Part 2 Lesson 8 wiki

This is my current understanding. I am still trying to understand the code and how it works, but what he said makes sense: we verify the predicted bounding-box coordinates against the original ones on the given dataset (so we need to train the model accordingly).

And probably use area, or Euclidean distance, or something like that as a metric to calculate the error involved (a rough sketch of both ideas is below).

Drawing a bbox is easy, as we just need to pass the coordinates, just as Jeremy showed in the notebooks.
The main thing is to detect the objects in the given image, as there can be multiples of the same type…
Hope someone will correct me if my understanding is off.
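For concreteness, here is a minimal sketch (my own illustration, not code from the lesson) of two such error metrics between a predicted box and a ground-truth box, both given as [top-left row, top-left col, bottom-right row, bottom-right col]: the mean absolute coordinate distance and the area-based intersection over union (IoU).

    import numpy as np

    def coord_error(pred, truth):
        # mean absolute distance between the four corner coordinates
        return np.abs(np.array(pred) - np.array(truth)).mean()

    def iou(pred, truth):
        # area-based metric: intersection over union of the two boxes
        top, left = max(pred[0], truth[0]), max(pred[1], truth[1])
        bottom, right = min(pred[2], truth[2]), min(pred[3], truth[3])
        inter = max(0, bottom - top) * max(0, right - left)
        area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area(pred) + area(truth) - inter)

    print(coord_error([90, 150, 260, 340], [96, 155, 269, 350]))  # 7.5
    print(iou([90, 150, 260, 340], [96, 155, 269, 350]))          # ~0.85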


I’ll have to rewatch the lectures later today / look at the notebook to give an exact answer.

But I think you might be on the right track there (probably more like some loss function on each feature/class). Perfect time to open the notebook and read some code. :wink:

Can someone tell me how to use open_image()? I am getting this TypeError from pathlib.py (TypeError: expected str, bytes or os.PathLike object, not dict). I’ve tried typecasting but it isn’t working.

What are you passing? It looks like a dictionary.

Yes, thank you.

I tried with open_image(IMG_PATH/im0_d[FILE_NAME]). I tried typecasting as well; it’s not working.

Just paste IMG_PATH/im0_d[FILE_NAME] in a cell and see what it is. Then paste type(IMG_PATH/im0_d[FILE_NAME]) in a cell and see what that is. I haven’t run the code yet, so I don’t know off the top of my head.

For PyCharm and Mac users: PyCharm equivalents of the shortcuts Jeremy provided for Visual Studio Code.

Action (PyCharm + Mac shortcut)
Command palette: Shift + Command + A
Select interpreter (for the fastai env): Shift + Command + A, then search for “interpreter”
Select terminal shell: Shift + Command + A, then search for “terminal shell”
Go to symbol: Option + Command + O
Find references: Command + G (next reference), Command + Shift + G (previous reference), Command + Function + F7 (find all)
Go to definition: Command + Down Arrow
Go back: Command + [
View documentation: Option + Space (view source), Function + F1 (view documentation)
Hide sidebar: Command + 1 (press again to bring it back)
Zen mode: Control + Command + F (same shortcut to exit)

Find them all with the Shift + Command + A palette option for reference.

Probably not the best list (I’d love suggestions), and perhaps I should create a new thread for it too; I just wanted to leave myself a note. I didn’t use symbols/shorthand for the keys because they gave me trouble when I was a new Mac user.


I tried with IMG_PATH/im0_d[FILE_NAME]; it gives the same error.

You will have to give more information than this (exact input and exact output), otherwise it’s not possible to help. Read http://wiki.fast.ai/index.php/How_to_ask_for_Help

This is awesome, thanks! I’ve been using PyCharm and thought I should find the equivalents (especially “Go back”). Thanks for taking the time to write this up.


If you need to typeset some pretty math:

  • Markdown cells accept LaTeX math between dollar signs: $\alpha$ renders as α (this now works in Discourse too); see the example after this list.
  • There is an awesome interactive online service for converting drawings into LaTeX math symbols.
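For instance, a markdown cell like the following (using the L1 loss that comes up later in this thread as the example formula) renders both inline and display math:

    The bounding-box head is trained with the L1 loss $\ell_1$:

    $$ \ell_1(\hat{y}, y) = \frac{1}{n} \sum_{i=1}^{n} \lvert \hat{y}_i - y_i \rvert $$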

What is the output of im0_d[FILE_NAME]?

This is probably nothing but I’m wondering if there’s a reason for the -1 in the definition of the train annotations dictionary:
bb = np.array([bb[1], bb[0], bb[3]+bb[1]-1, bb[2]+bb[0]-1])

If we look at the first annotation on image 12, the initial bbox is [155, 96, 196, 174] and the segmentation is [155, 96, 155, 270, 351, 270, 351, 96], which I interpret as the rectangle with (96,155) top left and (270,351) bottom right. So we would want [96, 155, 270, 351] as our new bbox, while we have [96, 155, 269, 350] in the notebook.

Furthermore, the bb_hw(a) function doesn’t do the inverse operation, since bb_hw([96, 155, 269, 350]) returns [155, 96, 195, 173]; we would need
return np.array([a[1], a[0], a[3]-a[1]+1, a[2]-a[0]+1])
to be consistent.
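Here’s a quick sketch of that round trip using the conversion as defined in the notebook and the inverse proposed above (the example numbers are the ones from image 12):

    import numpy as np

    def hw_bb(bb):
        # [x, y, width, height] -> [top-left row, col, bottom-right row, col]
        # the -1 makes the bottom-right corner point at the last pixel inside the box
        return np.array([bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1])

    def bb_hw(a):
        # proposed inverse: the +1 restores the original width and height
        return np.array([a[1], a[0], a[3] - a[1] + 1, a[2] - a[0] + 1])

    bb = np.array([155, 96, 196, 174])        # x, y, w, h from the annotation
    print(hw_bb(bb))                          # [ 96 155 269 350]
    assert (bb_hw(hw_bb(bb)) == bb).all()     # round trip is exact with matching -1/+1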


Our training set (nearly) always has labels for us. Our goal is to train a model that adds those labels to data that doesn’t have them.

For instance, a trained object detection model could be used in a self-driving car to identify the location of pedestrians and other cars.


The loss function in the single bounding box model we trained is L1 loss. You can see where I set learn.crit (and this is discussed in the video).
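As a minimal illustration of what that criterion computes (the tensor values here are made up):

    import torch
    import torch.nn as nn

    # a made-up batch of predicted and ground-truth boxes: [row, col, row, col]
    pred  = torch.tensor([[ 90., 150., 260., 340.]])
    truth = torch.tensor([[ 96., 155., 269., 350.]])

    crit = nn.L1Loss()            # in the notebook, this is what learn.crit is set to
    print(crit(pred, truth))      # tensor(7.5000) - mean absolute coordinate error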

I suggest you re-watch the video today and hopefully it’ll all make sense! :slight_smile: (and if not, don’t hesitate to ask…)


I’ve added the lesson video to the top post now.


I normally just go back to the debugging prompt and hit ‘q’.
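In other words (a tiny sketch of that pdb workflow; the function here is just an example):

    import pdb

    def buggy(x):
        pdb.set_trace()        # execution pauses here at a (Pdb) prompt
        return x / 0

    # at the (Pdb) prompt, type `q` (quit) to abandon the run and get the notebook back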


Does anyone know of any Bounding Box annotation tools, or specific crowdsource services, that we could use for our own datasets?


That’s because OpenCV doesn’t support pathlib. The correct way to call it is:

open_image(str(PATH/'something'))
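A minimal sketch of the underlying issue (the path and filename here are hypothetical): cv2.imread expects a string path, not a pathlib.Path, so convert explicitly.

    from pathlib import Path
    import cv2

    IMG_PATH = Path('data/pascal/VOC2007/JPEGImages')   # hypothetical image directory
    fname = '000012.jpg'                                 # hypothetical filename

    im = cv2.imread(str(IMG_PATH / fname))               # convert the Path to str first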