Part 2 Lesson 8 wiki

(ecdrid) #100

This is my current understanding.
I am still trying to understand the code and how it works, but what he said makes sense: we verify the predicted bbox coordinates against the original ones by predicting them on the given dataset (so we need to train the model accordingly).

And we probably use area, Euclidean distance, or something similar as a metric to calculate the error involved.

Drawing a bbox is easy, as we just need to pass the coordinates, as Jeremy showed in the notebooks.
The main thing is to detect the object in the given image, as there can be multiple objects of the same type…
I hope someone will correct me if my understanding is wrong.

(Suvash Thapaliya) #101

I’ll have to rewatch the lectures later today / look at the notebook to give an exact answer.

But I think you might be on the right track there (probably more like some loss function on each feature/class). Perfect time to open the notebook and read some code. :wink:


Can someone tell me how to use open_image()? I am getting this error: TypeError: expected str, bytes or os.PathLike object, not dict. I’ve tried typecasting, but it is not working.

(Aseem Bansal) #103

what are you passing? Looks like a dictionary.

(Vikas Bahirwani) #104

Yes Thank you


I tried this: open_image(IMG_PATH/im0_d[FILE_NAME]). I tried typecasting as well, but it’s not working.

(Aseem Bansal) #106

Just paste IMG_PATH/im0_d[FILE_NAME] in a cell and see what it is. Then paste type(IMG_PATH/im0_d[FILE_NAME]) in a cell and see what that is. I haven’t run the code yet so don’t know off the top of my head.
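
Those two checks can be sketched as follows. This is only an illustration: the values given to IMG_PATH, im0_d and FILE_NAME here are made-up stand-ins, not the notebook's actual data, and is_pathlike is a hypothetical helper that tests for the types named in the error message.

```python
import os
from pathlib import Path

# Made-up stand-ins for the notebook variables (assumptions, not the real data).
IMG_PATH = Path("JPEGImages")
FILE_NAME = "file_name"
im0_d = {"file_name": "000012.jpg", "id": 12}

def is_pathlike(x):
    """True if x is one of the types the error message asks for."""
    return isinstance(x, (str, bytes, os.PathLike))

# Inspect each piece and its type, one at a time.
name = im0_d[FILE_NAME]
full_path = IMG_PATH / name
print(name, type(name))
print(full_path, type(full_path))

# The whole record is a dict, so passing it where a path is expected
# would trigger a TypeError like the one reported.
print(is_pathlike(full_path), is_pathlike(im0_d))
```

If the printed type is dict rather than str or a Path, the lookup is returning a whole record and one more key is needed to reach the file name.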

(Sneha Nagpaul) #107

For PyCharm and Mac users - a list of the shortcuts Jeremy provided for Visual Studio Code.

Action (PyCharm + Mac shortcut)
Command palette (Shift + Command + A)
Select interpreter (for fastai env): (Shift + Command + A), then look for “interpreter”
Select terminal shell: (Shift + Command + A), then look for “terminal shell”
Go to symbol (Option + Command + O)
Find references: (Command + G) to go down in the references, (Command + Shift + G) to go up, (Command + Function + F7) to look for all
Go to definition (Command + Down Arrow Key)
Go back (Command + [)
View documentation: (Option + Space) for viewing source, (Function + F1) for viewing documentation
Hide sidebar (Command + 1); repeating it brings the sidebar back
Zen mode (Control + Command + F); the same shortcut exits it

Find them all with the (Shift + Command+ A) palette option for reference.

Probably not the best list (would love suggestions) and perhaps should create a new thread for it too. Just wanted to leave myself a note. Didn’t use symbols/shorthand for keys because I had trouble with them as a new Mac user once when I didn’t use shortcuts.

I tried IMG_PATH/im0_d[FILE_NAME]; it gives the same error.

(Aseem Bansal) #110

You will have to give more information than this: the exact input and the exact output. Otherwise it’s not possible to help. Read

(Ken) #111

This is awesome, thanks! I’ve been using PyCharm and thought I should find the equivalents (especially “Go back”). Thanks for taking the time to write this up.

(Emil) #112

If you need to typeset some pretty math:

  • Markdown cells accept LaTeX math inside dollar symbols: $\alpha$ becomes α (now it works in Discourse too).
  • There is an awesome interactive online service for converting drawings into LaTeX math symbols.

(karla f) #113

what is the output of im0_d[FILE_NAME]?


This is probably nothing, but I’m wondering if there’s a reason for the -1 in the definition of the train annotations dictionary:
bb = np.array([bb[1], bb[0], bb[3]+bb[1]-1, bb[2]+bb[0]-1])

If we look at the first annotation on image 12, the initial bbox is [155, 96, 196, 174] and the segmentation is [155, 96, 155, 270, 351, 270, 351, 96], which I interpret as the rectangle with top left (96,155) and bottom right (270,351). So we would want [96,155,270,351] as our new bbox, while we have [ 96, 155, 269, 350] in the notebook.

Furthermore, the bb_hw(a) function doesn’t do the inverse operation, as bb_hw([ 96, 155, 269, 350]) returns [155,96,195,173]. We would need to have
return np.array([a[1],a[0],a[3]-a[1]+1,a[2]-a[0]+1])
to be consistent.
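
The mismatch described above can be checked with a short round trip. This is a sketch using plain Python lists rather than np.array; hw_to_corners, bb_hw_current and bb_hw_consistent are hypothetical names for the conversion quoted above, the bb_hw behaviour implied by the reported output, and the proposed fix.

```python
def hw_to_corners(bb):
    """[x, y, width, height] -> [top, left, bottom, right], with the -1 from the notebook line."""
    return [bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1]

def bb_hw_current(a):
    """Inverse without the +1; assumed from the output reported above."""
    return [a[1], a[0], a[3] - a[1], a[2] - a[0]]

def bb_hw_consistent(a):
    """Inverse with the +1, as proposed above."""
    return [a[1], a[0], a[3] - a[1] + 1, a[2] - a[0] + 1]

original = [155, 96, 196, 174]
corners = hw_to_corners(original)
print(corners)                    # [96, 155, 269, 350]
print(bb_hw_current(corners))     # [155, 96, 195, 173]
print(bb_hw_consistent(corners))  # [155, 96, 196, 174]
```

One possible reading of the -1: a box of width 196 starting at x = 155 covers pixels 155 through 350 inclusive, so the converted corner is the last pixel inside the box rather than the first pixel outside it.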

(Jeremy Howard (Admin)) #115

Our training set (nearly) always has labels for us. Our goal is to train a model that adds those labels to data that doesn’t have them.

For instance, a trained object detection model could be used in a self-driving car to identify the location of pedestrians and other cars.

(Jeremy Howard (Admin)) #116

The loss function in the single bounding box model we trained is L1 loss. You can see where I set learn.crit (and this is discussed in the video).
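
As a rough illustration of what L1 loss measures for a bounding box, here is a plain-Python sketch (not the fastai or PyTorch code): the mean absolute difference between predicted and target coordinates. The function name l1_loss and the sample numbers are made up for the example.

```python
def l1_loss(pred, target):
    """Mean absolute difference between two equal-length coordinate lists."""
    assert len(pred) == len(target)
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

target = [96, 155, 269, 350]  # a ground-truth box
pred = [100, 150, 260, 355]   # a hypothetical prediction
print(l1_loss(pred, target))  # (4 + 5 + 9 + 5) / 4 = 5.75
```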

I suggest you re-watch the video today and hopefully it’ll all make sense! :slight_smile: (and if not, don’t hesitate to ask…)

(Jeremy Howard (Admin)) #117

I’ve added the lesson video to the top post now.

(Jeremy Howard (Admin)) #118

I normally just go back to the debugging prompt and hit ‘q’.

(RobG) #119

Does anyone know of any Bounding Box annotation tools, or specific crowdsource services, that we could use for our own datasets?

(Jeremy Howard (Admin)) #120

That’s because opencv doesn’t support pathlib. The correct way to call it is: