Part 2 Lesson 9 wiki

He didn’t really mean 4+4+21=26?

3 Likes

Does creating pandas DataFrames introduce any overhead that might compromise performance?

1 Like

Maybe it works. We might have to do an experiment and see for ourselves.

Yes, because if one kind of input has its loss around 10 and the other is around 1, the former will dominate the loss, and the model won’t learn to do a good job for the latter.

2 Likes
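
To make the scale argument concrete, here is a minimal sketch. The magnitudes (10 and 1) come from the post above; the function name `combined_loss` and the `class_weight` multiplier are made up for illustration and are not taken from the lesson notebook.

```python
def combined_loss(box_loss, class_loss, class_weight=1.0):
    """Sum two task losses after rescaling the classification term."""
    return box_loss + class_weight * class_loss

# Hypothetical magnitudes: box loss around 10, classification loss around 1.
box_loss, class_loss = 10.0, 1.0

unscaled = combined_loss(box_loss, class_loss)
scaled = combined_loss(box_loss, class_loss, class_weight=10.0)

# Fraction of the total loss contributed by the classification term.
share_unscaled = class_loss / unscaled
share_scaled = (10.0 * class_loss) / scaled

print(f"classification share without scaling: {share_unscaled:.2f}")
print(f"classification share with a 10x weight: {share_scaled:.2f}")
```

Without the weight, the classification term is under a tenth of the total, so most of the gradient signal comes from the box term. Multiplying it so both terms sit at a similar scale is one simple way to balance them.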

The 3 objects that print out are all indexed somewhere. Where is that again?

Just want to say Lecture 8 CS231N W2016 and Lecture 11 CS231N S2017 make excellent companion lectures to this lesson. The first halves are more or less on R-CNN, but the latter halves cover YOLO/SSD.

20 Likes

Use a conv layer that outputs that… not necessarily stride 2.

2 Likes

Does anybody have a good article to explain Jaccard Indexes?

3 Likes
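
In case a worked example helps: the Jaccard index of two boxes is the area of their intersection divided by the area of their union (also called IoU). This is a minimal sketch in plain Python. The `(top, left, bottom, right)` box layout and the function name `jaccard` are assumptions for illustration, not the notebook's own code.

```python
def jaccard(box_a, box_b):
    """Intersection over union of two boxes given as (top, left, bottom, right)."""
    # Corners of the overlapping rectangle, if there is one.
    top = max(box_a[0], box_b[0])
    left = max(box_a[1], box_b[1])
    bottom = min(box_a[2], box_b[2])
    right = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, bottom - top) * max(0.0, right - left)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1)
print(jaccard((0, 0, 2, 2), (1, 1, 3, 3)))
```

Identical boxes score 1, boxes that do not overlap score 0, and everything else falls in between.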

Random question: what does the difficulty of object detection in an image say about captchas? Are they hard for a bot because it’s computationally expensive to do what the captcha asks humans to do, i.e. identify what is and isn’t in an image (usually used for password protection/security questions)?

4 Likes

I want to know the answer to this too!

Why would a spatial transformer like this not work?

http://pytorch.org/tutorials/intermediate/spatial_transformer_tutorial.html

1 Like

Your captcha answers are being used as data to train neural nets, so I would bet that if deep learning can’t break captchas yet, it will be able to very soon.

1 Like

You could even build this yourself for a small fee by getting a site to crowdsource solving these for your data.

How do the bounding boxes span across the anchor boxes? Aren’t we predicting if the object is in the anchor box?

1 Like

Okay, I missed something: why are the bounding boxes found not exactly equal to the anchor boxes?

2 Likes

Probably by merging boxes that have the same label and span across anchor boxes.

The anchor boxes evenly divide the image (say into a 2x2 or 4x4 grid), while the bounding boxes are still the rectangles that closely surround the object.

3 Likes
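
A small sketch of that distinction, assuming coordinates scaled to [0, 1] and boxes stored as `(top, left, bottom, right)`. The helper names and the example ground-truth box are hypothetical.

```python
def anchor_grid(n):
    """Return n*n equal anchor boxes as (top, left, bottom, right) in [0, 1]."""
    size = 1.0 / n
    return [
        (r * size, c * size, (r + 1) * size, (c + 1) * size)
        for r in range(n)
        for c in range(n)
    ]

def anchor_containing_center(box, n):
    """Index of the grid cell that holds the center of `box`."""
    center_y = (box[0] + box[2]) / 2
    center_x = (box[1] + box[3]) / 2
    row = min(int(center_y * n), n - 1)
    col = min(int(center_x * n), n - 1)
    return row * n + col

anchors = anchor_grid(2)              # fixed cells, independent of the image content
ground_truth = (0.1, 0.3, 0.6, 0.9)   # hypothetical box hugging an object

idx = anchor_containing_center(ground_truth, 2)
print(anchors[idx], "vs", ground_truth)
```

The anchor cell is fixed by the grid, while the ground-truth box has its own position and shape and can spill over several cells, which is why the two are not equal.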

Okay, I missed this point:

Are t, x one-hot encoded?

What are their dimensions?

How do we decide how the bounding box is aligned? Stretched horizontally or vertically?

Or are we simply combining anchor boxes based on IOU?

1 Like

Who came up with the Jaccard Index trick? It’s pretty cool!

And with the one-hot encoding in the loss function, why did he add one and then immediately subtract it? How is that different from not adding one and not subtracting it?
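
One possible reading, sketched in plain Python. This assumes the background class is stored as the extra last index, so the "add one" makes room for it during one-hot encoding and the "subtract" drops that column afterwards. The helper `one_hot` and the class counts are hypothetical, and the lesson notebook may do this differently.

```python
def one_hot(labels, num_classes):
    """One-hot encode integer labels into rows of length num_classes."""
    return [[1.0 if j == y else 0.0 for j in range(num_classes)] for y in labels]

num_classes = 3       # hypothetical: 3 real classes, index 3 means background
labels = [0, 3, 2]    # the middle anchor is background

# "Add one": encode with an extra column so the background index is valid.
with_background = one_hot(labels, num_classes + 1)

# "Subtract it": drop the background column again.
targets = [row[:-1] for row in with_background]

for row in targets:
    print(row)
```

Under this assumption the two steps do not cancel out: a background anchor ends up as an all-zeros row, which tells the model "none of the real classes", whereas encoding with only `num_classes` columns would have no valid slot for the background label at all.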