What does SSD loss stand for again?
Single Shot Detector (the full name of the model is Single Shot MultiBox Detector)
So it will only detect whatever classes are present. What if the image contains an object that is not one of the classes? Maybe in that case softmax would take a leap forward?
For the multi-label classification, why aren’t we multiplying the categorical loss by a constant like we did before? Before, we multiplied the categorical loss by 20 when we added it to the bounding box loss.
why do we need even more bounding boxes, when one problem is some items (e.g. potted plant) are much larger than the anchor boxes already?
we need anchor boxes of different sizes and different aspect ratios
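To make "different sizes and different aspect ratios" concrete, here is a minimal sketch of how a set of anchor boxes could be generated. The function name `make_anchors`, the 4x4 grid, and the particular scales and ratios are assumptions chosen for illustration, not the values from the lesson notebook.

```python
from itertools import product


def make_anchors(grid_size, scales, aspect_ratios):
    """Return anchors as (cx, cy, w, h) tuples in [0, 1] image coordinates.

    Hypothetical helper: one anchor per grid cell, per scale, per aspect ratio.
    """
    cell = 1.0 / grid_size
    anchors = []
    for row, col in product(range(grid_size), repeat=2):
        # Every anchor in a cell shares the cell's center.
        cx = (col + 0.5) * cell
        cy = (row + 0.5) * cell
        for scale, ratio in product(scales, aspect_ratios):
            # ratio = width / height; the area depends only on the scale.
            w = cell * scale * ratio ** 0.5
            h = cell * scale / ratio ** 0.5
            anchors.append((cx, cy, w, h))
    return anchors


anchors = make_anchors(grid_size=4,
                       scales=[0.75, 1.0, 1.3],
                       aspect_ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # 4 * 4 cells * 3 scales * 3 ratios = 144
```

Larger objects such as the potted plant would be covered by running the same idea on coarser grids (for example 2x2 and 1x1), which gives bigger anchors.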
okay, got it - thanks
anybody else getting lost in all of this or is it only me?
So is the Jaccard index between the Bounding box and the Anchor box?
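For reference, the Jaccard index (also called intersection over union, IoU) of two boxes is the area of their overlap divided by the area of their union. A minimal sketch, assuming boxes are given as (x_min, y_min, x_max, y_max) corners; the function name `jaccard` and the example coordinates are made up for illustration:

```python
def jaccard(box_a, box_b):
    """Jaccard index (IoU) of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlapping region, clamped at zero when disjoint.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 2.0, 2.0)  # a box from the training labels
anchor = (1.0, 1.0, 3.0, 3.0)        # one fixed anchor box
print(jaccard(ground_truth, anchor))  # overlap 1, union 7, so 1/7
```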
Yeah, I’m lost too. I don’t understand why we did all that effort to figure out which anchor box the bounding boxes belonged to. I thought we were saying that the anchor box was going to be the bounding box, but then bounding box came in from somewhere (from training, I guess), but… then… why were we doing anchor boxes?
Bounding boxes from ‘ground truth’ are bounding boxes from the training data.
Anchor boxes are our way of “guessing” where everything is, assuming we don’t know where the objects are, as is the case in testing (and production).
Right?
I think Jeremy didn’t really explain why the anchor boxes are needed to detect multiple objects.
aren’t we modifying the anchor boxes (size, placement) to be bounding boxes?
2 reasons:
[1] you can avoid prediction clashes with more anchor boxes
[2] aspect ratios can help with IOU & clashes (person is vertical / motorcycle is horizontal - so the vertical and horizontal aspect ratios make it easy for the model)
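A small worked example of point [2], under made-up numbers: take a tall "person" box and compare its IoU against three anchors of the same area and the same center but different aspect ratios. The helper `centered_iou` and the box dimensions are assumptions for illustration only.

```python
def centered_iou(w1, h1, w2, h2):
    """IoU of two boxes that share the same center, given widths and heights."""
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


# A tall ground-truth box (e.g. a standing person), in image fractions.
person_w, person_h = 0.2, 0.8
area = person_w * person_h

ious = {}
for name, ratio in [("square", 1.0), ("tall", 0.5), ("wide", 2.0)]:
    # ratio = width / height; every anchor has the same area as the person box.
    anchor_w = (area * ratio) ** 0.5
    anchor_h = (area / ratio) ** 0.5
    ious[name] = centered_iou(person_w, person_h, anchor_w, anchor_h)

for name, value in ious.items():
    print(f"{name}: {value:.2f}")  # square: 0.33, tall: 0.55, wide: 0.21
```

With a typical matching threshold of 0.5, only the tall anchor would match the person here, while a wide anchor at the same location stays free for something like the motorcycle.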
Guess it’ll take a couple of rewatches to grasp all that xD
I’m definitely going to need to put some time in this week to figure it all out.
There are a lot of pieces to keep track of, so this is material you’ll need to go back over a few times. It’s more helpful to have specific questions to ask during the lesson.
I think they only map the dataset so we can assign the SSD loss properly. It doesn’t look like a classifier for the (anchor) boxes.
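Here is a rough sketch of that mapping step, assuming the usual SSD-style rule: each anchor takes the ground-truth box it overlaps most if the IoU is above a threshold, and each ground-truth box also claims its single best anchor so no object is left unmatched. The names `iou` and `match_anchors`, the 0.5 threshold, and the toy boxes are assumptions, not the lesson notebook's code.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_anchors(gt_boxes, anchors, threshold=0.5):
    """For each anchor, return the index of its matched ground-truth box,
    or None if the anchor should be treated as background."""
    assignment = [None] * len(anchors)
    if not gt_boxes:
        return assignment
    overlaps = [[iou(g, a) for a in anchors] for g in gt_boxes]
    # Step 1: an anchor takes its best ground-truth box if the overlap is high enough.
    for j in range(len(anchors)):
        best_gt = max(range(len(gt_boxes)), key=lambda i: overlaps[i][j])
        if overlaps[best_gt][j] > threshold:
            assignment[j] = best_gt
    # Step 2: each ground-truth box claims its best anchor, even below the threshold.
    for i in range(len(gt_boxes)):
        best_anchor = max(range(len(anchors)), key=lambda j: overlaps[i][j])
        assignment[best_anchor] = i
    return assignment


# Four anchors tiling the image as a 2x2 grid.
anchors = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.0, 1.0, 0.5),
           (0.0, 0.5, 0.5, 1.0), (0.5, 0.5, 1.0, 1.0)]
# One large object in the top-left cell, one small object in the bottom-right cell.
gt_boxes = [(0.05, 0.05, 0.45, 0.5), (0.6, 0.6, 0.8, 0.8)]
print(match_anchors(gt_boxes, anchors))  # [0, None, None, 1]
```

In this sketch, matched anchors are the ones that would contribute both a class loss and a box-coordinate loss, while anchors mapped to None would only be trained to predict background.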
Rewriting it is probably the only way.
And… why do we need to know what anchor box the object is in?