Part 2 Lesson 9 wiki

Can somebody explain, or link a paper about, clusters and scales? I think the clusters are basically what Jeremy was using for the anchor boxes last night, when we started out with all of them even (1 cluster) and then added different sizes (so for YOLOv3 it would be 9 clusters). But does "scale" mean that each of these comes in a different size? In other words, would there be 27 total anchor boxes: 9 different dimensions (clusters), each at a small, medium, and large size (scale)?

From the YOLOv3 paper, last paragraph of section 2.3:

We still use k-means clustering to determine our bounding box priors. We just sort of chose 9 clusters and 3 scales arbitrarily and then divide up the clusters evenly across scales. On the COCO dataset the 9 clusters were: (10×13), (16×30), (33×23), (30×61), (62×45), (59×119), (116×90), (156×198), (373×326).
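Here's a minimal sketch of how you could compute priors like that yourself. Note the assumptions: it uses sklearn's KMeans with a plain Euclidean distance on (width, height) pairs and randomly generated boxes, whereas the YOLO authors cluster the real COCO boxes with an IoU-based distance, so this is only an approximation of their procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up (width, height) pairs standing in for ground-truth boxes;
# in practice you'd collect these from the whole training set (e.g. COCO).
np.random.seed(0)
box_wh = np.abs(np.random.randn(1000, 2)) * np.array([100, 120]) + 10

# Cluster box dimensions into k = 9 priors. The YOLO papers cluster with
# a (1 - IoU) distance rather than Euclidean, so treat this as a rough sketch.
k = 9
priors = KMeans(n_clusters=k, random_state=0).fit(box_wh).cluster_centers_

# Sort by area and split evenly across the 3 scales:
# 3 smallest priors -> finest grid, 3 largest -> coarsest grid.
priors = priors[np.argsort(priors.prod(axis=1))]
for scale, chunk in zip(["small", "medium", "large"], np.split(priors, 3)):
    print(scale, np.round(chunk).astype(int).tolist())
```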

Hey folks don’t forget this is a wiki thread - so please copy useful links over to the top post so they’re all in one central place! :slight_smile:

You can do that by building a model that predicts just one number - the amount of rotation. It's actually a great class project to try, and one that previous students have found helpful.

1 Like
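If it helps, here's a rough sketch of what such a single-number regression model could look like in plain PyTorch. The backbone choice, the L1 loss, and the RotationRegressor name are all just assumptions for illustration, not anything from the lesson notebooks.

```python
import torch
import torch.nn as nn
from torchvision import models

# A hypothetical regression model: a ResNet backbone whose final layer
# is replaced by a single continuous output for the rotation angle.
class RotationRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet34()
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 1)

    def forward(self, x):
        return self.backbone(x).squeeze(1)  # one number per image

model = RotationRegressor()
imgs = torch.randn(4, 3, 224, 224)           # dummy batch of images
angles = torch.tensor([10., -30., 0., 90.])  # dummy rotation targets (degrees)
loss = nn.functional.l1_loss(model(imgs), angles)  # plain regression loss
loss.backward()
```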

Remind me to do this next week if I forget. There should be a paper coming out on Arxiv tomorrow that discusses it.

4 Likes

I would expect so, since otherwise the gradients will be overwhelmed by the bit with the larger scale. But I haven’t tested this intuition to know how much it matters - it would be an interesting thing to experiment on and write about.

1 Like
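To make the "overwhelmed gradients" point concrete, here's a toy sketch. It assumes the question was about combining a localization term and a classification term of very different magnitudes; the tensors, names, and scaling factor are all made up, not the lesson's actual loss code.

```python
import torch
import torch.nn.functional as F

# Fake predictions/targets for 16 anchor boxes and 21 classes.
pred_boxes  = torch.randn(16, 4, requires_grad=True)
true_boxes  = torch.randn(16, 4)
pred_logits = torch.randn(16, 21, requires_grad=True)
true_labels = torch.randint(0, 21, (16,))

loc_loss  = F.l1_loss(pred_boxes, true_boxes, reduction='sum')  # large value
clas_loss = F.cross_entropy(pred_logits, true_labels)           # small value

# Unweighted sum: the gradients are dominated by the larger localization term.
(loc_loss + clas_loss).backward(retain_graph=True)
print(pred_boxes.grad.abs().mean(), pred_logits.grad.abs().mean())

# Scaling the big term brings both gradient contributions to a similar size.
pred_boxes.grad = None
pred_logits.grad = None
(loc_loss / pred_boxes.numel() + clas_loss).backward()
print(pred_boxes.grad.abs().mean(), pred_logits.grad.abs().mean())
```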

Not yet - very little of the stuff shown in part 2 has been integrated into fastai so far. It's all new material, and we're using it to help you understand all the moving parts of deep learning and its implementation.

1 Like

Is there an updated link with the first half of the livestream? I’m reviewing and can’t see it yet.

Nearly! 3 zooms × 3 aspect ratios = k = 9 anchors per grid cell. Across the 3 grid sizes, 4×4 (16) + 2×2 (4) + 1×1 (1) = 21 cells, and 21 × 9 = 189 anchor boxes.

5 Likes
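Spelling that arithmetic out in a few lines of Python (the zoom and aspect-ratio values below are just example numbers; only the counting matters here):

```python
import itertools

# 3 zooms x 3 aspect ratios = 9 anchor shapes per grid cell.
zooms  = [0.7, 1.0, 1.3]                # example zoom factors
ratios = [(1, 1), (1, 0.5), (0.5, 1)]   # example aspect ratios
k = len(list(itertools.product(zooms, ratios)))   # 9

# Three grids of 4x4, 2x2 and 1x1 cells.
grid_sizes = [4, 2, 1]
n_cells = sum(g * g for g in grid_sizes)          # 16 + 4 + 1 = 21

print(k, n_cells, k * n_cells)                    # 9 21 189
```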

I always post an edited video ~24 hours after the lecture is done. I’m working on it :slight_smile:

3 Likes

Thanks again Jeremy.

It’s a terrific paper and highly recommended. The author is, apparently, sick of the BS required to actually get published and so has decided to conspicuously do the opposite, which I think is quite awesome…

10 Likes

Let’s all do our best to be generous in our interpretations of people’s words. We’re all doing our best to figure things out here, and sometimes that means we’ll all write stuff which turns out to be not quite right - and that’s fine, because the following discussion will help resolve it.

6 Likes

It really was a fantastic read. I love this style.

I’ve updated the top post now with many of the resources recommended by you all in this thread.

I’ve now posted the edited video to the top post.

6 Likes

For some reason, the video does not load for me. All I can see is the title and a black box.

Yeah, it doesn't load for me either.

Youtube is still processing it. Should be done in ~5 mins.

edit: actually it’s taking a really long time. dunno if there’s a youtube problem. will re-upload soon if it doesn’t appear

Ah. Thanks :slight_smile:

I was wondering why we initialize our biases to -3 (then -4) in the output convolutional layers of our models. In the code, it's in the OutConv class, on this line:

self.oconv1.bias.data.zero_().add_(bias)

where bias is set as an argument of SSD_Head and SSD_MultiHead.

I'm guessing it's to help the model train at the beginning (since those biases will change anyway as SGD is applied). I've tried setting it to 0, and it's true we don't reach the same losses in the same number of epochs. So what's the reason for this?

4 Likes
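Not an authoritative answer, but here's a quick numerical look at what that bias does at the start of training: with the bias at -4, every class activation begins near sigmoid(-4) ≈ 0.018, so the model starts out predicting "mostly background" for every anchor box rather than ~0.5 confidences. The same trick appears in the RetinaNet/focal-loss paper as a prior-probability initialisation, which fits your guess that it's about stabilising early training.

```python
import torch

# How the initial class probabilities look for different bias values,
# before any training has happened.
for b in (0., -3., -4.):
    print(b, torch.sigmoid(torch.tensor(b)).item())
# 0.0 -> 0.5,  -3.0 -> ~0.047,  -4.0 -> ~0.018
```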