Part 2 Lesson 9 wiki

Anchor boxes are initially chosen by dividing the image space evenly into a grid.
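A minimal sketch of that idea (not the lesson's exact code; the helper name and the unit-square coordinates are my own choices): for an n×n grid, each anchor starts centred in its cell with the cell's height and width.

```python
import torch

# Sketch: anchor centers and sizes for an n x n grid laid evenly over the unit square.
def make_grid_anchors(n=4):
    cell = 1.0 / n                                                    # side length of one grid cell
    offsets = torch.arange(n, dtype=torch.float) * cell + cell / 2    # cell centers along one axis
    cy = offsets.repeat_interleave(n)                                 # row center for every cell
    cx = offsets.repeat(n)                                            # column center for every cell
    centers = torch.stack([cy, cx], dim=1)                            # (n*n, 2)
    sizes = torch.full((n * n, 2), cell)                              # each anchor starts as one full cell
    return torch.cat([centers, sizes], dim=1)                         # (n*n, 4) as (cy, cx, h, w)

anchors = make_grid_anchors(4)
print(anchors.shape)   # torch.Size([16, 4])
```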

What is this Excel wizardry?

Quote:

"Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales. High scoring regions of the image are considered detections.

We use a totally different approach. We apply a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities."

https://pjreddie.com/darknet/yolo/

Is there anything like a non-local receptive field, i.e. using data farther from the center pixel?

Can you elaborate on what you mean?

I see in class StdConv: can you just do super().__init__() as opposed to super(StdConv, self).__init__()?

Why is the bias value -3?

OK, I just googled it; it’s Python 3 syntax :slight_smile: cool

Where do you see that?

head_reg4 = SSD_Head(k, -3.)
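For what it’s worth, my reading of the notebook is that this -3. ends up as the initial bias of the final classification convolution inside SSD_Head, and sigmoid(-3) ≈ 0.047, so the per-anchor class probabilities start close to zero and early training isn’t dominated by spurious high-confidence detections:

```python
import torch

# sigmoid(-3) ~ 0.047: a negative initial bias keeps the post-sigmoid
# class scores small at the start of training.
print(torch.sigmoid(torch.tensor(-3.0)))   # tensor(0.0474)
```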

Just to confirm an intuition: this architecture is finding objects with non-overlapping bounding boxes?

super().__init__() only works in Python 3, but it could be done.

Yep, it’s actually the recommended form in Python 3.
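For reference, a minimal side-by-side of the two forms (the layer inside is just a placeholder, not the actual StdConv):

```python
import torch.nn as nn

class StdConvOld(nn.Module):
    def __init__(self, nin, nout):
        # explicit form: works on both Python 2 and Python 3
        super(StdConvOld, self).__init__()
        self.conv = nn.Conv2d(nin, nout, 3, stride=2, padding=1)

class StdConvNew(nn.Module):
    def __init__(self, nin, nout):
        # zero-argument form: Python 3 only, resolves to the same call
        super().__init__()
        self.conv = nn.Conv2d(nin, nout, 3, stride=2, padding=1)
```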

Yes, but my understanding is that we’d like to avoid manual tuning as much as possible (somewhat in the spirit of differentiable programming).
Maybe we could scale the inputs by putting them through a log (this will still break on huge scale differences)? However, I’m not sure whether that makes sense.
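A hypothetical sketch of what’s being discussed: the localization (L1) and classification (BCE) terms live on different scales, so you either weight one of them or transform it before summing. The function name and the fixed loc_weight here are my own, not the lesson’s code:

```python
import torch.nn.functional as F

# Combine a bbox-regression term and a classification term that live on
# different scales; a fixed weight (or a log transform, as suggested above)
# keeps one term from swamping the other.
def detection_loss(bb_pred, bb_true, clas_pred, clas_true, loc_weight=1.0):
    loc_loss = F.l1_loss(bb_pred, bb_true)                                         # bbox regression term
    clas_loss = F.binary_cross_entropy_with_logits(clas_pred, clas_true.float())   # classification term
    return loc_weight * loc_loss + clas_loss
```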

Can you explain what the variable k represents in SSD_Head and its associated modules?

We will get into more detail later. These are anchor boxes, and the network will later learn offsets to them (allowing the boxes to overlap). They are used so that you don’t have all the boxes learning the same object when there are multiple objects in the image.
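Roughly what “learning offsets to the anchors” can look like, along the lines of the lesson’s actn_to_bb (the function name and exact scaling factors here are assumptions, not the notebook’s code):

```python
import torch

def activations_to_boxes(actn, anchors, grid_size):
    """Map raw activations to boxes expressed as offsets from the anchors.

    actn:      (n_anchors, 4) raw network outputs
    anchors:   (n_anchors, 4) as (cy, cx, h, w)
    grid_size: side length of one grid cell
    """
    actn = torch.tanh(actn)                                  # squash offsets to [-1, 1]
    centers = anchors[:, :2] + actn[:, :2] * grid_size / 2   # move each center within its cell
    sizes = anchors[:, 2:] * (actn[:, 2:] / 2 + 1)           # scale h/w between 0.5x and 1.5x
    return torch.cat([centers, sizes], dim=1)
```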

But I wonder how we would change the enum stuff in the transformations and augmentations to accommodate this format of points.

Also, they’re no longer rectangles; they’re basically quadrilaterals. And our dataset isn’t labelled that way. I wonder if that would change anything. Just thinking out loud here.

Thinking about this, I have a question: does the difference in the scales of the loss functions matter?

Could you reduce the anchor box size to 1x1 to get a segmentation of the entire image?

Yes, he’ll probably discuss that later.
