Part 2 Lesson 9 wiki

In my experience, the dropout rate is just something you have to try different values for until you find one that works well for you.

It’s here.

8 Likes

You can apply some kind of hyperparameter optimization algorithm to it if you want to be systematic.
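For example, a minimal sketch of that kind of systematic search (assumed setup with dummy data and a small classification head, not code from the lesson): train the same head with a few dropout rates and keep the one with the lowest validation loss.

```python
import torch
from torch import nn
import torch.nn.functional as F

def make_head(p, n_in=512, n_out=10):
    # small classification head whose only difference is the dropout rate p
    return nn.Sequential(nn.Linear(n_in, 256), nn.ReLU(), nn.Dropout(p), nn.Linear(256, n_out))

# dummy stand-ins for real training/validation activations and labels
xb_trn, yb_trn = torch.randn(256, 512), torch.randint(0, 10, (256,))
xb_val, yb_val = torch.randn(64, 512), torch.randint(0, 10, (64,))

results = {}
for p in (0.1, 0.25, 0.5):
    model = make_head(p)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(20):                      # short training loop
        opt.zero_grad()
        F.cross_entropy(model(xb_trn), yb_trn).backward()
        opt.step()
    model.eval()
    with torch.no_grad():                    # validation loss for this dropout rate
        results[p] = F.cross_entropy(model(xb_val), yb_val).item()

best_p = min(results, key=results.get)
print(best_p, results[best_p])
```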

Can Jeremy explain use_clr=(32,5) usage in learn.fit?

3 Likes

Any reason why, in the bbox-only model, you did not use sigmoid * 224 to bound the output of the bbox prediction, but you do use it in the combined bbox and class prediction loss function?

1 Like

take a look here: http://forums.fast.ai/t/understanding-use-clr/13969

3 Likes

When designing a loss function with components of multiple kinds (e.g. L1 loss for bounding boxes and a likelihood-based loss for classes), how do we set the weights for those components without manual tuning, as Jeremy did in the lecture by multiplying one of them by 20?

4 Likes

Well, you’ll have to throw away your old model unless you adjust the weights.

So, do you just use a new model with new dropout rates every time and see which one has the best loss?

Is this concept the method behind most fully-convolutional nets? I'm thinking back to this paper: https://vision.cornell.edu/se3/wp-content/uploads/2017/07/LCDet_CVPRW.pdf

How do we know how many objects we will have in an image?

The idea is essentially to bring both losses into the same scale range. That's still manual; maybe there's a better way of doing it.
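For reference, a minimal sketch of that scaling trick (assumed shapes and names, not the lecture's exact code): the classification term is multiplied by a hand-picked factor so both terms end up in a similar range.

```python
import torch
import torch.nn.functional as F

def detn_loss(pred_bb, pred_cls, true_bb, true_cls, cls_weight=20.):
    # pred_bb: (bs, 4) raw activations -> squash into [0, 224] pixel coordinates
    bb_loss = F.l1_loss(torch.sigmoid(pred_bb) * 224, true_bb)
    # cross-entropy is much smaller than an L1 loss on 0-224 coordinates,
    # so scale it up (20 was the hand-tuned factor mentioned in the lecture)
    cls_loss = F.cross_entropy(pred_cls, true_cls)
    return bb_loss + cls_weight * cls_loss
```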

2 Likes

With stride 2, how would we come up with the channels being 4+c?

Does it matter how many objects are in the image?

Are we just mapping the remaining outputs to all 0 when there are fewer than 16 objects?

Again, please forgive my ignorance. What does YOLO stand for in this case?

How do you choose anchor boxes? Does it matter if an anchor box has only part of an object and another box has the rest of it?

You Only Look Once

1 Like

Anchor boxes are initially chosen by dividing the image into an equally spaced grid.
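Roughly, a minimal sketch of that grid construction (assuming a 4x4 grid in unit coordinates, close in spirit to the lesson notebook but not its exact code):

```python
import numpy as np

anc_grid = 4                                    # 4x4 grid -> 16 anchor boxes
anc_offset = 1 / (anc_grid * 2)                 # half a cell, so centres sit mid-cell
anc_x = np.repeat(np.linspace(anc_offset, 1 - anc_offset, anc_grid), anc_grid)
anc_y = np.tile(np.linspace(anc_offset, 1 - anc_offset, anc_grid), anc_grid)
anc_ctrs = np.stack([anc_x, anc_y], axis=1)     # (16, 2) box centres
anc_sizes = np.full((anc_grid**2, 2), 1 / anc_grid)        # (16, 2) width/height = one cell
anchors = np.concatenate([anc_ctrs, anc_sizes], axis=1)    # (16, 4) x, y, w, h
```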

2 Likes

What is this Excel wizardry?

22 Likes

Quote:

"Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales. High scoring regions of the image are considered detections.

We use a totally different approach. We apply a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities."

https://pjreddie.com/darknet/yolo/
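To make the quote concrete, a minimal sketch (assumed channel counts and grid size, not YOLO's or the lecture's exact head): a single stride-2 convolution maps a backbone feature map to a grid where each cell carries 4 box coordinates plus c class scores, which is also where the 4+c channels asked about above come from.

```python
import torch
from torch import nn

c = 20                                          # number of classes (assumption)
# one convolution applied to the whole feature map: every grid cell gets
# 4 bbox activations + c class scores in a single forward pass
head = nn.Conv2d(256, 4 + c, kernel_size=3, stride=2, padding=1)

x = torch.randn(2, 256, 8, 8)                   # dummy backbone activations
out = head(x)                                   # -> (2, 4 + c, 4, 4): one prediction per grid cell
print(out.shape)
```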

Is there anything like a non-local receptive field, i.e. using data farther from the center pixel?