Part 2 Lesson 9 wiki

Jeremy: When do we apply dropout, when do we apply batchnorm?

What’s the rule of thumb here?

4 Likes

Could we try to embed the category in a continuous space to frame this as a regression problem (when combining the category with the bounding box)?

1 Like

I think that’s the point of

nn.Linear(256, 4 + len(cats))
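i.e. the single head emits 4 + len(cats) activations: the first 4 are read as the bounding box and the rest as class scores, so no continuous embedding of the category is needed. A rough sketch of the kind of loss that splits them (modeled on the notebook’s detn_loss, scaling details omitted):

import torch.nn.functional as F

def detn_loss(input, target):
    bb_t, c_t = target                      # ground-truth bbox and class
    bb_i, c_i = input[:, :4], input[:, 4:]  # split the head's output
    # L1 regression for the box, ordinary cross-entropy classification for the category
    return F.l1_loss(bb_i, bb_t) + F.cross_entropy(c_i, c_t)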

What happens if we give it an image with no BB or Category?

1 Like

Won’t bb_i already be in the range [0, 224]?

1 Like

Any reason to use L1 for the bounding box (as opposed to others like L2, etc.)?

1 Like

Will this Sigmoid transformation be counted in the loss? How will it affect the loss?

Precisely… I thought regression == L2 norm.

The general notion is that BatchNorm has some regularizing effect, but its real purpose in the network is faster convergence and more stable training. Dropout, on the other hand, is purely for regularization.
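A quick way to see the difference in PyTorch: Dropout is only active in training mode, while BatchNorm keeps normalizing at eval time (using its running statistics):

import torch
import torch.nn as nn

drop = nn.Dropout(0.5)
x = torch.ones(4, 8)

drop.train()
print(drop(x))  # about half the activations zeroed, the rest scaled by 1/(1-p) = 2

drop.eval()
print(drop(x))  # identity: dropout does nothing outside training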

3 Likes

A quick question: by creating two data loaders, will performance be affected, e.g. by reading the same image twice? Thanks

Is it possible to constrain the bbox coordinates in some way? Say, don’t let them exceed the image dimensions. Is there any evidence of this improving performance?
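By “constrain” I mean something like this sketch (assuming the lesson’s 224-pixel input size):

import torch

def constrain_bb(raw_bb, sz=224):
    # sigmoid maps activations to (0, 1); scaling keeps the box inside the image
    return torch.sigmoid(raw_bb) * sz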

1 Like

It’s common practice to use the L1 error for bounding boxes, since the L2 error penalises large errors much more heavily.
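A small illustration with made-up numbers: one badly predicted coordinate dominates the squared error while only adding linearly to L1:

import torch
import torch.nn.functional as F

pred   = torch.tensor([[50., 60., 180., 200.]])
target = torch.tensor([[48., 62., 120., 198.]])  # third coordinate is 60 pixels off

print(F.l1_loss(pred, target))   # (2 + 2 + 60 + 2) / 4 = 16.5
print(F.mse_loss(pred, target))  # (4 + 4 + 3600 + 4) / 4 = 903.0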

6 Likes

Are those custom metrics (detn_loss and detn_accuracy) computed on the training set or the validation set? Is there a way to specify?

1 Like

I tried L2 and the boxes were pretty bad =X

2 Likes

I must have missed the “custom head” memo. Is there a section of a notebook somebody can link me to? Or a TL;DR that can be thrown in here?

1 Like

learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)

Basically, you retain everything in the CNN except for the last layers, which you replace with your own custom Sequential head.
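For reference, the head from the notebook looks roughly like this (sizes assume a resnet34 backbone at 224×224, so the flattened feature map is 7·7·512 = 25088; cats is the category list from earlier in the notebook):

import torch.nn as nn

head_reg4 = nn.Sequential(
    nn.Flatten(),                   # the notebook uses fastai's own Flatten layer
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(25088, 256),          # 7*7*512 backbone activations -> 256
    nn.ReLU(),
    nn.BatchNorm1d(256),
    nn.Dropout(0.5),
    nn.Linear(256, 4 + len(cats)),  # 4 bbox coordinates + one score per category
)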

1 Like

We’re actually just using one dataloader to load the images, the other one only gives us the Y.
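In the notebook this is done by wrapping the first data object’s dataset so each item carries the second one’s label as well, roughly:

from torch.utils.data import Dataset

class ConcatLblDataset(Dataset):
    def __init__(self, ds, y2):
        self.ds, self.y2 = ds, y2    # original dataset plus the extra labels
    def __len__(self):
        return len(self.ds)
    def __getitem__(self, i):
        x, y = self.ds[i]            # image and bbox are read once, from ds
        return (x, (y, self.y2[i]))  # attach the class label from the second data object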

This http://forums.fast.ai/t/part-2-lesson-9-in-class/14028/61?u=ananda_seelan

So we’re basically re-building a network, but keeping the last few layers of a pre-trained network?

We’re throwing away the last few layers.

We’re keeping the convolutions and filters.
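Conceptually it’s something like this hand-rolled sketch (fastai’s ConvLearner does the cutting for you; head_reg4 as sketched above):

import torch.nn as nn
from torchvision.models import resnet34

backbone = resnet34(pretrained=True)
# drop the final average pool + fully connected classifier, keep the conv layers
body = nn.Sequential(*list(backbone.children())[:-2])
model = nn.Sequential(body, head_reg4)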

2 Likes