Part 2 Lesson 9 wiki

(YangLu) #46

Jeremy: When do we apply dropout, and when do we apply batchnorm?

What’s the rule of thumb here?

(Tyler C Nisonoff) #47

Could we try to embed the category in a continuous space to frame this as a regression problem (when combining the category with the bounding box)?

(YangLu) #48

I think that’s the point of

nn.Linear(256, 4 + len(cats))

(Bart Fish) #49

What happens if we give it an image with no bounding box or category?

(Rohit Singh) #50

Won’t bb_i already be in the range [0, 224]?

Any reason to use L1 loss for the bounding box (as opposed to others, like L2)?

(Aleksandr) #52

Will this sigmoid transformation be counted in the loss? How will it affect the loss?

(Apil Tamang) #53

Precisely… I thought regression == L2 norm.

(Ananda Seelan) #54

The general notion is that BatchNorm has some sort of regularizing effect, but its actual purpose is faster convergence and making training a bit more stable. Dropout, on the other hand, is purely for regularization.
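
For illustration, a minimal sketch of a custom head that uses both, with the layer sizes being assumptions rather than the lesson's exact values:

import torch.nn as nn

# Hypothetical head: Dropout is there purely to regularize, while
# BatchNorm1d is there to speed up and stabilize training.
head = nn.Sequential(
    nn.Flatten(),
    nn.ReLU(),
    nn.Dropout(0.5),         # regularization only
    nn.Linear(25088, 256),   # 512*7*7 conv features -> 256
    nn.ReLU(),
    nn.BatchNorm1d(256),     # faster, more stable convergence
    nn.Dropout(0.5),
    nn.Linear(256, 4 + 20),  # 4 bbox coords + 20 class scores
)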

(Yihui Ray Ren) #55

A quick question: by creating two data loaders, will performance be affected, e.g. by reading the same image twice? Thanks

(Rudraksh Tuwani) #56

Is it possible to constrain the bbox coordinates in some way? Say, don’t let them exceed the image dimensions. Is there any evidence of this improving performance?
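
One way to do this (and roughly what the sigmoid transform discussed above does) is to squash the raw box activations inside the loss function and scale by the image size, so predictions can never leave [0, 224]. A sketch, where the ×20 class-loss weighting is an assumption about balancing the two terms:

import torch
import torch.nn.functional as F

def detn_loss(input, target):
    bb_t, c_t = target                      # bbox target, class target
    bb_i, c_i = input[:, :4], input[:, 4:]  # split the combined head output
    bb_i = torch.sigmoid(bb_i) * 224        # constrain coords to [0, 224]
    # L1 loss for the box plus cross-entropy for the class
    return F.l1_loss(bb_i, bb_t) + F.cross_entropy(c_i, c_t) * 20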

(Phani Srikanth) #57

It’s common practice to use the L1 error, since the L2 error heavily penalises larger errors and lets outliers dominate the loss.
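
To see the difference, compare the two penalties on a batch with a single large miss (numbers made up for illustration):

import torch
import torch.nn.functional as F

pred   = torch.tensor([10., 10., 10., 100.])  # one coordinate is way off
target = torch.tensor([10., 10., 10.,  10.])

print(F.l1_loss(pred, target))   # mean |error|   -> 22.5
print(F.mse_loss(pred, target))  # mean error**2  -> 2025.0

The squared error lets the single outlier dominate, which is why L1 tends to be more robust for box regression.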

(blake west) #58

Are those custom metrics (detn_loss and detn_accuracy) computed on the training set or the validation set? Is there a way to specify?

(Lucas Goulart Vazquez) #59

I tried L2 and the boxes were pretty bad =X

(Brian Holland) #60

I must have missed the “custom head” memo. Is there a section of a notebook somebody can link me to? Or a TL;DR that can be thrown in here?

(YangLu) #61

learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)

Basically, you retain everything in the CNN except the last layers, which you swap out for your own custom Sequential layers.
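
For example, the simplest regression-only head might look like this (a sketch: the 25088 input size assumes a ResNet-style backbone whose final feature map flattens to 512*7*7):

import torch.nn as nn

head_reg4 = nn.Sequential(
    nn.Flatten(),
    nn.Linear(25088, 4),  # flattened conv features -> 4 bbox coordinates
)
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)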

(Lucas Goulart Vazquez) #62

We’re actually just using one data loader to load the images; the other one only gives us the y.
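
Concretely, this can be done by wrapping one dataset so each item carries both targets, roughly like the lesson’s ConcatLblDataset (a sketch, not verbatim):

from torch.utils.data import Dataset

class ConcatLblDataset(Dataset):
    # Wraps a dataset so each item returns the original y plus a second label.
    def __init__(self, ds, y2):
        self.ds, self.y2 = ds, y2
    def __len__(self):
        return len(self.ds)
    def __getitem__(self, i):
        x, y = self.ds[i]            # image and bounding-box target
        return (x, (y, self.y2[i]))  # add the class label as a second target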

(Brian Holland) #64

So we’re basically re-building a network, but keeping the last few layers of a pre-trained network?

(YangLu) #65

We’re throwing away the last few layers.

We’re keeping the convolutional layers and their filters.