Part 2 Lesson 9 wiki


(YangLu) #46

Jeremy: When do we apply dropout, when do we apply batchnorm?

What’s the rule of thumb here?


(Tyler C Nisonoff) #47

Could we try to embed the category in a continuous space, to frame this as a regression problem (when combining the category with the bounding box)?


(YangLu) #48

I think that’s the point of

nn.Linear(256, 4 + len(cats))
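
The matching loss then splits that output back apart. This is roughly what the notebook’s detn_loss does (the 224 scaling and the 20x class-loss weighting are from memory, so treat them as assumptions):

import torch.nn.functional as F

def detn_loss(input, target):
    bb_t, c_t = target                    # bbox targets, class targets
    bb_i, c_i = input[:, :4], input[:, 4:]
    bb_i = F.sigmoid(bb_i) * 224          # squash bbox activations into [0, 224]
    # L1 for the box, cross-entropy for the class; the multiplier roughly
    # balances the scales of the two terms
    return F.l1_loss(bb_i, bb_t) + F.cross_entropy(c_i, c_t) * 20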


(Bart Fish) #49

What happens if we give it an image with no bounding box or category?


(Rohit Singh) #50

Won’t bb_i already be in the range [0,224] ?


#51

Any reason to use L1 for the bounding box (as opposed to others like L2, etc.)?


(Aleksandr) #52

Will this sigmoid transformation be counted in the loss? How will it affect the loss?


(Apil Tamang) #53

Precisely… I thought regression == L2 norm.


(Ananda Seelan) #54

The general notion is that BatchNorm has some sort of regularizing effect, but its actual use in a neural net is to get faster convergence and make training a bit more stable. Dropout, on the other hand, is just for regularization.
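
As a purely hypothetical illustration (not from the lesson), the two usually sit side by side in a head like this:

import torch.nn as nn

# BatchNorm for faster, more stable convergence;
# Dropout purely for regularization.
head = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.BatchNorm1d(256),
    nn.Dropout(0.5),
    nn.Linear(256, 10),
)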


(Yihui Ray Ren) #55

A quick question: by creating two data loaders, will performance be affected, e.g. by reading the same image twice? Thanks.


(Rudraksh Tuwani) #56

Is it possible to constrain the bbox coordinates in some way? Say, don’t let them exceed the image dimensions. Is there any evidence of this improving performance?


(Phani Srikanth) #57

It’s common practice to use the L1 error, since the L2 error heavily penalises larger errors.
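
A minimal PyTorch sketch of the difference: a single 10-pixel outlier contributes 10 to the absolute error but 100 to the squared error, so L2 lets a few bad boxes dominate the loss.

import torch
import torch.nn.functional as F

pred = torch.tensor([0., 0., 0., 10.])  # one 10-pixel outlier
targ = torch.zeros(4)
print(F.l1_loss(pred, targ))   # mean |error|  -> 2.5
print(F.mse_loss(pred, targ))  # mean error^2  -> 25.0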


(blake west) #58

Are those custom metrics (detn_loss and detn_accuracy) computed on the training set or the validation set? Is there a way to specify?


(Lucas Goulart Vazquez) #59

I tried L2 and the boxes were pretty bad =X


(Brian Holland) #60

I must have missed the “custom head” memo. Is there a section of a notebook somebody can link me to? Or a TL;DR that can be thrown in here?


(YangLu) #61

learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)

Basically you retain everything in the CNN except the last layers, which you swap out for your own custom Sequential head.
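
For reference, the head from the lesson notebook looks roughly like this (layer sizes are from memory, so treat them as assumptions; Flatten is a small fastai helper, and f_model, md and cats come from the notebook):

head_reg4 = nn.Sequential(
    Flatten(),                      # flatten the conv feature map
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(25088, 256),          # 25088 = 512*7*7 from the resnet body
    nn.ReLU(),
    nn.BatchNorm1d(256),
    nn.Dropout(0.5),
    nn.Linear(256, 4 + len(cats)),  # 4 bbox coords + one score per class
)
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)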


(Lucas Goulart Vazquez) #62

We’re actually just using one dataloader to load the images; the other one only gives us the y values.
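
Concretely, the notebook does that by wrapping one dataset so each item carries both labels. Roughly like this (details from memory, treat them as assumptions):

from torch.utils.data import Dataset

class ConcatLblDataset(Dataset):
    def __init__(self, ds, y2):
        self.ds, self.y2 = ds, y2      # base dataset + the extra labels
    def __len__(self):
        return len(self.ds)
    def __getitem__(self, i):
        x, y = self.ds[i]
        return (x, (y, self.y2[i]))    # image, (bbox, class) pair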


(Ananda Seelan) #63

This: http://forums.fast.ai/t/part-2-lesson-9-in-class/14028/61?u=ananda_seelan


(Brian Holland) #64

So we’re basically re-building a network, but keeping the last few layers of a pre-trained network?


(YangLu) #65

We’re throwing away the last few layers.

We’re keeping the convolutional layers and their filters.
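
In plain PyTorch terms it looks something like this (torchvision’s resnet34 is just an assumption for illustration; fastai’s ConvLearner does the cutting for you):

import torch.nn as nn
from torchvision.models import resnet34

base = resnet34(pretrained=True)
# Keep the convolutional body (the pretrained filters),
# drop the final average pool + fully connected classifier:
body = nn.Sequential(*list(base.children())[:-2])
# ...then bolt a custom head (e.g. head_reg4 above) onto `body`.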