Jeremy: When do we apply dropout, when do we apply batchnorm?
What’s the rule of thumb here?
Could we embed the category in a continuous space to frame this as a regression problem (when combining the category with the bounding box)?
I think that’s the point of
nn.Linear(256, 4 + len(cats))
What happens if we give it an image with no bounding box or category?
Won’t bb_i already be in the range [0, 224]?
any reason to use L1 for bounding box (as opposed to others like L2 etc)?
Will this sigmoid transformation be counted in the loss? How will it affect the loss?
Precisely… I thought regression == L2 norm.
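On the sigmoid question above: the idea (a sketch of the technique being discussed, not the notebook’s exact code) is that the raw linear activations for the box coordinates get squashed into (0, 1) and then scaled by the image size, so the predicted coordinates can never leave the image — and yes, because this happens before the loss is computed, the loss sees the scaled values:

```python
import math

def scaled_sigmoid(x, img_size=224):
    # squash a raw activation into (0, 1), then scale to the pixel range;
    # img_size=224 matches the resized images discussed in the thread
    return img_size / (1 + math.exp(-x))

# a large positive activation lands near the far image edge,
# a large negative one lands near 0 -- never outside [0, 224]
print(scaled_sigmoid(5.0))   # ~222.5
print(scaled_sigmoid(-5.0))  # ~1.5
print(scaled_sigmoid(0.0))   # 112.0, the image centre
```

This also answers the [0, 224] question: the *targets* are already in that range, but the raw network outputs are unbounded, which is why the squashing helps.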
The general notion is that BatchNorm has some regularizing effect as a side benefit, but its actual purpose is faster convergence and more stable training. Dropout, on the other hand, is purely for regularization.
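You can see the “purely for regularization” point directly in PyTorch: dropout only does anything in training mode, and becomes a no-op at inference time (a minimal illustration, not code from the lesson):

```python
import torch
from torch import nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()
# training mode: roughly half the activations are zeroed, and the
# survivors are scaled by 1/(1-p) = 2 to keep the expected value
print(drop(x))

drop.eval()
# eval mode: dropout is the identity -- it exists only to regularize
print(drop(x))  # all ones, unchanged
```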
A quick question: by creating two data loaders, will performance be affected, e.g. by reading the same image twice? Thanks.
Is it possible to constrain the bbox coordinates in some way? Say don’t let it exceed the image dimensions. Is there any evidence of this improving performance?
It’s common practice to use the L1 error, since the L2 error heavily penalises larger errors (so a single outlier coordinate can dominate the loss).
Are those custom metrics (detn_loss, and detn_accuracy) on the training set or the validation set? Is there a way to specify?
I tried L2 and the boxes were pretty bad =X
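A quick worked example of why L1 behaves better here (hypothetical numbers, just to show the outlier effect): with one badly-wrong coordinate, the squared error blows up while the absolute error stays moderate, so L2 training ends up dominated by the worst boxes.

```python
def l1(pred, target):
    # mean absolute error over the 4 box coordinates
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def l2(pred, target):
    # mean squared error over the 4 box coordinates
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

target = [100, 100, 150, 150]
pred   = [102,  98, 150, 250]   # last coordinate is off by 100 pixels

print(l1(pred, target))  # 26.0
print(l2(pred, target))  # 2502.0 -- the one outlier dominates completely
```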
I must have missed the “custom head” memo. Is there a section of a notebook somebody can link me to? Or a TL;DR that can be thrown in here?
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)
Basically, retain everything in the CNN except the last layers, which you swap for your custom Sequential layers.
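For anyone who missed the “custom head” part, here is a rough sketch in plain PyTorch of what a head like head_reg4 could look like (the 512×7×7 feature size and n_cats = 20 are assumptions for a ResNet34-style backbone on Pascal VOC, not the notebook’s exact numbers): flatten the conv features, then emit 4 bbox coordinates plus one score per category from a single linear stack.

```python
import torch
from torch import nn

n_cats = 20  # assumed number of categories (Pascal VOC)

# hypothetical custom head: replaces the pretrained model's classifier
head_reg4 = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512 * 7 * 7, 256),
    nn.ReLU(),
    nn.Linear(256, 4 + n_cats),  # 4 bbox coords + class scores
)

# fake batch of backbone features: 2 images, 512 channels, 7x7 grid
feats = torch.randn(2, 512, 7, 7)
out = head_reg4(feats)
print(out.shape)  # torch.Size([2, 24])
```

In the fastai call above, passing this module as custom_head is what swaps it in place of the default classifier layers.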
We’re actually just using one dataloader to load the images, the other one only gives us the Y.
So we’re basically re-building a network, but keeping the last few layers of a pre-trained network?
We’re throwing away the last few layers.
We’re keeping the convolutions and filters.