Lesson 7 in-class chat ✅

I already had fastai.vision, but I still get the error. I will check again once I am done with the lecture.

Is DenseNet covered in part 2? :slight_smile:

1 Like

Code for visualizing the loss surface of neural nets. Load your pre-trained PyTorch model and get the loss surface near the optimal parameters: https://github.com/tomgoldstein/loss-landscape

10 Likes

Absolutely love the refactoring idea. Makes so much sense

So the intuition behind the res block was that it would allow the net, in the worst case, to effectively ignore the intervening layers… but couldn’t it have done that already by setting the weights to function as an identity layer? Or is there no combination of weights that would do that?

2 Likes

Does ResNet50 use res blocks?

1 Like

Of course it could have, but as always in deep learning, if you make life easier for your model, it will learn more quickly and in a more stable manner.

4 Likes
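If it helps to see it concretely, here’s a minimal PyTorch sketch of a res block (my own simplified version, not the exact code from the lesson or the paper). The shortcut means the block only has to learn a correction on top of the identity; in the worst case it can learn to output (near) zero and simply pass x through.

```python
import torch
import torch.nn as nn

def conv_layer(ni, nf):
    # 3x3 conv + batch norm + ReLU, keeping the spatial size
    return nn.Sequential(
        nn.Conv2d(ni, nf, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(nf),
        nn.ReLU(inplace=True),
    )

class ResBlock(nn.Module):
    def __init__(self, nf):
        super().__init__()
        self.conv1 = conv_layer(nf, nf)
        self.conv2 = conv_layer(nf, nf)

    def forward(self, x):
        # y = x + F(x): the convs only have to learn the residual F, not the whole mapping
        return x + self.conv2(self.conv1(x))

x = torch.randn(1, 64, 28, 28)
print(ResBlock(64)(x).shape)  # torch.Size([1, 64, 28, 28])
```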

Figure 2 (the residual block) in the original ResNet paper is not very easy for beginners to understand.

A few weeks back, I wrote a Medium article explaining it in more detail.

Here is the link in case it’s of interest: https://medium.com/@MaheshNKhatri/resnet-block-explanation-with-a-terminology-deep-dive-989e15e3d691

6 Likes

Why do ResBlocks improve accuracy? They just add the previous layer’s input to the output of the convolution layers. Is it like adding more input data (like data augmentation, but for convolution layer outputs)?

Any pointers to learn more before I read the paper?

I didn’t really catch what’s going on with SequentialEx and x.orig. What’s the difference between that and the versions in the notebook?

5 Likes
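In case it helps, here’s a rough sketch of how I understand SequentialEx and x.orig (simplified, not the actual fastai source): the container stashes the block’s original input on the tensor as it passes through the layers, so a merge layer at the end can add it back (ResNet-style) or concatenate it (DenseNet-style).

```python
import torch
import torch.nn as nn

class SequentialExSketch(nn.Module):
    "Like nn.Sequential, but every layer can also see the block's original input via x.orig."
    def __init__(self, *layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        res = x
        for layer in self.layers:
            res.orig = x                      # stash the block's input on the tensor
            nres = layer(res)
            res.orig, nres.orig = None, None  # clean up so nothing hangs on to x
            res = nres
        return res

class MergeLayerSketch(nn.Module):
    "Merge the shortcut (x.orig) back in: add it (ResNet-style) or concat it (DenseNet-style)."
    def __init__(self, dense=False):
        super().__init__()
        self.dense = dense

    def forward(self, x):
        return torch.cat([x, x.orig], dim=1) if self.dense else x + x.orig

# hypothetical usage: a res-block-like unit built from plain layers plus a merge at the end
block = SequentialExSketch(
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1),
    MergeLayerSketch(dense=False),   # dense=True would concatenate instead of add
)
print(block(torch.randn(1, 64, 28, 28)).shape)  # torch.Size([1, 64, 28, 28])
```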

What’s the advantage of cat (DenseNet) over plus (ResNet)?

You are keeping all of the features instead of adding them.

1 Like
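A tiny illustration with made-up shapes: addition keeps the channel count fixed and mixes the features together, while concatenation keeps both sets of features untouched side by side.

```python
import torch

x = torch.randn(1, 64, 28, 28)    # incoming features
f = torch.randn(1, 64, 28, 28)    # output of some conv layers applied to x

resnet_style   = x + f                        # shape stays (1, 64, 28, 28); features get mixed
densenet_style = torch.cat([x, f], dim=1)     # shape becomes (1, 128, 28, 28); both kept as-is
```

The trade-off is that the channel count keeps growing with every concatenation, which is part of why DenseNets can be memory-hungry.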

Is there any notebook using DenseNet instead of ResNet?

1 Like

How does concatenating every layer together in a DenseNet work when the size of the image/feature maps changes through the layers?

4 Likes

Why are you using ReLU + Batch Normalization when the authors of ELU and SELU (https://arxiv.org/abs/1706.02515) have shown that these activations accomplish the work of Batch Norm, but without its performance overhead?
(My own experiments with CNNs show similar results.)
Have you noticed them not working for you?

3 Likes
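For anyone who wants to try that comparison, here’s a sketch of the swap (the function names are mine, and the SELU paper also recommends LeCun-normal init and AlphaDropout, so results will depend on following those details):

```python
import torch.nn as nn

def conv_bn_relu(ni, nf):
    # the usual block: conv -> batch norm -> ReLU
    return nn.Sequential(
        nn.Conv2d(ni, nf, 3, padding=1, bias=False),
        nn.BatchNorm2d(nf),
        nn.ReLU(inplace=True),
    )

def conv_selu(ni, nf):
    # self-normalizing alternative: no batch norm, SELU activation,
    # LeCun-normal init as the SELU paper recommends
    conv = nn.Conv2d(ni, nf, 3, padding=1)
    nn.init.kaiming_normal_(conv.weight, nonlinearity='linear')  # gain 1, i.e. LeCun normal
    return nn.Sequential(conv, nn.SELU(inplace=True))
```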

They go through the same conv layers as the rest, so they are downsampled like the rest.

Oh, I get it. Each layer’s features are carried along… brilliant!!

Intuitively, it keeps the information from the previous layer’s input without any transformation (without passing through a layer), which might be useful to the network.

2 Likes
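To make that concrete, here’s a minimal dense-block sketch (mine, not the real DenseNet code): inside a block the 3x3 convs keep the spatial size, so concatenating along the channel dimension always lines up, and the downsampling happens in separate transition layers between blocks.

```python
import torch
import torch.nn as nn

class DenseBlockSketch(nn.Module):
    def __init__(self, ni, growth_rate, n_layers):
        super().__init__()
        # each conv sees everything concatenated so far and adds `growth_rate` new channels
        self.layers = nn.ModuleList([
            nn.Conv2d(ni + i * growth_rate, growth_rate, 3, padding=1)
            for i in range(n_layers)
        ])

    def forward(self, x):
        for layer in self.layers:
            new = torch.relu(layer(x))       # same H x W as x, so the concat always works
            x = torch.cat([x, new], dim=1)   # channels grow; spatial size doesn't change
        return x

x = torch.randn(1, 16, 28, 28)
print(DenseBlockSketch(16, growth_rate=12, n_layers=4)(x).shape)  # torch.Size([1, 64, 28, 28])
```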

Wow, this technique blew my mind! Each layer’s features. Wow.

2 Likes

What do we mean by “we start with”? Does it mean we use the parameters of an already trained ResNet model?

1 Like