Lesson 7 in-class chat ✅

I already had fastai.vision, but I still get the error. I will check again once I am done with the lecture.

Is DenseNet covered in part 2? :slight_smile:

1 Like

Code for visualizing the loss surface of neural nets. Load your pre-trained PyTorch model and get the loss surface near the optimal parameters: https://github.com/tomgoldstein/loss-landscape

10 Likes

Absolutely love the refactoring idea. Makes so much sense

So the intuition behind the res block was that it would allow the net, in the worst case, to effectively ignore the intervening layers… but couldn’t it have done that already by setting the weights to function as an identity layer? Or is there no combination of weights that would do that?

2 Likes

Does ResNet50 use res blocks?

1 Like

Of course it could have, but as always in deep learning, if you make life easier for your model, it will learn more quickly and in a more stable manner.

4 Likes
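If it helps to see it concretely, here’s a minimal PyTorch sketch of a res block (my own simplified version, not the exact code from the lesson or the paper). The shortcut means the block only has to learn a correction on top of the identity; in the worst case it can learn to output (near) zero and simply pass x through.

```python
import torch
import torch.nn as nn

def conv_layer(ni, nf):
    # 3x3 conv + batch norm + ReLU, keeping the spatial size
    return nn.Sequential(
        nn.Conv2d(ni, nf, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(nf),
        nn.ReLU(inplace=True),
    )

class ResBlock(nn.Module):
    def __init__(self, nf):
        super().__init__()
        self.conv1 = conv_layer(nf, nf)
        self.conv2 = conv_layer(nf, nf)

    def forward(self, x):
        # y = x + F(x): the convs only have to learn the residual F, not the whole mapping
        return x + self.conv2(self.conv1(x))

x = torch.randn(1, 64, 28, 28)
print(ResBlock(64)(x).shape)  # torch.Size([1, 64, 28, 28])
```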

Figure 2 (the residual block) in the original ResNet paper is not very easy for beginners to understand.

A few weeks back, I wrote a Medium article explaining it in more detail.

Here is the link in case it’s of interest: https://medium.com/@MaheshNKhatri/resnet-block-explanation-with-a-terminology-deep-dive-989e15e3d691

6 Likes

Why do ResBlocks improve accuracy? They just add the previous layer’s input to the output of the convolution layers. Is it like adding more input data (like data augmentation, but for convolution layer outputs)?

Any pointers to learn more before I read the paper?

I didn’t really catch what’s going on with SequentialEx and x.orig. What’s the difference between that and the versions in the notebook?

5 Likes
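In case it helps, here’s a rough sketch of how I understand SequentialEx and x.orig (simplified, not the actual fastai source): the container stashes the block’s original input on the tensor as it passes through the layers, so a merge layer at the end can add it back (ResNet-style) or concatenate it (DenseNet-style).

```python
import torch
import torch.nn as nn

class SequentialExSketch(nn.Module):
    "Like nn.Sequential, but every layer can also see the block's original input via x.orig."
    def __init__(self, *layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        res = x
        for layer in self.layers:
            res.orig = x                      # stash the block's input on the tensor
            nres = layer(res)
            res.orig, nres.orig = None, None  # clean up so nothing hangs on to x
            res = nres
        return res

class MergeLayerSketch(nn.Module):
    "Merge the shortcut (x.orig) back in: add it (ResNet-style) or concat it (DenseNet-style)."
    def __init__(self, dense=False):
        super().__init__()
        self.dense = dense

    def forward(self, x):
        return torch.cat([x, x.orig], dim=1) if self.dense else x + x.orig

# hypothetical usage: a res-block-like unit built from plain layers plus a merge at the end
block = SequentialExSketch(
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1),
    MergeLayerSketch(dense=False),   # dense=True would concatenate instead of add
)
print(block(torch.randn(1, 64, 28, 28)).shape)  # torch.Size([1, 64, 28, 28])
```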

What’s the advantage of cat (DenseNet) over plus (ResNet)?

You are keeping all of the features instead of adding them.

1 Like
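A tiny illustration with made-up shapes: addition keeps the channel count fixed and mixes the features together, while concatenation keeps both sets of features untouched side by side.

```python
import torch

x = torch.randn(1, 64, 28, 28)    # incoming features
f = torch.randn(1, 64, 28, 28)    # output of some conv layers applied to x

resnet_style   = x + f                        # shape stays (1, 64, 28, 28); features get mixed
densenet_style = torch.cat([x, f], dim=1)     # shape becomes (1, 128, 28, 28); both kept as-is
```

The trade-off is that the channel count keeps growing with every concatenation, which is part of why DenseNets can be memory-hungry.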

Is there any notebook using DenseNet instead of ResNet?

1 Like

How does concatenating every layer together in a DenseNet work when the size of the image/feature maps changes through the layers?

4 Likes

Why are you using ReLU + Batch Normalization when the authors of ELU and SELU (https://arxiv.org/abs/1706.02515) have shown that these activations accomplish the work of Batch Norm, but without its performance overhead?
(My own experiments with CNNs show similar results.)
Have you noticed them not working for you?

3 Likes
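For anyone who wants to try that comparison, here’s a sketch of the swap (the function names are mine, and the SELU paper also recommends LeCun-normal init and AlphaDropout, so results will depend on following those details):

```python
import torch.nn as nn

def conv_bn_relu(ni, nf):
    # the usual block: conv -> batch norm -> ReLU
    return nn.Sequential(
        nn.Conv2d(ni, nf, 3, padding=1, bias=False),
        nn.BatchNorm2d(nf),
        nn.ReLU(inplace=True),
    )

def conv_selu(ni, nf):
    # self-normalizing alternative: no batch norm, SELU activation,
    # LeCun-normal init as the SELU paper recommends
    conv = nn.Conv2d(ni, nf, 3, padding=1)
    nn.init.kaiming_normal_(conv.weight, nonlinearity='linear')  # gain 1, i.e. LeCun normal
    return nn.Sequential(conv, nn.SELU(inplace=True))
```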

They go through the same conv layers as the rest, so they are downsampled like the rest.

Oh, I get it. Each layer’s features are carried along… brilliant!!

Intuitively, it keeps the information from the previous layer’s input without any transformation (without passing through a layer), which might be useful to the network.

2 Likes
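To make that concrete, here’s a minimal dense-block sketch (mine, not the real DenseNet code): inside a block the 3x3 convs keep the spatial size, so concatenating along the channel dimension always lines up, and the downsampling happens in separate transition layers between blocks.

```python
import torch
import torch.nn as nn

class DenseBlockSketch(nn.Module):
    def __init__(self, ni, growth_rate, n_layers):
        super().__init__()
        # each conv sees everything concatenated so far and adds `growth_rate` new channels
        self.layers = nn.ModuleList([
            nn.Conv2d(ni + i * growth_rate, growth_rate, 3, padding=1)
            for i in range(n_layers)
        ])

    def forward(self, x):
        for layer in self.layers:
            new = torch.relu(layer(x))       # same H x W as x, so the concat always works
            x = torch.cat([x, new], dim=1)   # channels grow; spatial size doesn't change
        return x

x = torch.randn(1, 16, 28, 28)
print(DenseBlockSketch(16, growth_rate=12, n_layers=4)(x).shape)  # torch.Size([1, 64, 28, 28])
```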

Wow, this technique blew my mind! Each layer’s features. Wow.

2 Likes

What do we mean by “we start with”? Does it mean we use the parameters of an already trained ResNet model?

1 Like