Part 2 Lesson 12 wiki

(Rachel Thomas) #1

Ask your questions here. This is a wiki.

<<< Wiki: Lesson 11 | Wiki: Lesson 13 >>>



Timeline (incomplete)

  • (0:00:00) GANs
  • (0:01:15) Medicine language model
  • (0:03:00) Ethical issues
  • (0:04:35) Should I share my interesting ideas?
  • (0:05:28) Talking about CIFAR 10
  • (0:07:22) About notebook
  • (0:08:05) About CIFAR10 dataset
  • (0:09:07) Exercise for understanding CIFAR-10 mean and standard deviation
  • (0:09:50) Batch size, transforms, padding (reflection padding instead of black padding)
  • (0:11:07) Architecture
  • (0:11:52) Architecture - stacked, hierarchy of layers
  • (0:12:42) Leaky relu
  • (0:14:06) Sequential instead of Pytorch module
  • (0:14:44) Resnet Hierarchy
  • (0:16:55) Number of channels
  • (0:17:55) In place true Leaky ReLU, in place in conv layers, Dropouts, activation, arithmetic
  • (0:19:53) Bias set to false in conv layers
  • (0:21:12) conv layer padding
  • (0:22:13) bottleneck in conv layer
  • (0:26:10) Wide resnet
  • (0:28:25) papers that talk about architectures
  • (0:29:45) SeLU
  • (0:31:40) Darknet module
  • (0:35:00) Adaptive average pooling
  • (0:37:34) Dawn bench test
  • (0:38:00) Parameters of python script with AWS p3
  • (0:40:37) Adaptive average pooling explanation
  • (0:58:40) Discriminative GAN code
  • (0:59:50) data required for GAN - no answer
  • (1:01:10) huge speed up reason - Nvidia GPU
  • (1:03:40) Neural Net - inputs and outputs
  • (1:05:30) Discriminator if generator was present
  • (1:06:05) Generator - prior (random numbers)
  • (1:07:55) BatchNorm before relu order
  • (1:09:40) Back to generator - DeConvolution
  • (1:10:27) Deconv in Excel
  • (1:13:45) Discriminator for fake news
  • (1:16:02) Conv Transpose 2D
  • (1:16:37) Theano website example animation
  • (1:18:20) Upsample vs conv transpose 2D
  • (1:22:30) DeConv series to make it bigger and bigger

Lesson Index
About the Part 2 & Alumni category

Why is bias usually (like in resnet) set to False in conv_layer?

(Kaitlin Duck Sherwood) #15

Why LeakyReLU instead of SELU?

(Kevin Bird) #16

Do you have a link to SELU?

(Armineh Nourbakhsh) #17

Maybe because of batchnorm? I vaguely remember from Andrew Ng’s class that the beta parameter in batchnorm practically replaces bias.
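That's the usual explanation. A toy numpy sketch (not the actual fastai code) shows why: batchnorm subtracts the per-channel mean, so any constant bias the conv adds is cancelled, and batchnorm's own learnable shift (beta) takes over the role of the bias.

```python
import numpy as np

# Minimal batchnorm: normalize each channel over the batch, then scale/shift.
def batchnorm(x, eps=1e-5, gamma=1.0, beta=0.0):
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
acts = rng.normal(size=(8, 4))          # pretend conv outputs: 8 samples, 4 channels
bias = rng.normal(size=4)               # a per-channel bias the conv might add

out_no_bias = batchnorm(acts)
out_with_bias = batchnorm(acts + bias)  # the bias is removed by mean subtraction

print(np.allclose(out_no_bias, out_with_bias))  # True: the conv bias had no effect
```

So when a conv is immediately followed by batchnorm, `bias=False` saves parameters without losing any expressiveness.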

(Hamel Husain) #18

Why is inplace=True in the Leaky-Relu?

(Brian Holland) #19

@rachel Is there any benefit to drawing the architecture? Like as in a line-drawing?

Are there good packages that can illustrate a pre-defined architecture?

(Kaitlin Duck Sherwood) #20

Here’s a blog on SELU:

I guess another way of asking the question is, “SELU looked really hot in the paper which came out, but I notice that you don’t use it. What’s your opinion on SELU?”


Ah, thank you, makes sense! It’d be great if we could get a confirmation from @jeremy
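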

(Daniel Hunter) #22

Good call!

(Daniel Hunter) #23

Sorry – where’s the link?

(Brian Holland) #24

Isn’t a small network that just narrows and re-expands called an “encapsulation network”? @rachel?

(Kevin Bird) #25

Is the reason for squishing it down and expanding the same idea as U-Net?

(Adrien Lucas Ecoffet) #26

Would be curious about Jeremy’s answer, but here’s my take: the SELU paper focuses on standard dense neural nets as opposed to CNNs or RNNs. I am not 100% sure why but I did try SELU on CNNs once and got poor results, so it’s possible that it simply doesn’t work that well on CNNs. Why? No idea…
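For reference, SELU is just a scaled ELU with two fixed constants chosen (in Klambauer et al., 2017) so that activations self-normalize toward mean 0 / variance 1 in deep fully connected nets — the paper's analysis assumes dense layers, which may be why results on CNNs are mixed. A quick numpy sketch comparing it with the leaky ReLU used in the lesson's conv layers:

```python
import numpy as np

# Constants from the SELU paper (Klambauer et al., 2017).
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # Scaled ELU: linear (scaled) for x > 0, scaled exponential below 0.
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

def leaky_relu(x, slope=0.1):
    # The leaky ReLU variant used in darknet-style conv layers.
    return np.where(x > 0, x, slope * x)

x = np.array([-2.0, -0.5, 0.0, 1.0])
print(selu(x))
print(leaky_relu(x))
```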

(Ravi Jain) #27

So this should be applicable to simple fully connected linear layers as well, right?

(Patrick Mccaffrey) #28

Why does ks//2 have 2 forward slashes?

(Debashish Panigrahi) #29

// converts to int

(Kevin Bird) #30

Python 3 has two division operators: a single slash for true division and a double slash for “floor” division (rounds down to the nearest whole number). True division always returns a float, even when both operands are integers; floor division returns an int for int operands and a float for float operands. (In Python 2, `/` on two integers performed floor division, which is where the confusion often comes from.)
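A quick sketch of the two operators under Python 3 (the version fastai uses), ending with the `ks // 2` padding idiom from the notebook:

```python
# / is always true division (returns a float); // is floor division
# (rounds toward negative infinity, result type follows the operands).
print(7 / 2)     # 3.5
print(7 // 2)    # 3
print(-7 // 2)   # -4  (floors, does not truncate toward zero)
print(7.0 // 2)  # 3.0 (floor division on floats returns a float)

ks = 3
padding = ks // 2  # common conv-padding idiom: 1 for a 3x3 kernel
print(padding)     # 1
```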

(Bart Fish) #31

integer division

(Ravi Jain) #33

Integer division!