Lesson 3 In-Class Discussion

guthl · November 14, 2017, 3:42am

@zaoyang It is not so much that the data is lost as that it is compressed. Deep learning is about reducing the noise in the data and keeping its essential characteristics.
There is a lot of loss happening in the max pooling layer. This is something people do not like, especially Hinton (one of the godfathers of deep learning). New research is coming in to solve that.
But we keep using max pooling because it just works in practice.
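
To make the "loss" in max pooling concrete, here is a minimal sketch in plain Python. The grid values and the function name `max_pool_2x2` are made up for illustration; real libraries implement this differently, but the idea is the same: each 2x2 block is reduced to its largest value, so three out of every four activations are discarded.

```python
def max_pool_2x2(grid):
    """Keep only the max of each non-overlapping 2x2 block of a 2D list."""
    pooled = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[0]) - 1, 2):
            block = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(max(block))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 activation map (16 values).
activations = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]

pooled = max_pool_2x2(activations)
print(pooled)  # [[4, 2], [2, 7]] : 4 values survive, 12 are dropped
```

The position of each maximum inside its block is also thrown away, which is one part of the information loss being discussed here.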

CTaylor · November 14, 2017, 3:44am

Can we please pick our own seats next time? I can’t see with where my group is placed.

anandsaha · November 14, 2017, 3:45am

It’s actually weighted. It’s the 3x3 subset coming from your layer weighted by your filter. It’s the weighted sum. (was that your question?)
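
As a small sketch of that weighted sum (plain Python, with a made-up 4x4 image and a made-up filter named `top_edge`; neither is taken from the class spreadsheet), one output activation is just the 3x3 patch multiplied element by element with the filter and added up:

```python
def conv_at(image, kernel, top, left):
    """Weighted sum of the 3x3 patch of `image` starting at (top, left)."""
    total = 0.0
    for i in range(3):
        for j in range(3):
            total += image[top + i][left + j] * kernel[i][j]
    return total


# Hypothetical image: a bright horizontal band on a dark background.
image = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

# Hypothetical filter that responds where dark pixels sit above bright ones.
top_edge = [
    [-1, -1, -1],
    [ 1,  1,  1],
    [ 0,  0,  0],
]

on_edge = conv_at(image, top_edge, 0, 0)    # patch covers the dark-to-bright edge
inside = conv_at(image, top_edge, 1, 0)     # patch starts inside the bright band
print(on_edge, inside)  # 3.0 0.0
```

Sliding the same filter over every 3x3 position gives the full output layer for that filter.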

Chris_Palmer · November 14, 2017, 3:47am

Regarding the spreadsheet - I have a different set of values (see below) in filters 1/1 and 2/1, and then in the conv2 layer, I guess the spreadsheet might be out of sync?

lgvaz · November 14, 2017, 3:50am

Any good resources for the fully convolutional layers?

zpnc · November 14, 2017, 3:51am

@jeremy you’ve briefly mentioned that you used a network pretrained on 3-channel input on a 4-channel input (RGB+NIR), by adding an additional dimension to all pretrained filters (initialized to 0 or perhaps random). Did you then ‘unfreeze’ these additional new filter weights along with the ones for RGB channels and train them?
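
For readers following along, here is a rough sketch of the filter extension described in this question, in plain Python with nested lists. It assumes filters are stored as `filters[output][input_channel][row][col]`; the helper names `add_input_channel` and `filter_response` and all numbers are hypothetical, and this is not presented as what was actually done in the lecture. It only shows that a zero-initialized fourth channel leaves the pretrained response unchanged until that channel is trained.

```python
def add_input_channel(filters, fill=0.0):
    """Return a copy of `filters` with one extra input channel set to `fill`."""
    extended = []
    for filt in filters:
        height, width = len(filt[0]), len(filt[0][0])
        new_kernel = [[fill] * width for _ in range(height)]
        copied = [[row[:] for row in kernel] for kernel in filt]
        extended.append(copied + [new_kernel])
    return extended


def filter_response(filt, patch):
    """Weighted sum of one filter over a patch shaped [channel][row][col]."""
    return sum(w * x
               for kernel, plane in zip(filt, patch)
               for w_row, x_row in zip(kernel, plane)
               for w, x in zip(w_row, x_row))


# One toy "pretrained" filter with 3 input channels and 2x2 kernels.
rgb_filters = [[
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0]],
]]
rgb_patch = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
nir_plane = [[9.0, 9.0], [9.0, 9.0]]  # made-up near-infrared values

rgbn_filters = add_input_channel(rgb_filters)  # zero-initialized 4th channel
rgb_response = filter_response(rgb_filters[0], rgb_patch)
rgbn_response = filter_response(rgbn_filters[0], rgb_patch + [nir_plane])
print(rgb_response, rgbn_response)  # 7.0 7.0
```

With a random `fill` instead of zero, the two responses would differ from the start, which is the trade-off the question hints at.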

loldja · November 14, 2017, 3:52am

Still muted… @jeremy

…Is it just me? Who is the TA I should @ ?

pete.condon · November 14, 2017, 3:56am

We’re getting sound

memetzgz · November 14, 2017, 3:56am

perhaps just you @loldja, I can hear him

loldja · November 14, 2017, 3:57am

Hm ok thanks guys.

Got it. I feel like my video breaks every time we go on break, lol

zaoyang · November 14, 2017, 4:00am

Okay, but there are a lot of layers and they all “learn” different things. If you prune data that is irrelevant for one layer, it could be essential for another layer. So I guess the question is: how can each layer learn “essential data” independently when the data flows through the layers in a dependent way?

pete.condon · November 14, 2017, 4:01am

That’s what SGD is for, the model is learning what is important to produce a good result.
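
A toy illustration of that point, assuming a made-up two-"layer" model `pred = w2 * (w1 * x)` and a made-up target `y = 3x` (all names and numbers here are for illustration only). Neither weight is trained on its own: each SGD step computes one loss at the output and pushes its gradient back through both weights, so the layers adapt to each other.

```python
import random

random.seed(0)

w1, w2 = 0.5, 0.5          # one weight per "layer"
lr = 0.05                  # learning rate (arbitrary choice)
data = [(x, 3.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]  # target: y = 3x


def mean_loss():
    """Mean squared error of the two-layer model over the toy dataset."""
    return sum((w2 * w1 * x - y) ** 2 for x, y in data) / len(data)


loss_before = mean_loss()

for step in range(500):
    x, y = random.choice(data)       # "stochastic": one random example per step
    hidden = w1 * x                  # layer 1
    pred = w2 * hidden               # layer 2
    err = pred - y
    grad_w2 = 2 * err * hidden       # d(loss)/d(w2)
    grad_w1 = 2 * err * w2 * x       # d(loss)/d(w1), passes through layer 2
    w1 -= lr * grad_w1
    w2 -= lr * grad_w2

loss_after = mean_loss()
print(loss_before, loss_after, w1 * w2)
```

Note that the gradient for `w1` contains `w2`: what the first layer should keep depends on what the later layer does with it.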

kcturgutlu · November 14, 2017, 4:07am

Where is the planet data?

pete.condon · November 14, 2017, 4:07am

jenna · November 14, 2017, 4:08am

If you’re like me and can’t remember what SGD means, it’s stochastic gradient descent.

johnnyv · November 14, 2017, 4:08am

Is there a way to specify that the cloudy, clear labels are a softmax type, while the cover labels should be done with a different activation?

Even · November 14, 2017, 4:09am

Does anyone remember (or can we get a reminder of) the commands within ipython to see a function’s input parameters and to see its source? I was looking for that part of the previous lectures and couldn’t find it.

KevinB · November 14, 2017, 4:10am

Shift + Tab will give the input parameters, ??command_name will give you the source, and Shift + Tab 3 times will give you the documentation.
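
Outside the notebook shortcuts, roughly the same information is available from Python's standard `inspect` module. A small sketch, using a made-up function `resize` as the thing being inspected:

```python
import inspect


def resize(image, size=224, keep_aspect=True):
    """Hypothetical helper: return `image` scaled to `size` pixels."""
    return image


params = str(inspect.signature(resize))  # the input parameters
doc = inspect.getdoc(resize)             # the docstring

print(params)  # (image, size=224, keep_aspect=True)
print(doc)     # Hypothetical helper: return `image` scaled to `size` pixels.
```

`inspect.getsource(resize)` returns the source text as well, as long as the function was defined somewhere Python can read the source from (a file or a notebook cell).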

trusttheai · November 14, 2017, 4:17am

What is the activation function used for multi-label classification?
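
As background for this question, a minimal sketch contrasting the two usual choices, in plain Python with made-up scores for three labels. Softmax makes the outputs compete and sum to 1, which suits picking exactly one class. A per-label sigmoid scores each label independently, so several labels can be "on" at once, which is why it is the common choice for multi-label problems. The 0.5 threshold below is an arbitrary assumption.

```python
import math


def softmax(logits):
    """Turn scores into probabilities that sum to 1 (one winner)."""
    m = max(logits)                          # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Squash each score independently into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


logits = [2.0, 1.5, -3.0]   # hypothetical raw scores for three labels

single = softmax(logits)    # probabilities compete with each other
multi = sigmoid(logits)     # each label is judged on its own
predicted = [p > 0.5 for p in multi]

print(single)
print(multi)
print(predicted)  # [True, True, False]
```

With softmax, the two strong labels have to share the probability mass; with sigmoid, both can score close to 1.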

Sree · November 14, 2017, 4:17am

My image_models notebook is a little different from what Jeremy is explaining now. I did a git pull but it shows the same as I had before.