Why is yhat a result of mse? I thought yhat was the output of lin2, and mse was only a way of measuring how good our predictor is.
and those propagate back to outside the function? (so, classes are passed by reference?)
What exactly is .clone() doing?
That's Python for you: most objects are mutable.
I've run into this elsewhere in Python (I think it was scikit-learn, but I can't remember specifically now) and was totally confused by the transposition. I still don't understand it, so if someone can explain, that'd be awesome.
The model has the layers referenced in its constructor. If we mutate a referenced object, we can access the value we've added to it (as .g) from within the main class (the model).
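A tiny illustration of that by-reference behaviour (toy names like attach_grad are just for illustration, not the lesson's code): an attribute attached to a tensor inside a function is visible outside it, because both names refer to the same object.

```python
import torch

def attach_grad(t):
    # mutate the tensor object we were handed by adding an attribute
    t.g = torch.ones_like(t)

x = torch.randn(4)
attach_grad(x)
print(hasattr(x, 'g'))  # True: the change made inside the function is visible outside
```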
From what I read, it looks like the idea is to head toward a language that integrates deep learning / numerical computation / differentiable programming. So the idea is that Swift will grow out of the "… for TF" part, I think. Anyone who knows more, please chime in.
It's copying your tensor during assignment. Otherwise it would just copy the reference to that tensor and not the actual object.
Just to confirm, the ".clamp_min(0.)-0.5" is part of the tweaking?
I believe so, yes
I hadn't noticed this fixup initialization paper until Jeremy mentioned it today and it looks very interesting. I haven't fully grasped that paper yet, but does anyone know if some of these ideas or related ideas could apply to RNNs as well? Are there any other ways to improve over the LSTM/GRU (or antisymmetric RNN)?
Are __call__ and __init__ completely independent? For example, if I set a variable in __init__, would it be available in __call__?
Yes, regular ReLU doesn't have the -0.5
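For anyone comparing, a minimal sketch of the two activations (the function names here are just illustrative, not from the notebook):

```python
import torch

def relu(x):
    # plain ReLU: clamp negatives to zero
    return x.clamp_min(0.)

def shifted_relu(x):
    # the tweaked version discussed above: shift down by 0.5
    # so the mean of the activations stays closer to zero
    return x.clamp_min(0.) - 0.5

x = torch.randn(5)
print(relu(x), shifted_relu(x))
```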
So it's like copy.deepcopy()?
But for PyTorch's tensors.
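A quick sketch of the difference between aliasing and .clone() (toy example, not from the lesson):

```python
import torch

w = torch.randn(3)

alias = w          # same object, just another name
copy  = w.clone()  # new tensor with its own storage

w.add_(1.)                    # in-place modification of w
print(alias is w)             # True  -> the alias follows the change
print(torch.equal(copy, w))   # False -> the clone kept the old values
```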
What's the intuition behind it?
Rewatch the video tomorrow, Jeremy explained it
__init__ is called when you instantiate the object (which you would have to do before calling it), so anything you set on self there is available later in __call__.
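A minimal example of that relationship (the Scaler class is made up for illustration):

```python
class Scaler():
    def __init__(self, factor):
        # runs once, when the instance is created
        self.factor = factor

    def __call__(self, x):
        # runs each time the instance is called like a function;
        # attributes set in __init__ are reachable through self
        return x * self.factor

scale = Scaler(2.)   # __init__ runs here
print(scale(3.))     # __call__ runs here -> 6.0
```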
Module should have bwd(), shouldn't it?
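For context, a sketch along the lines of the lesson's Module refactor, written from memory, so treat the exact names and details as approximate: __call__ runs forward() and stashes the inputs and output, and backward() dispatches to the subclass's bwd(), which fills in the .g gradients.

```python
import torch

class Module():
    def __call__(self, *args):
        self.args = args
        self.out = self.forward(*args)
        return self.out

    def forward(self, *args):
        raise NotImplementedError

    def backward(self):
        # each subclass implements bwd(), which writes the .g gradients
        self.bwd(self.out, *self.args)

class Relu(Module):
    def forward(self, inp):
        return inp.clamp_min(0.) - 0.5

    def bwd(self, out, inp):
        inp.g = (inp > 0.).float() * out.g
```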
Shouldn't einsum be written as: ib,bj->ij (not bi,bj->ij)? Was that a typo?
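For reference, a small einsum matmul check (generic index letters, not necessarily the ones used in the notebook): the repeated letter is the one summed over, and it has to pair the second axis of the first operand with the first axis of the second.

```python
import torch

a = torch.randn(5, 3)
b = torch.randn(3, 4)

# matrix multiply: the shared index 'k' is summed over,
# pairing the inner dimensions of the two operands
c = torch.einsum('ik,kj->ij', a, b)

print(torch.allclose(c, a @ b))  # True
```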