Lesson 9 Discussion & Wiki (2019)

(Rachel Thomas) #1

Please use this thread to discuss lesson 9. Since this is Part 2, feel free to ask more advanced or slightly tangential questions - although if your question is not related to the lesson much at all, please use a different topic. Note that this is a forum wiki thread, so you all can edit this post to add/change/organize info to help make it better!

Thread for general chit chat (we won’t be monitoring this).

Lesson resources

Errata

  • In nb 02b_initializing.ipynb, a few minor corrections were made where a variance and a std had been mixed up. Thanks to Aman Madaan for pointing those out.
  • In 03_minibatch_training.ipynb, there is a small error in the Optimizer() class. The step() method should be:
    def step(self):
        with torch.no_grad():
            for p in self.params: p -= p.grad * self.lr

instead of

    def step(self):
        with torch.no_grad():
            for p in self.params: p -= p.grad * lr

(missing self. on lr in the update formula – the buggy version still worked because lr had been defined as a global variable earlier in the notebook.)
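
For context, here is a minimal sketch of the class with the fix applied (the lr default and the zero_grad() body are illustrative; see 03_minibatch_training.ipynb for the exact code):

    import torch

    class Optimizer():
        def __init__(self, params, lr=0.5): self.params, self.lr = list(params), lr

        def step(self):
            # use the learning rate stored on the instance, not a global
            with torch.no_grad():
                for p in self.params: p -= p.grad * self.lr

        def zero_grad(self):
            for p in self.params: p.grad.data.zero_()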

Things mentioned in the lesson

Papers

continued from last time:

Other helpful resources

“Assigned” Homework

Lesson notes

Fastai Deep Learning From the Foundations TWiML Study Group

15 Likes

2019 Part 2 Lessons, Links and Updates
(Theodoros Galanos) #16

This lecture deserves a How to Train your [Dragon] Model poster, someone more artistic than me please make it :slight_smile:

8 Likes

(WG) #18

What is the refactoring process by which these nb_XX.py Python files (and their code, which is sometimes duplicated) get turned into things like fastai.text.data and so forth?

1 Like

#19

The refactoring happens in the notebooks. We only turned them into a library when they looked nice and cosy.

1 Like

(WG) #20

Is that process going to be covered as well?

0 Likes

(Ilia) #21

I believe it is kind of covered right now =)

0 Likes

#22

It’s what Jeremy does in each of those notebooks. Later in this lesson, you’ll see a training loop and a CallbackHandler that are even better than what we have inside the library.
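
Rough sketch of the idea (not the lecture’s exact code): each stage of the batch loop asks the callback handler whether to proceed, so you can change training behaviour without rewriting the loop itself.

    class Callback():
        # default callbacks allow every stage to proceed
        def begin_batch(self, xb, yb): return True
        def after_loss(self, loss):    return True
        def after_backward(self):      return True
        def after_step(self):          return True

    class CallbackHandler():
        def __init__(self, cbs=None): self.cbs = cbs or []
        def begin_batch(self, xb, yb): return all(cb.begin_batch(xb, yb) for cb in self.cbs)
        def after_loss(self, loss):    return all(cb.after_loss(loss) for cb in self.cbs)
        def after_backward(self):      return all(cb.after_backward() for cb in self.cbs)
        def after_step(self):          return all(cb.after_step() for cb in self.cbs)

    def one_batch(xb, yb, model, loss_func, opt, cb):
        if not cb.begin_batch(xb, yb): return
        loss = loss_func(model(xb), yb)
        if not cb.after_loss(loss): return
        loss.backward()
        if cb.after_backward(): opt.step()
        if cb.after_step(): opt.zero_grad()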

4 Likes

(unknown) #23

I don’t understand: wouldn’t it make more sense for the “leak” value to be the negative slope, rather than the slope?

0 Likes

(Walter Wiggins) #24

The slope of the left-hand side of the leaky ReLU is still positive.

2 Likes

#25

Yes, I didn’t catch Jeremy saying minus, but if he did say minus, it was just a mistake. The slope on the negative side is still positive.
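
A quick check of the convention in PyTorch: the “leak” (negative_slope) is a positive number that simply scales the negative inputs.

    import torch
    import torch.nn.functional as F

    x = torch.tensor([-2.0, 3.0])
    # f(x) = x if x >= 0 else negative_slope * x, with negative_slope > 0
    y = F.leaky_relu(x, negative_slope=0.1)   # tensor([-0.2000,  3.0000])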

0 Likes

#26

Which notebook did Jeremy say we’re following? 02a…?

0 Likes

#27

Yes, that’s the one.

0 Likes

(alando) #28

What is gain for again?

1 Like

#30

It’s the multiplier you have to use to take the slope of your leaky ReLU into account when initializing your layers.
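
For a leaky ReLU with negative slope a, that gain is sqrt(2 / (1 + a²)); PyTorch exposes it as calculate_gain, and the kaiming_* init functions take the slope through their a argument (the slope value below is just illustrative).

    import math
    import torch
    from torch import nn

    a = 0.1                                          # illustrative negative slope
    gain = nn.init.calculate_gain('leaky_relu', a)   # sqrt(2 / (1 + a**2))
    assert abs(gain - math.sqrt(2 / (1 + a**2))) < 1e-6

    w = torch.empty(50, 100)
    nn.init.kaiming_normal_(w, a=a, nonlinearity='leaky_relu')
    # resulting std is roughly gain / sqrt(fan_in) = gain / sqrt(100)
    print(w.std().item(), gain / math.sqrt(100))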

3 Likes

(Ilia) #31

Here is the notebook about sqrt(5) discussed in the lecture.

1 Like

(alando) #32

According to Kaiming initialization?

1 Like

#33

Yes, it’s the formula in the paper we mentioned last week.

1 Like

(alando) #34

great, thanks for clarifying :+1:

0 Likes

(nok) #35

Why a uniform instead of a normal distribution here?

4 Likes

#36

That’s the default PyTorch init.
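
Concretely, in the PyTorch version used in the course, nn.Linear and nn.Conv2d reset their weights with kaiming_uniform_ and a=math.sqrt(5) (the sqrt(5) discussed above), so the question is really about that default. A small illustration:

    import math
    import torch
    from torch import nn

    lin = nn.Linear(100, 50)

    # reproduce the default weight init used by Linear/Conv layers:
    # Kaiming *uniform* with a=sqrt(5)
    w = torch.empty(50, 100)
    nn.init.kaiming_uniform_(w, a=math.sqrt(5))

    print(lin.weight.std().item(), w.std().item())   # similar magnitudes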

1 Like