Lesson 9 Discussion & Wiki (2019)

Please use this thread to discuss lesson 9. Since this is Part 2, feel free to ask more advanced or slightly tangential questions - although if your question is not related to the lesson much at all, please use a different topic. Note that this is a forum wiki thread, so you all can edit this post to add/change/organize info to help make it better!

Thread for general chit chat (we won’t be monitoring this).

Lesson resources

Errata

  • In nb 02b_initializing.ipynb, a few minor corrections were made where a variance and a std were mixed up. Thanks to Aman Madaan for pointing those out.
  • In 03_minibatch_training.ipynb, there is a small error in the Optimizer() class. The step() method should be:
    def step(self):
        with torch.no_grad():
            for p in self.params: p -= p.grad * self.lr

instead of

    def step(self):
        with torch.no_grad():
            for p in self.params: p -= p.grad * lr

(The self. prefix was missing from the learning rate in the update formula; the original version still worked only because lr had been declared as a global variable earlier.)
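
For context, here is a minimal sketch of how the corrected step() fits into the rest of the class; the constructor and zero_grad() below are an approximation of what the notebook defines, not a verbatim copy:

    import torch

    class Optimizer():
        def __init__(self, params, lr=0.5):
            # keep the parameters and learning rate on the instance,
            # so step() no longer relies on a global lr
            self.params, self.lr = list(params), lr

        def step(self):
            with torch.no_grad():
                for p in self.params: p -= p.grad * self.lr

        def zero_grad(self):
            for p in self.params: p.grad.data.zero_()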

Papers

Notes and other resources

Talks and blog posts

15 Likes

This lecture deserves a “How to Train Your [Dragon] Model” poster; someone more artistic than me, please make it :slight_smile:

9 Likes

What is the refactoring process by which these nb_XX.py Python files (and their code, which is sometimes duplicated) get turned into things like fastai.text.data and so forth?

1 Like

The refactoring happens in the notebooks. We only turned them into a library when they looked nice and cosy.

1 Like

Is that process going to be covered as well?

I believe it is kind of covered right now =)

It’s what Jeremy does in each of those notebooks. Later in this lesson, you’ll see a training loop and a CallbackHandler that are even better than what we have inside the library.

4 Likes

I don’t understand: wouldn’t it make more sense for the “leak” value to be the negative slope, rather than the slope?

The slope of the left-hand side of the leaky ReLU is still positive.

2 Likes

Yes, I didn’t catch Jeremy saying minus, but if he said minus, it was just a mistake. The slope on the negative side is still positive.
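
To make that concrete, here is a small sketch (the 0.01 slope is just an illustrative value): the negative inputs are multiplied by a small positive number, not flipped in sign.

    import torch
    import torch.nn.functional as F

    x = torch.tensor([-2.0, -1.0, 1.0, 2.0])
    # negative_slope is a small positive multiplier applied to negative inputs
    print(F.leaky_relu(x, negative_slope=0.01))
    # tensor([-0.0200, -0.0100,  1.0000,  2.0000])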

Which notebook did Jeremy say we’re following? 02a…?

Yes, it’s the one.

What is gain for again?

2 Likes

It’s the multiplier you have to use to take into account the slope of your leaky ReLU when initializing your layers.
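
For a leaky ReLU with negative slope a, that gain is sqrt(2 / (1 + a^2)). A quick sketch checking the formula against PyTorch’s built-in helper (the 0.01 slope is just an example value):

    import math
    import torch.nn.init as init

    a = 0.01  # negative slope of the leaky ReLU
    manual = math.sqrt(2.0 / (1 + a**2))            # gain from the Kaiming formula
    builtin = init.calculate_gain('leaky_relu', a)  # PyTorch's helper
    print(manual, builtin)                          # both ≈ 1.4141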

4 Likes

Here is the notebook about sqrt(5) discussed in the lecture.

1 Like

According to the Kaiming initialization?

1 Like

Yes, it’s the formula in the paper we mentioned last week.

1 Like

great, thanks for clarifying :+1:

Why a uniform instead of a normal distribution here?

4 Likes

That’s the default PyTorch init.
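
For reference, that default is kaiming_uniform_ with a=sqrt(5) for nn.Linear (and nn.Conv2d), which is where the sqrt(5) in the notebook above comes from. A minimal sketch reproducing it by hand (the layer sizes are arbitrary):

    import math
    import torch
    import torch.nn as nn

    lin = nn.Linear(100, 50)
    # re-apply PyTorch's default weight init by hand: kaiming_uniform_ with a=sqrt(5)
    w = torch.empty(50, 100)
    nn.init.kaiming_uniform_(w, a=math.sqrt(5))
    print(lin.weight.std().item(), w.std().item())  # similar magnitudes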

1 Like