Lesson 7 - Official topic

sgugger · April 28, 2020, 5:25pm

Note: This is a wiki post - feel free to edit to add links from the lesson or other useful info.

Resources

Edited video (private)
fastbook chapter 8
fastbook chapter 8 questionnaire solutions - feel free to contribute!
fastbook chapter 9
fastbook chapter 9 questionnaire solutions - feel free to contribute!
Non-beginner discussion

Links from lesson

fill me with interesting stuff

Other useful links

Notes by @Lankinen

ilovescience · April 29, 2020, 1:33am

Second to last lesson

rachel · April 29, 2020, 1:37am

check here if audio is clear
check here if it is not

Edit: Thank you all, it seems the audio is clear for most people

pinaki · April 29, 2020, 1:41am

It seems like both weight decay and learning rate can be used to address overfitting. How do we use them together? And how do we find the optimal values of both when they are used in the same equation?

sgugger · April 29, 2020, 1:41am

The learning rate doesn’t do any regularization; it only sets the size of your updates.
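
To make the distinction concrete, here is a minimal plain-Python sketch of one SGD step with weight decay. It is not taken from the lesson notebook; the function name `sgd_step` and the toy numbers are made up for illustration, and it assumes the simple form where `wd * w` is added to the gradient.

```python
def sgd_step(weights, grads, lr, wd=0.0):
    """One plain SGD step (hypothetical helper, for illustration only).

    wd adds wd * w to each gradient, which is the regularization part;
    lr only scales the size of the resulting update.
    """
    return [w - lr * (g + wd * w) for w, g in zip(weights, grads)]

weights = [1.0, -2.0]
grads = [0.0, 0.0]  # zero loss gradient, to isolate the effect of wd

# Without weight decay nothing moves, whatever the learning rate is.
no_decay = sgd_step(weights, grads, lr=0.1, wd=0.0)

# With weight decay every weight is pulled toward zero;
# lr then decides how big that pull is per step.
with_decay = sgd_step(weights, grads, lr=0.1, wd=0.1)
```

In this toy setting `no_decay` equals the starting weights, while `with_decay` has each weight slightly smaller in magnitude.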

Raymond-Wu · April 29, 2020, 1:42am

They’re separate hyperparameters, so you can specify them together. I don’t know how to find the optimal value for weight decay, but for the learning rate you can use learn.lr_find().

pinaki · April 29, 2020, 1:43am

Is the learning rate only there to reduce the number of epochs needed to reach optimal results?

jwuphysics · April 29, 2020, 1:43am

Try re-running the learning rate finder with different wd parameters (see code). For example, try values such as 1e-5, 1e-4, 1e-3, 1e-2, and 1e-1.

sgugger · April 29, 2020, 1:44am

No. If you don’t have a proper value, you won’t train at all. Look again at the past lessons to refresh your memory; this was covered in chapters 4 and 5.

harish3110 · April 29, 2020, 1:45am

How do we set the weight decay hyperparameter? Can we use something like random search or grid search, or is there a better way to set it?

sgugger · April 29, 2020, 1:45am

We haven’t found a principled way yet, so trying various values is still the best solution.
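
A minimal sketch of what "trying various values" can look like, using a toy linear regression in plain Python rather than a fastai Learner. Everything here (the helpers `train_linear`, `mse`, `make_data`, the synthetic data, the fixed learning rate) is an assumption made for illustration; with a real model you would train once per wd value and compare validation metrics in the same way.

```python
import random

def train_linear(xs, ys, lr, wd, epochs=200):
    """Fit y ~ w*x + b by gradient descent, with weight decay applied to w."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * (grad_w + wd * w)
        b -= lr * grad_b
    return w, b

def mse(w, b, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)  # fixed seed so the sweep is repeatable

def make_data(n):
    """Synthetic data: y = 2x plus a little Gaussian noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

train_x, train_y = make_data(20)
valid_x, valid_y = make_data(20)

# Sweep a handful of weight decay values and keep the validation loss of each.
wds = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
results = {}
for wd in wds:
    w, b = train_linear(train_x, train_y, lr=0.1, wd=wd)
    results[wd] = mse(w, b, valid_x, valid_y)

best_wd = min(results, key=results.get)
```

Which value wins depends on the data and the model, so the point is the loop and the comparison on held-out data, not any particular result.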

dcooper01 · April 29, 2020, 1:45am

What’s the advantage of creating our own embedding layer over the stock PyTorch one? I think I missed that

chengwliu · April 29, 2020, 1:47am

What’s the difference between PyTorch’s nn.Module and fastai2’s Module?

sgugger · April 29, 2020, 1:47am

fastai’s Module removes the need to call super().__init__(), which you need to call at each nn.Module init.
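
To illustrate the difference, here is a plain-Python sketch with no PyTorch or fastai dependency. `Base` stands in for nn.Module and `AutoBase` for fastai’s Module; both names, and the metaclass `AutoInitMeta`, are hypothetical. It shows one way such behavior can be obtained and is not fastai’s actual source.

```python
class Base:
    """Stand-in for nn.Module: its __init__ must run before anything is registered."""
    def __init__(self):
        self._params = {}

    def register(self, name, value):
        self._params[name] = value

class AutoInitMeta(type):
    """Hypothetical metaclass that runs Base.__init__ before the subclass __init__."""
    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls)
        Base.__init__(obj)             # the call the user would otherwise write
        obj.__init__(*args, **kwargs)  # then the user's own __init__
        return obj

class AutoBase(Base, metaclass=AutoInitMeta):
    """Stand-in for a Module-like base class that needs no explicit super call."""

class Manual(Base):
    def __init__(self, n):
        super().__init__()  # required, otherwise self._params does not exist
        self.register("n", n)

class Auto(AutoBase):
    def __init__(self, n):
        self.register("n", n)  # no super().__init__() needed

manual = Manual(3)
auto = Auto(3)
```

Both objects end up in the same state; the only difference is who is responsible for calling the base initializer.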

init_27 · April 29, 2020, 1:49am

This is wrong. Shawshank Redemption should be on top!

dcooper01 · April 29, 2020, 1:50am

It’s ALMOST as good as Lawnmower Man 2

jwuphysics · April 29, 2020, 1:50am

Ah, E.T. My favorite romance film

zmd · April 29, 2020, 1:51am

How does the sample size of the ratings affect our learned bias ranking?

barnacl · April 29, 2020, 1:51am

Did we cover how n_factors is selected?