How can you know if your architecture is working with just small models?
Just a reminder to like each other's questions (as a way of upvoting) if there is a question you want to hear asked out loud.
In the training process, does an oscillation of the accuracy going up and down between epochs indicate bad training, or is that normal? Assuming the validation loss and training loss are also dropping…
If you are struggling with the questionnaire or want to check your solutions, check the community-written solutions over here
If you do it without using the necessary tricks (that are automatically done for you when you call to_fp16), yes. If you do it carefully (or let fastai do it carefully for you), no.
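For reference, here is a minimal sketch (my own example, assuming fastai v2 and a small vision dataset, not code from this thread) of how mixed precision gets turned on; fastai handles the loss-scaling tricks behind the scenes:

from fastai.vision.all import *

# small sample dataset just for illustration; any DataLoaders would do
dls = ImageDataLoaders.from_folder(untar_data(URLs.MNIST_SAMPLE))

# to_fp16() switches the Learner to mixed-precision training;
# to_fp32() would switch it back
learn = cnn_learner(dls, resnet18, metrics=accuracy).to_fp16()
learn.fit_one_cycle(1)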
I thought the learning rate variation was applied to earlier layers in the network, not earlier epochs?
Somebody please correct me if I'm mistaken.
I think using smaller models is just to make sure everything except your architecture is working correctly.
We can use the same learning rate whether we use to_fp16 or not, correct?
Dumb question: taking a simple_net from the previous MNIST example, is a ResNet-18 just a

simple_net = nn.Sequential(
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 1)
)

repeated 18 times, per se, or do the linear functions change and have an effect on the training?
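Not an answer from the thread, but a quick way to check for yourself (assuming torchvision is installed): printing ResNet-18's top-level children shows it is built from convolutional blocks with changing channel sizes, not from one repeated stack of Linear layers:

from torchvision.models import resnet18

model = resnet18()
for name, module in model.named_children():
    # conv1, bn1, relu, maxpool, layer1..layer4 (Sequentials of BasicBlocks), avgpool, fc
    print(name, module.__class__.__name__)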
Is there a way to adjust the learning rate to be more sensitive to class imbalance in our dataset?
No, that's not ideal. You want your accuracy to be improving, not bouncing around. It's usually the sign of a too-high learning rate.
Here is the learning rate finder paper:
Yes, that usually does not change the ideal learning rate (if in doubt, use the learning rate finder to be sure).
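As a concrete sketch (assuming a fastai Learner called learn, as in the earlier example): lr_find runs a mock training over a range of learning rates and plots the loss against them, so you can pick a value just before the loss starts climbing:

lrs = learn.lr_find()   # plots loss vs. learning rate
# in recent fastai versions lrs.valley holds the suggested value;
# otherwise just read a good value off the plot
learn.fit_one_cycle(3, lrs.valley)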
Yes, you're right! I meant to say layer, not epoch.
All other things being the same, is accuracy reduced when you use half precision/fp16?
That needs to be addressed in your data (by oversampling) or in your loss function (by penalizing some classes more). The learning rate can't really do anything about that.
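To make that concrete, here is a hedged sketch (my addition, plain PyTorch) of both options; the numbers and dataset are hypothetical:

import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

# 1) Penalize the rare class more via per-class weights in the loss function.
class_counts = [900, 100]                                  # hypothetical 2-class counts
weights = torch.tensor([sum(class_counts) / c for c in class_counts])
loss_func = nn.CrossEntropyLoss(weight=weights)            # rarer class gets a larger weight

# 2) Oversample the rare class so each batch is roughly balanced.
labels = [0] * 900 + [1] * 100                             # hypothetical dataset labels
sample_weights = [1.0 / class_counts[y] for y in labels]
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
# dl = DataLoader(your_dataset, batch_size=64, sampler=sampler)  # your_dataset is assumed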
Sometimes fp16 can even slightly increase your accuracy.
In my experience, it is normal. Also, sometimes the losses go up for one epoch and then start going down again. It might be the optimizer moving past a local minimum, for example: it needs to go down into a valley and then back up before it can start going down into the next valley (hopefully a deeper one).
If to_fp16() makes the training faster and improves the overall accuracy (slight regularisation), then we should keep it as a default.
Question from YT chat: Should you use fp16 training with GTX-series GPUs (1080 Ti, etc.)?