Lesson 6 - Official topic

Does dropping floating point number precision (switching from FP32 to FP16) have an impact on final loss?

4 Likes

Previously, it was suggested that using .to_fp16() actually decreased error rate? Am I misremembering that, or has fastai’s thinking changed?

I think it was something to do with the time spent optimizing numbers that didn’t make a difference in the end?

How can you know if your architecture is working with just small models?

6 Likes

Just a reminder to like :heart: each other’s questions (as a way of upvoting) if there is a question you want to hear asked out loud

3 Likes

In the training process, does the accuracy oscillating up and down between epochs indicate bad training, or is that normal? Assuming the validation loss and training loss are also dropping…

If you are struggling with the questionnaire or want to check your solutions, check the community-written solutions over here

2 Likes

If you do it without the necessary tricks (which are applied automatically for you when you use to_fp16), yes. If you do it carefully (or let fastai do it carefully for you), no.

1 Like
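For concreteness, here is a minimal sketch of letting fastai handle it for you. The pets dataset and learner settings are just illustrative, adapted from the standard fastai quick-start example:

from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# to_fp16() adds fastai's MixedPrecision callback, which takes care of
# details like loss scaling so training in half precision doesn't hurt
# the final loss.
learn = cnn_learner(dls, resnet34, metrics=error_rate).to_fp16()
learn.fine_tune(1)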

I thought the learning rate variation was applied to earlier layers in the network, not earlier epochs?
Somebody please correct me if I’m mistaken. :slight_smile:

I think using smaller models is just to make sure everything except your architecture is working correctly.
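One common sanity check along those lines (a generic PyTorch sketch, not a specific fastai recipe) is to confirm the model can overfit a tiny batch; if it can't drive the loss close to zero, something other than model size is broken:

import torch
import torch.nn as nn

# Tiny fake dataset: if the pipeline and model are wired up correctly,
# the model should be able to memorize these few examples.
x = torch.randn(16, 28*28)
y = torch.randint(0, 10, (16,))

model = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print(loss.item())  # should be close to zero if everything else works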

We can use the same learning rate whether we use to_fp16 or not, correct?

Dumb question: taking the simple_net from the previous MNIST notebook, is a ResNet-18 just

simple_net = nn.Sequential(
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 1)
)

stacked 18 times, or do the linear functions change and have an effect on the training?
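For reference, a ResNet-18 is built from convolutional residual blocks with skip connections rather than a repeated linear stack. A simplified sketch of one such block (loosely modeled on torchvision's BasicBlock, not the exact implementation):

import torch.nn as nn

class BasicBlock(nn.Module):
    """One residual block: two 3x3 convs plus a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # the skip connection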

Is there a way to adjust learning rate to be more sensitive to class imbalance in our dataset?

2 Likes

No, that’s not ideal. You want your accuracy to be improving, not bouncing around. It’s usually the sign of a learning rate that is too high.

4 Likes

Here is the learning rate finder paper:

5 Likes
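In fastai the technique from that paper is available as lr_find(); a minimal sketch, assuming a learner built like the one in the fp16 example above:

# lr_find() runs a mock training pass while increasing the learning rate
# exponentially, recording the loss at each step, and plots the result so
# you can pick a learning rate where the loss is still clearly decreasing.
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.lr_find()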

Yes, that usually does not change the ideal learning rate (if in doubt, use a learning rate finder to be sure).

1 Like

Yes, you’re right! I meant to say layer, not epoch. :slight_smile:

All other things being equal, is accuracy reduced when you use half precision/fp16?

That needs to be addressed in your data (by oversampling) or in your loss function (by penalizing some classes more). The learning rate can’t really do anything about that.

1 Like
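A minimal sketch of the loss-function side of that, in plain PyTorch (the class counts here are made up for illustration; fastai lets you pass a custom loss_func to a Learner):

import torch
import torch.nn as nn

# Hypothetical counts for a 3-class, imbalanced dataset.
class_counts = torch.tensor([1000., 100., 10.])

# Weight each class inversely to its frequency so mistakes on
# rare classes are penalized more heavily.
weights = class_counts.sum() / class_counts
loss_fn = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 3)            # stand-in for model outputs
targets = torch.randint(0, 3, (8,))   # stand-in for labels
print(loss_fn(logits, targets))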

fp16 increases your accuracy.

In my experience, it is normal. Also, sometimes the losses go up for one epoch and then start going down again. It might be the optimizer moving past a local minimum, for example: it needs to go down into a valley and then climb back up before it can start going down into the next valley (hopefully a deeper one).

1 Like