Lesson 2 In-Class Discussion ✅

cstorm125 · October 31, 2018, 3:03am

Once you unfreeze and retrain with one cycle again, if your training loss is still higher than your validation loss (likely underfitting), do you retrain it unfrozen again (which will technically be more than one cycle), or do you redo everything with more epochs in the cycle?

Mauro · October 31, 2018, 3:04am

Your training loss being lower than your validation loss is normal behavior. You are likely overfitting.

edwardjross · October 31, 2018, 3:05am

I’ve often found that when running fit_one_cycle over multiple epochs, the accuracy and valid_error would start off terrible (worse than at the end of the previous call of fit_one_cycle!) in the first epoch and get better over subsequent epochs. That makes me think they’re not the same.

cstorm125 · October 31, 2018, 3:05am

Sorry typo. I mean the other way around.

rachel · October 31, 2018, 3:05am

I will ask the 3 highly voted questions when Jeremy finishes this explanation

weiwei · October 31, 2018, 3:05am

Why do we need to fit_one_cycle with a certain learning rate before unfreezing? Why don’t we unfreeze directly instead?

angelinayy · October 31, 2018, 3:05am

With human-assigned labels, different people may use different criteria for assigning them. As a result, some labels may look “mislabeled” (not necessarily wrong, they could just reflect different opinions), which may confuse the model and hurt its accuracy. What should I do with those cases if I don’t want to delete them?

rachel · October 31, 2018, 3:06am

Being able to correct incorrect labels is a feature we plan to add in the future.

Jess · October 31, 2018, 3:07am

“Matrix multiplication” = “dot product”?

wdhorton · October 31, 2018, 3:07am

I think the explanation for this is that the optimizer resets, and it’s using momentum, so by the end of the previous training it had a better “idea” of which direction to head, whereas when you restart it starts off by heading in a less optimal direction.
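
To make the momentum hypothesis above concrete, here is a minimal sketch in plain Python of SGD with momentum on a toy 1-D loss. The function `sgd_momentum`, the loss `w**2`, and all hyperparameters are illustrative assumptions, not fastai's optimizer, and the sketch does not show whether fastai actually resets its state between calls. It only shows that a fresh velocity buffer starts at zero, so the first step after a reset depends on the current gradient alone, while a continued run also carries the gradient history.

```python
def sgd_momentum(w, velocity, lr=0.1, mom=0.9, steps=5):
    """Minimise the toy loss f(w) = w**2 with SGD plus momentum.

    Returns the final weight, the final velocity, and the size of each step.
    """
    step_sizes = []
    for _ in range(steps):
        grad = 2.0 * w                      # derivative of w**2
        velocity = mom * velocity + grad    # accumulate past gradients
        w = w - lr * velocity
        step_sizes.append(abs(lr * velocity))
    return w, velocity, step_sizes


# First training run: velocity builds up over the steps.
w, v, _ = sgd_momentum(w=5.0, velocity=0.0, steps=3)

# Take one more step, either keeping the velocity or starting from zero.
_, _, kept = sgd_momentum(w, velocity=v, steps=1)
_, _, reset = sgd_momentum(w, velocity=0.0, steps=1)

print("step with kept velocity: ", kept[0])
print("step with reset velocity:", reset[0])
```

The two printed step sizes differ because only the first one includes the accumulated history.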

marcmuc · October 31, 2018, 3:08am

No, absolutely not.
The cycle in one cycle works like this: the full move up and down stretches across the total number of epochs. So if you do one epoch and then run it again, it will do the “triangle” twice. @edwardjross, this also explains why the learning rate at the beginning is higher than at the end of the previous cycle!

Sorry, @sandmann, I somehow replied to the wrong question; this was meant as an answer to you.
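
The “triangle” point in this post can be sketched in a few lines of plain Python. This is a simplified, symmetric linear schedule; `triangle_lr`, `lr_max`, `div`, and `iters_per_epoch` are hypothetical names and values chosen for illustration, and the library's actual schedule shape may differ. The sketch only shows that one call over two epochs traces a single triangle, while two calls of one epoch each trace it twice.

```python
def triangle_lr(n_iters, lr_max=1e-3, div=10.0):
    """Simplified one-cycle schedule: linear warm-up from lr_max/div to lr_max
    over the first half of the iterations, then linear decay over the rest."""
    lr_min = lr_max / div
    half = n_iters // 2
    lrs = []
    for i in range(n_iters):
        if i < half:
            frac = i / half                            # rises from 0 towards 1
        else:
            frac = 1 - (i - half) / (n_iters - half)   # falls from 1 towards 0
        lrs.append(lr_min + frac * (lr_max - lr_min))
    return lrs


def count_peaks(lrs, lr_max=1e-3):
    """Count how many iterations run at the maximum learning rate."""
    return sum(1 for lr in lrs if abs(lr - lr_max) < 1e-12)


iters_per_epoch = 10  # hypothetical number of batches per epoch

# One call over 2 epochs: a single triangle across all 20 iterations.
one_call = triangle_lr(2 * iters_per_epoch)

# Two calls of 1 epoch each: the triangle is traced twice.
two_calls = triangle_lr(iters_per_epoch) + triangle_lr(iters_per_epoch)

print(count_peaks(one_call), count_peaks(two_calls))
```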

dhananjay014 · October 31, 2018, 3:08am

Also, for metrics, you can use F1 score instead of error rate… As @wdhorton mentioned, sampling should be done based on the class distribution.
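
As a rough illustration of why F1 can be more informative than error rate on imbalanced classes, here is a small plain-Python sketch. The helpers `error_rate` and `f1` and the 95/5 label split are made up for this example; they are not fastai's metric functions.

```python
def error_rate(y_true, y_pred):
    """Fraction of predictions that do not match the true label."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def f1(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# A "model" that always predicts the majority class.
always_negative = [0] * 100

print(error_rate(y_true, always_negative))  # low error despite missing every positive
print(f1(y_true, always_negative))          # 0.0
```

The always-negative predictor looks good on error rate but scores zero on F1 because it never finds a positive example.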

nikhil.ikhar · October 31, 2018, 3:08am

I never thought that a simple equation like y = mx + c could be written as a matrix dot product.
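
A minimal plain-Python sketch of that idea: put each x next to a constant 1 in a row, and the line y = mx + c becomes a matrix-vector product where every output is the dot product of one row with the coefficient vector. The helper `matvec` and the values of `m`, `c`, and `xs` are chosen only for illustration.

```python
def matvec(X, a):
    """Multiply matrix X (a list of rows) by vector a: one dot product per row."""
    return [sum(x_ij * a_j for x_ij, a_j in zip(row, a)) for row in X]


m, c = 3.0, 2.0              # illustrative slope and intercept
xs = [0.0, 1.0, 2.0]

X = [[x, 1.0] for x in xs]   # the column of ones carries the intercept
a = [m, c]                   # coefficient vector

ys = matvec(X, a)
direct = [m * x + c for x in xs]

print(ys)      # [2.0, 5.0, 8.0]
print(direct)  # [2.0, 5.0, 8.0]
```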

marcmuc · October 31, 2018, 3:09am

Not only the momentum, but also the LR itself!

cstorm125 · October 31, 2018, 3:09am

Sorry, I meant underfitting; I just made a mistake about the losses. Sorry, morning with no coffee over here.

miwojc · October 31, 2018, 3:10am

Question: the image size (the size parameter in ImageDataBunch) is set to 224. Would higher-resolution images give a lower error (at the cost of more time and a smaller bs)? Or does the image resolution need to be 224 for resnet34?

dotkay · October 31, 2018, 3:10am

Is there a finder for the number of epochs, like the LR finder?

Slonik · October 31, 2018, 3:10am

For the imbalanced classes question: would you balance your validation set, or check on an unbalanced validation set?

gjohn · October 31, 2018, 3:11am

Jeremy’s answer on handling unbalanced data, e.g. 200 real bears and 50 teddy bears, was “just try it, it always just works fine”… Is that only when starting with a well-trained net, or also when starting fresh?

aidan.davis · October 31, 2018, 3:11am

If consciousness arises from complex enough data processing, at what point do you give up on ML training and obsess over AI sentience?