Lesson 2 In-Class Discussion ✅

I think the explanation for this is that the optimizer resets, and it’s using momentum, so by the end of the previous training it had a better “idea” of which direction to head, whereas when you restart it starts off by heading in a less optimal direction.

3 Likes

No, absolutely not.
The one-cycle policy follows this structure: the full move up and down of the learning rate is spread across the number of epochs. So if you do one epoch and then run it again, it will do the “triangle” twice. @edwardjross, this also explains why the learning rate at the beginning of a new run is higher than at the end of the previous cycle!

[image: plot of the one-cycle learning-rate schedule]
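
To make the shape concrete, here is a minimal sketch in plain NumPy/Matplotlib (not the fastai internals; one_cycle_lr and its parameters are my own illustrative names) showing how two one-epoch runs produce two full triangles:

import numpy as np
import matplotlib.pyplot as plt

# Rough sketch of a one-cycle schedule: the LR warms up to max_lr over
# the first part of the run, then anneals back down; the whole cycle is
# stretched across all iterations of that run.
def one_cycle_lr(n_iter, max_lr=1e-3, pct_start=0.3):
    warm = int(n_iter * pct_start)
    lo = max_lr / 10
    up = lo + (max_lr - lo) * (1 - np.cos(np.pi * np.arange(warm) / warm)) / 2
    down = max_lr * (1 + np.cos(np.pi * np.arange(n_iter - warm) / (n_iter - warm))) / 2
    return np.concatenate([up, down])

# Two separate one-epoch runs => two triangles, so a restart begins
# at a higher LR than the end of the previous cycle.
lrs = np.concatenate([one_cycle_lr(100), one_cycle_lr(100)])
plt.plot(lrs)
plt.xlabel("iteration"); plt.ylabel("learning rate")
plt.show()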

Sorry @sandmann, I somehow answered the wrong question; this was in answer to you.

4 Likes

Also, for metrics, you can use the F1 score instead of error rate… and as @wdhorton mentioned, sampling should be done based on the class distribution.
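
As a minimal, self-contained sketch (using scikit-learn rather than fastai’s built-in FBeta metric; the labels are toy data), F1 is just the harmonic mean of precision and recall:

from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(f1_score(y_true, y_pred))  # ~0.67: harmonic mean of precision and recall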

Never thought that a simple equation like y = mx + c could be written as a matrix dot product.
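
For anyone else who found this surprising, a tiny NumPy sketch (variable names are my own): fold the bias c into the weight vector by appending a constant 1 to the input.

import numpy as np

# y = mx + c as a dot product: the bias c becomes just another weight.
m, c = 2.0, 3.0
x = np.array([5.0, 1.0])   # [x, 1]
w = np.array([m, c])       # [m, c]
print(x @ w)               # 2*5 + 3*1 = 13.0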

2 Likes

not only the momentum, but also the LR itself!

Sorry, I meant underfitting; I just made a mistake about the losses. Sorry, morning with no coffee over here.

Question: about the image size, the size parameter in ImageDataBunch that is set to 224. Is it better, i.e. will you get a lower error, when images are at a higher resolution (it will take more time, and bs will need to be smaller)? Or does the image resolution need to be 224 for resnet34?
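
For reference, a minimal sketch of where that parameter sits (fastai v1 course API; the path and values here are placeholders, and resnet34 does not strictly require 224, though its pretrained weights were trained at that size):

from fastai.vision import *

path = Path('data/bears')  # hypothetical dataset folder
# size resizes every image to 299px; larger sizes generally give lower
# error but train more slowly and need a smaller batch size
data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), size=299, bs=32)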

2 Likes

Is there a number-of-epochs finder, like the LR finder?

2 Likes

For the imbalanced classes question: would you balance your validation set, or check on the unbalanced validation set?

3 Likes

Jeremy’s answer on handling unbalanced data, e.g. 200 real bears and 50 teddy bears, was “just try it, it always just works fine”… is that only when starting with a well-trained net, or also when starting fresh?

2 Likes

If consciousness arises from complex enough data processing, at what point do you give up on ML training and obsess over AI sentience?

2 Likes

The model isn’t evenly trained. There’s the resnet backbone, which has been extensively trained on all of Imagenet, and the head, which we add on for our classification purpose and is entirely untrained.

If you trained the entire model at once, large errors from the untrained head could back-propagate through the model and mess up your nicely pretrained weights.

Training with the backbone frozen allows us to train only the untrained layers in the head. Once those layers have converged somewhat, we unfreeze the entire model and continue training.
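
In code, that looks roughly like this (fastai v1 course API; data is an ImageDataBunch as elsewhere in the lesson, and the epoch counts and LRs are placeholders):

from fastai.vision import *

learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)    # backbone frozen by default: only the head trains
learn.unfreeze()          # make the whole model trainable
learn.fit_one_cycle(2, max_lr=slice(1e-6, 1e-4))  # lower LRs for early layers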

12 Likes

It’s OK, I just saw what you meant. It was a good question.

“The full move up and down of the learning rate is spread across the number of epochs.”
Thanks a lot - that’s important to understand!

1 Like

When you first access models.resnet34 you get a progress bar while it downloads the model to disk, which is the thing that confused me.

2 Likes

If you don’t get the URL download file after running the snippet below, check your ad blocker.

// Collect the original image URLs from a Google Images results page
urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou);
// Open the URL list as a downloadable CSV file
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));

4 Likes

Those are the pretrained models. Here, during inference, you’ll load your own model.

2 Likes

learn.be_patient()

19 Likes

I am interested: what is the right practice in fastai to load a set of videos and sample e.g. every 3 seconds, to form a series of images? That way I could apply Resnet34 (time-distributed) on each image and an LSTM on the next layer, to build a video classifier.
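
The frame-sampling part, at least, is straightforward outside fastai; a minimal sketch with OpenCV (sample_frames and its parameters are my own names, not a fastai API):

import cv2

# Grab one frame every `every_sec` seconds from a video file
def sample_frames(path, every_sec=3):
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    step = max(1, int(fps * every_sec))
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % step == 0:
            frames.append(frame)   # BGR ndarray, ready to save or transform
        i += 1
    cap.release()
    return frames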

1 Like

For unbalanced data: what do you do if the class you care most about is a rare class? An example is identifying skin lesions, where the most common benign class is far more frequent than melanoma.

2 Likes
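
One common remedy for that situation, sketched here as a general technique rather than as the thread’s answer, is to weight the loss so that mistakes on the rare class cost more:

import torch
import torch.nn as nn

# Illustrative weights: benign (class 0) vs. melanoma (class 1).
# The 10x weight on melanoma makes missing it 10x as costly.
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0]))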