Lesson 2 In-Class Discussion ✅

For me, it seems to require CMD-OPTION-C on macOS (Safari 12.0) not CMD-OPTION-J to pull up the JavaScript Console.

How do you address imbalanced data, i.e. some classes have far fewer photos than others?
For example, would data augmentation weighted by the class-imbalance ratios work, or something like that?
Thanks
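
One common approach, not covered in this thread so treat it as a sketch, is to oversample the rare classes by giving each image a sampling weight inversely proportional to its class count. The helper name `inverse_frequency_weights` and the toy labels below are made up for illustration.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return one sampling weight per item, inversely proportional to its class count."""
    counts = Counter(labels)
    n_classes = len(counts)
    # With this scaling, every class receives the same total weight (1 / n_classes).
    return [1.0 / (n_classes * counts[y]) for y in labels]

# Hypothetical imbalanced dataset: 8 teddy photos, 2 grizzly photos.
labels = ["teddy"] * 8 + ["grizzly"] * 2
weights = inverse_frequency_weights(labels)
```

These weights could then be passed to something like `random.choices(items, weights=weights, k=batch_size)` so that rare classes are drawn about as often as common ones.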

When there are unsuitable images (e.g. drawings instead of photos) in the training dataset, how can I best remove them? Should I?

What’s the metrics=error_rate line for?

Yes, I’d like to know how to handle images that are inherently not square, say when all of them are very rectangular.

When running lr_find(), is it actually training the model?

Is the size of the validation set always 20%, or does it depend on the size of your data?
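
For what it's worth, 20% is a convention rather than a rule; as far as I know the fraction is just a parameter (`valid_pct` in the fastai data bunch factory methods). A minimal sketch of what a random split by percentage amounts to, where `random_split` is a hypothetical helper and not the library's implementation:

```python
import random

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle a copy of `items` and hold out `valid_pct` of them for validation."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

train, valid = random_split(range(100), valid_pct=0.2)
```

With a very large dataset a smaller fraction can still give a reliable validation score; with a tiny one, 20% may be too few images to trust.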

This is going to be explained in a few minutes

It should be removed manually.

No, it’s trying out different LRs to help find the best via visualization.

It’s a mock training run over a range of learning rates. The original model is reloaded afterwards, so it doesn’t change the weights.
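
To make that concrete, here is a toy sketch of a learning-rate range test in plain Python. It is not fastai's code: `lr_range_test`, its parameters, and the quadratic toy loss are all assumptions chosen for illustration. It snapshots the weights, takes gradient steps with exponentially increasing learning rates while recording the loss, stops once the loss blows up, and then restores the snapshot.

```python
import copy

def lr_range_test(weights, grad_fn, loss_fn, start_lr=1e-5, end_lr=10.0, num_it=50):
    """Mock-train with exponentially increasing LRs, then restore the weights."""
    saved = copy.deepcopy(weights)            # snapshot, like saving a checkpoint
    lrs, losses = [], []
    best = float("inf")
    for i in range(num_it):
        # Exponential schedule from start_lr to end_lr.
        lr = start_lr * (end_lr / start_lr) ** (i / (num_it - 1))
        loss = loss_fn(weights)
        lrs.append(lr)
        losses.append(loss)
        best = min(best, loss)
        if i > 0 and loss > 4 * best:         # loss has diverged, stop early
            break
        grads = grad_fn(weights)
        for j in range(len(weights)):
            weights[j] -= lr * grads[j]       # plain gradient-descent step
    weights[:] = saved                        # put the original weights back
    return lrs, losses

# Toy problem (assumption): loss = sum of squared weights.
w = [3.0, -2.0]
lrs, losses = lr_range_test(
    w,
    grad_fn=lambda ws: [2 * x for x in ws],
    loss_fn=lambda ws: sum(x * x for x in ws),
)
```

Plotting `losses` against `lrs` on a log-scaled x axis gives the kind of curve shown in the lesson, and `w` is unchanged after the call.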

looks like 3e-3 would have been better

What if the curve stays flat for many iterations, unlike this one, where it shoots up within just a few iterations?

what is the y axis in the lr_find graph?

Question: Are training loss and error rate the same thing, computed on training data and test data?
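
They are different quantities: the loss is what the optimizer minimizes, while the error rate is simply the fraction of items classified wrongly. A small sketch, assuming cross-entropy as the loss (typical for classification) and using made-up predictions:

```python
import math

def cross_entropy(probs, targets):
    """Mean negative log-probability assigned to the correct class."""
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

def error_rate(probs, targets):
    """Fraction of items whose highest-probability class is not the target."""
    wrong = sum(1 for p, t in zip(probs, targets) if p.index(max(p)) != t)
    return wrong / len(targets)

# Made-up predicted probabilities for 4 items over 2 classes.
probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
targets = [0, 0, 1, 1]

loss = cross_entropy(probs, targets)   # depends on how confident each prediction is
err = error_rate(probs, targets)       # only depends on which class wins
```

Here only the last item is misclassified, so `err` is 0.25, while `loss` also penalizes the correct but unconfident predictions.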

Error rate

Karpathy said validation sets should be made carefully, and Rachel also has an article about it. When is it OK to randomly split the data?

Just on the training set.

Not sure, but @william has an interesting approach to curating scraped datasets. Have a look here:

You can pass a ‘size’ parameter to ImageDataBunch, which will crop and pad your images to bring them to your desired size.
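
To get a feel for what a crop-style resize does to a very rectangular image, here is a rough sketch of the geometry only. `center_crop_box` is a hypothetical helper, and fastai's actual transform pipeline may differ in its details (for example, random rather than centered crops during training).

```python
def center_crop_box(width, height, size):
    """Scale so the shorter side equals `size`, then return the scaled
    dimensions and the (left, top, right, bottom) box of a centered square crop."""
    scale = size / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    left = (new_w - size) // 2
    top = (new_h - size) // 2
    return (new_w, new_h), (left, top, left + size, top + size)

# An 800x400 image resized for size=224.
scaled, box = center_crop_box(width=800, height=400, size=224)
```

For this 2:1 image the crop keeps only the middle half of the width, which is why padding (or squishing) can be a better fit when the important content sits near the edges.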
