Lesson 2 In-Class Discussion ✅

(Scott H Hawley) #519

So looking at Jeremy’s FileDeleter output, it seems that the size=224 parameter to ImageDataBunch rescales the images to 224, rather than cropping them to that size.

Is that correct, or does this depend entirely on the output of get_transforms()?

(And what’s the default if ds_tfms=None?)
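To make the rescale-versus-crop distinction concrete, here is a rough arithmetic sketch of three common ways to turn a rectangular image into a square one. The function name `resize_geometry` and the method names "squish", "crop" and "pad" are made up for illustration; this is not the fastai implementation and says nothing about which one `size=224` actually triggers.

```python
def resize_geometry(width, height, size, method):
    """Describe how a width x height image becomes size x size.

    Returns (scaled_w, scaled_h, cropped_px, padded_px): the image size after
    rescaling, the pixels cut off along the longer side, and the pixels added
    along the shorter side.
    """
    if method == "squish":
        # Rescale both sides independently: nothing is lost,
        # but the aspect ratio changes.
        return size, size, 0, 0
    if method == "crop":
        # Scale so the shorter side fits, then cut the overflow
        # of the longer side.
        scale = size / min(width, height)
        w, h = round(width * scale), round(height * scale)
        return w, h, max(w, h) - size, 0
    if method == "pad":
        # Scale so the longer side fits, then fill in the shorter side.
        scale = size / max(width, height)
        w, h = round(width * scale), round(height * scale)
        return w, h, 0, size - min(w, h)
    raise ValueError(f"unknown method: {method}")


for method in ("squish", "crop", "pad"):
    print(method, resize_geometry(448, 224, 224, method))
```

For a 448x224 image, cropping throws away half of the width, padding fills half of the height, and squishing keeps every pixel but distorts the shapes.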

(Paula Alves) #520

I would not like to crop, but padding is a good idea

(Stef) #521

My understanding is that the probability would add up to 1 in the end, so one still has some sort of classification going on. But I would like to definitely classify it as neither.
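A small sketch of the point about probabilities: softmax outputs always sum to 1, so some class always wins. One possible workaround, assuming the model were trained with independent per-class sigmoid outputs (a multi-label style setup, which is an assumption here), is to score each class separately and answer "neither" when no score clears a threshold. The labels, logits and the 0.5 threshold below are invented for illustration.

```python
import math


def softmax(logits):
    """Turn logits into probabilities that always sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Independent per-class score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify_with_reject(logits, labels, threshold=0.5):
    """Return the best label, or 'neither' if no class score reaches threshold."""
    scores = [sigmoid(x) for x in logits]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best] if scores[best] >= threshold else "neither"


labels = ["teddy", "grizzly"]   # hypothetical classes
logits = [-3.0, -2.5]           # hypothetical outputs for an unrelated image

print(softmax(logits))                       # still sums to 1
print(classify_with_reject(logits, labels))  # no class is confident enough
```

With softmax the second class gets more than half of the probability mass even though both logits are strongly negative, while the thresholded version rejects both.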

(Francisco Ingham) #522

There is no widget for this yet. You could display the top losses, get their filenames and move them manually.
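The manual route could look roughly like the sketch below. It assumes you have already collected a mapping from filename to loss yourself; `move_top_losses` and the `losses_by_file` dict are hypothetical names, not part of fastai.

```python
import shutil
from pathlib import Path


def move_top_losses(losses_by_file, dest_dir, k=10):
    """Move the k files with the highest loss into dest_dir; return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Sort by loss, highest first, and keep the k worst.
    worst = sorted(losses_by_file.items(), key=lambda kv: kv[1], reverse=True)[:k]
    moved = []
    for filename, _loss in worst:
        src = Path(filename)
        # Note: files with the same name from different class folders
        # would collide here; rename them first if that can happen.
        target = dest / src.name
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved
```

Moving the files to a review folder instead of deleting them means you can still look through them and put back anything that was labeled correctly.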

(Francisco Ingham) #523

For the widget the images are rescaled (not cropped) to 300, 250.

(Ariel Gamiño) #524


model = torch.load("/yourpath/model.pt")


Hey, for

`download_images('data/iphone/iphone.txt', 'data/iphone/iphone', max_pics=200)`

I get `NameError: name 'download_images' is not defined`.

I imported fastai and updated the library a while back. How do I fix this issue?

(hector) #526

Why does plot_losses plot the val loss only after one epoch rather than from the first iteration?

(Thomas Sandmann) #527

When I train a learner for one epoch and then repeat that step (i.e. train for another epoch), is that equivalent to training for 2 epochs right away?
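One thing that can make the two cases differ is the learning-rate schedule. The sketch below assumes a schedule that spans each call to the fit function and restarts on the next call. The linear decay is only a stand-in for illustration, not fastai's actual schedule, and `fit_lrs` is a made-up helper. With a constant learning rate and optimizer state carried over between calls, the sequence of steps would be the same either way.

```python
def linear_decay_schedule(n_steps, lr_max=0.1):
    """Learning rates that fall linearly from lr_max toward 0 over one fit call."""
    return [lr_max * (1 - i / n_steps) for i in range(n_steps)]


def fit_lrs(epochs, steps_per_epoch=4):
    """Learning rates used by one call, assuming the schedule spans that call."""
    return linear_decay_schedule(epochs * steps_per_epoch)


two_at_once = fit_lrs(2)                # one schedule stretched over 2 epochs
one_then_one = fit_lrs(1) + fit_lrs(1)  # schedule restarts for the second call

print(two_at_once)
print(one_then_one)
```

In the second case the learning rate jumps back to its maximum at the start of the second epoch, so under this assumption the two runs take different steps even though both see the data twice.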

(Surag Gupta) #528

If FileDeleter had the option to take a path, and displayed the images in the subfolders (classes) of that path along with their class labels, it would help in weeding out incorrectly labeled images. This is especially useful when you’re downloading images from Google using the JavaScript method.


The validation loss is only computed at the end of each epoch, that’s why.
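A toy training loop shows why the two curves have different resolutions. This is a minimal sketch on a one-parameter model with invented data, not the fastai training loop: the training loss falls out of every batch for free, while the validation loss needs a separate pass over the validation set, so it is only recorded once per epoch.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on a list of (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(train_batches, valid_data, epochs, lr=0.05):
    w = 0.0
    train_losses, valid_losses = [], []
    for _ in range(epochs):
        for batch in train_batches:
            # Training loss: one point per iteration (batch).
            train_losses.append(mse(w, batch))
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
        # Validation loss: one point per epoch, after all batches.
        valid_losses.append(mse(w, valid_data))
    return w, train_losses, valid_losses


# Invented data following y = 2x.
train_batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (0.5, 1.0)]]
valid_data = [(1.5, 3.0), (2.5, 5.0)]

w, train_losses, valid_losses = train(train_batches, valid_data, epochs=3)
print(len(train_losses), len(valid_losses))
```

With 2 batches per epoch and 3 epochs, the training curve has 6 points and the validation curve has 3, so the first validation point can only appear at the end of the first epoch.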

(Francisco Ingham) #530

Please refer to the FAQ

(Kevin Bird) #531

Would it make sense to get rid of all images that have a high loss, or is this breaking some rules? For example, anything that is really wrong gets thrown out, or gets handed to a new model that trains on those images.

(Manan Sanghi) #538

That is indeed surprising. How come DL is so robust to overfitting when your model has such high capacity?


I can imagine setting up an overfitting situation where the validation set is way different than the training set

(hector) #543

Oh OK! So in the graph the training loss is per iteration, whereas the val loss is at the epoch level! Shouldn’t the training loss be at the epoch level as well to make a fair comparison?

(Radu Spineanu) #544

Why is it important for the training loss to be smaller than the validation loss? What’s the theory behind that?

(Nate) #547

Assuming the images are not really messed up, this usually wouldn’t be a good idea. Usually you’re interested in predicting correctly on unknown images, so if you just throw out all the images it gets wrong, it will probably suck at predicting images like that in the future.

(Kay) #548

If you have trained on a bunch of data and all of a sudden see new kinds of data, say orange bears, the trained model would be overfit, correct? Is there a way to overcome such a case? Meaning: how general can I make my training process?

(Thomas Sandmann) #549

You are learning the features on the labelled examples. When confronted with novel, previously unseen images, some of the learned features will be useful - but others won’t be. So you don’t expect to do as well on novel images as on the training data.