Lesson 2: further discussion ✅

Right now I've moved on to image segmentation, but I'll open an issue on GitHub if I find an example. And thanks for building this! :slight_smile:

In a paper (update: https://arxiv.org/pdf/1803.09820 ) I saw a figure that explained it very well. I'll paste a few images here that convey the same message.

Bias-variance trade-off figure:
https://zahidhasan.github.io/img/bias_variance6.png

https://forums.fast.ai/uploads/default/original/2X/5/57b24adcaf41ec93767e692094f00369a4f2e6fb.jpg

Training error and cross-validation error as a function of model complexity:
https://www.researchgate.net/profile/Cristiano_Ballabio/publication/222344717/figure/fig3/AS:325003603136513@1454498305120/Training-error-and-cross-validation-error-as-a-function-of-model-complexity.png

Too little training: underfitting
Too much training: overfitting
But that is only the case if the network is "deep" enough to learn from the data. By "deep" I mean an architecture with enough capacity to perform well on the task.

If the network is too small, it won’t be able to predict accurately.
If the network is too big, you will need to apply regularization as it will be very capable of just “memorizing” the training set.

(Just my view)
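To make the curves in those figures concrete, here is a tiny illustration (my own sketch in plain NumPy, not from the lesson): fitting polynomials of increasing degree to noisy data makes the training error keep falling while the validation error eventually rises again, which is exactly the overfitting regime described above.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)  # noisy target

x_train, y_train = x[::2], y[::2]    # every other point for training
x_val,   y_val   = x[1::2], y[1::2]  # the rest for validation

for degree in (1, 3, 9, 13):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_err   = np.mean((np.polyval(coeffs, x_val)   - y_val) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, val MSE {val_err:.3f}")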

lr_find
It spits out the validation error (prediction error against the validation set).

If there is no validation set, then there is no error to calculate.

Most probably you loaded the images without specifying a percentage to be used for validation. We do that (0% validation) at the stage where we want to 'clean' the dataset, so that we work with all the images, not just the training set.
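A minimal sketch of what that looks like (fastai v2 style, reusing the pets example from later in this thread; the point is just that valid_pct is greater than 0 so a validation set exists):

from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,  # 0.2 here rather than 0.0
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.lr_find()  # the learner now has a validation set as well as a training set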

How many images are needed to train a robust cnn?

Jeremy mentions "less than you would think" in the video lecture. The snippet of JS code that scrapes Google Images gives me 80 photos. Even with the most careful use of -keyword exclusions in my search, anywhere between 25 and 50 percent of the returned results are 'noise' that I have to prune out manually.

My dataset per classification ends up being 30-80 pictures per class, after manual cleaning. Is that good enough?

I’ve read elsewhere on the web that you need at least 1000 images per classification. That’s a lot.

And a second question:

What is the reason for running the cleaning process AFTER training the model instead of before?

Wouldn’t it be a better trained model if we started with good data?

In lesson 2, we set learner.precompute to False only for the last layers.
Instead, would it always be better to first unfreeze and then precompute all the weights with augmentation for better results?

As Jeremy always says, Try it and see…

I have had good results with 100 images per category.

One reason is so that we can use the trained model to help in the cleanup process.
E.g. "Show me all the Cat1 images"; it is then easier to identify the images which don't really belong in Cat1.
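A short sketch of that idea (fastai v2 API; assumes `learn` is your trained classifier):

from fastai.vision.all import *

# Let the model surface its most confused images; these are often the
# mislabelled or junk ones worth pruning.
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9)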

Is an ImageDataBunch a DataLoaders object?

I'm trying to sort out the difference between a DataBunch, a DataBlock, and a DataLoader. Thanks.

Also, what should I do if I see that my training and validation loss are decreasing very nicely, but error rate is staying the same?

I'm trying to clean the image dataset I created. I'm working with Google Colab, so when I get to this part:

# Don't run this in google colab or any other instances running jupyter lab.
# If you do run this on Jupyter Lab, you need to restart your runtime and
# runtime state including all local variables will be lost.
ImageCleaner(ds, idxs, path)

It clearly says "don't run this in Colab".
Then how am I supposed to clean my dataset if I can't remove the images?

Thank you!
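One possible workaround (my own sketch, not from the lesson notebook; it assumes the fastai v1 objects from lesson 2, i.e. `learn` is already trained and `ds.x.items` holds the image file paths): skip the ImageCleaner widget and delete the flagged files yourself.

from fastai.vision import *
from fastai.widgets import DatasetFormatter

# Same call as just before ImageCleaner in the notebook:
ds, idxs = DatasetFormatter().from_toplosses(learn)

# Print the paths of the highest-loss images so you can review them,
# then remove the ones that are clearly noise by hand.
for i in idxs[:20]:
    print(ds.x.items[i])

# e.g. Path(ds.x.items[3]).unlink()  # delete one file you have judged to be junk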

When I try to predict a class using a loaded learner (a classifier for black/grizzly/teddy bears), I can't seem to get the prediction to show me the actual class; instead I'm getting this:

(screenshot of the unexpected output)

I've tried both on Gradient and Colab, and can't figure out why.
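In case the issue is just reading the output, here is a minimal sketch (fastai v2 style; 'export.pkl' and 'grizzly_test.jpg' are hypothetical file names): learn.predict returns a tuple, and the first element is the predicted class label.

from fastai.vision.all import *

learn = load_learner('export.pkl')                               # hypothetical exported bear classifier
pred_class, pred_idx, probs = learn.predict('grizzly_test.jpg')  # hypothetical test image
print(pred_class)        # e.g. 'grizzly'
print(probs[pred_idx])   # the model's confidence in that class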

If you haven’t already done so, you will need to learn ‘exception handling’.

I have a problem with the JavaScript console approach on mobile devices.

Basically, I'm stuck abroad for the next 2 months and I'm working from an Android tablet with a Bluetooth keyboard, which is fine, except that the mobile browsers (at least Chrome and Firefox) don't have JS consoles, as they only support remote debugging.

Does anyone know of a workaround for this which doesn't involve me going to find access to a PC?

Have you tried something like https://jsfiddle.net/ ?

That looks like a useful toy, but it's not going to help me with this particular problem since I can't get at the HTML on mobile either. I'm just knocking up a quick scraper notebook instead for now. Thanks anyway!

Hello, I have the same problem. Someone here in another discussion wrote this code, which works for me in Safari, Chrome and Firefox, and solves the problem of getting no data. I hope it works for you too.

var urls = Array.from(document.querySelectorAll('.rg_i'))
    .map(el => el.hasAttribute('data-src') ? el.getAttribute('data-src') : el.getAttribute('data-iurl'));
var hiddenElement = document.createElement('a');
hiddenElement.href = 'data:text/csv;charset=utf-8,' + encodeURI(urls.join('\n'));
hiddenElement.target = '_blank';
hiddenElement.download = 'myFile.txt';
hiddenElement.click();

Hello,

I am quite confused as to where I should post an error which I am getting while running the first code example.

from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)

The error occurs in learn.fine_tune(1), and the traceback points to: anaconda3\lib\site-packages\torch\utils\data\dataloader.py in _try_get_data(self, timeout)

Can you please guide me on where to post this error?
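One thing worth trying first (just a guess based on the Windows-style path in the traceback; errors in _try_get_data often come from DataLoader worker processes): rerun with num_workers=0 so the data is loaded in the main process.

from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224),
    num_workers=0)  # avoid multiprocessing workers in the DataLoader

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)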