Lesson 2 In-Class Discussion ✅

Any image copyright issues while doing that?

One can use this repo to download images and create a dataset… (completely automated).
It works out of the box.

Is training on this dataset (where you pick images off Google as opposed to using an open source dataset) legal?

What is the minimum number of images of each class you would usually need for your classifier to work reasonably well?

How do I get these collapsible headings in notebooks? They are nice. Does anyone know the right extension name?

Generally 500+ images per category is enough. You can always add more images and check how much this improves performance.

Collapsible Headings, in nbextensions :wink:

Probably falls under fair use, as long as you don’t sell anything that the images were used to create. <I am not a lawyer, not legal advice, you are responsible, etc.>

Do you have to rebalance the classes after deleting images? Should they all have a similar number of images?
I found that some of my classes had great quality photos and others not so much.
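
Before deciding whether rebalancing is needed, it can help to see how many images each class actually has after cleaning. A minimal sketch, assuming the dataset is laid out with one subfolder per class; the function name `count_images_per_class` and the list of image suffixes are hypothetical, not part of any library:

```python
from collections import Counter
from pathlib import Path

# Assumption: these are the image file types in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}

def count_images_per_class(root):
    """Count image files in each immediate subfolder of root (one subfolder per class)."""
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            # Only count regular files whose suffix looks like an image.
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images_per_class("data/bears")` (a hypothetical path) returns a `Counter` mapping class name to image count, which makes a large imbalance easy to spot.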

Thanks! What happens if you use fewer than that? Does that cause overfitting (or some other problem that I don’t know about :wink: )?

~500 in the training set or altogether?

When I run the JavaScript, Chrome opens a window for a second and then closes it. No download option is given. What should I do?

What are the ImageNet statistics there… I couldn’t find a reference to them in the libraries.
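
For context, "ImageNet statistics" presumably refers to the per-channel mean and standard deviation of the ImageNet training images, which are used to normalize inputs for models pretrained on ImageNet. A minimal sketch in plain NumPy, assuming RGB images scaled to [0, 1]; the constant and function names are hypothetical, and the numbers are the commonly published ImageNet values:

```python
import numpy as np

# Commonly published per-channel (R, G, B) statistics of ImageNet,
# for pixel values scaled to the range [0, 1].
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize(image):
    """Normalize an H x W x 3 float image, channel by channel."""
    return (image - IMAGENET_MEAN) / IMAGENET_STD

def denormalize(image):
    """Undo normalize(), e.g. before displaying an image."""
    return image * IMAGENET_STD + IMAGENET_MEAN
```

Normalizing with the same statistics the pretrained model saw during training keeps the inputs in the range its weights were fitted to.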

Does the fastai library support k-fold cross validation? Or only % holdout?
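
To make the distinction in the question concrete, here is a sketch of the two splitting schemes using index arrays in plain NumPy. It says nothing about what fastai itself supports; the function names and default values are hypothetical:

```python
import numpy as np

def holdout_split(n_items, valid_pct=0.2, seed=42):
    """Single random split: one training set and one validation set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_items)
    n_valid = int(n_items * valid_pct)
    return idx[n_valid:], idx[:n_valid]

def kfold_splits(n_items, k=5, seed=42):
    """Yield k (train, valid) pairs; each item is in a validation set exactly once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_items)
    folds = np.array_split(idx, k)
    for i in range(k):
        # Train on every fold except fold i, validate on fold i.
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]
```

With k-fold, a model is trained k times, once per pair, and the validation scores are averaged. That costs k times the training time of a single holdout split.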

Did my thesis on fair use. This is legal.

Our use is “transformative”, which is a key factor.

Can we get a list of the filenames for the validation set?

The np.random.seed value isn’t being passed into the DataBunch. Is it adjusting some global variable?

You will get poor performance since your model will not be able to learn enough to effectively differentiate between your categories.

It is the global seed of numpy.
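
A small sketch of what "global seed" means here: `np.random.seed` resets the state of NumPy's module-level random generator, so any later call to `np.random.*` functions, including calls made inside library code, draws from that state. This assumes the validation split is drawn with `np.random` functions; the variable names are illustrative only:

```python
import numpy as np

# Seed the module-level generator, then draw a random ordering.
np.random.seed(42)
first = np.random.permutation(10)

# Re-seeding with the same value resets the global state,
# so the same ordering comes out again.
np.random.seed(42)
second = np.random.permutation(10)

assert np.array_equal(first, second)
```

This is why calling the seed just before creating the data gives the same validation set on every run, even though the seed is never passed in explicitly.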

Per category