Applying lesson 1: what are the specs for building an image set?

jedbanger · June 17, 2020, 2:24pm

Hi! Sorry if this has been asked before, but I couldn’t find an answer.

I’m building an image dataset for applying lesson 1 learnings.

I already collected the images quite easily thanks to Fastclass.

But how many images per class should I collect? What’s a general guideline?
And what’s the suggested train/validation split? I planned on using ImageDataBunch.from_folder and from what I see in the docs, my dataset should look like this:

    path\
      train\
        clas1\
        clas2\
        ...
      valid\
        clas1\
        clas2\

But I couldn’t find how many images should go in each split (train/valid).

muellerzr · June 17, 2020, 5:57pm

As many as it takes (not helpful, I know). In practice I start with 50-100 per class and go up from there.

I use a random split of 80%/20% between train and validation.
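
In case it helps, here is a minimal sketch of an 80/20 split done by hand with only the standard library. It assumes the images currently sit in one folder per class under a single source directory; the function name `split_train_valid` and its parameters are made up for illustration and are not part of fastai. It splits each class separately so that every class shows up in both sets, which is one reasonable way to do a random split, not the only one.

```python
import random
import shutil
from pathlib import Path


def split_train_valid(src, dst, valid_pct=0.2, seed=42):
    """Copy src/<class>/* into dst/train/<class>/ and dst/valid/<class>/.

    Hypothetical helper: returns {class_name: (n_train, n_valid)}.
    """
    src, dst = Path(src), Path(dst)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        # Number of validation images for this class, e.g. 20% of the files
        n_valid = int(round(len(files) * valid_pct))
        splits = {"valid": files[:n_valid], "train": files[n_valid:]}
        for split, items in splits.items():
            out_dir = dst / split / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in items:
                # Copy rather than move, so the source folder stays intact
                shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = (len(files) - n_valid, n_valid)
    return counts
```

Calling `split_train_valid("raw_images", "path")` (both folder names are placeholders) should produce the train/valid layout shown in the first post. As far as I know, fastai v1's `ImageDataBunch.from_folder` also accepts a `valid_pct` argument that does a random split for you, so check the docs before copying files around.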

jedbanger · June 17, 2020, 9:36pm

Thanks, Zachary. I have around 70 of each.

Tried it and got a 35% error rate (after fine tuning).

So I guess I would need more pictures for better results.

muellerzr · June 17, 2020, 10:34pm

Possibly. Depending on how hard the problem is, that could be a good score too.