Dataset with each new class in a folder

Hi, I’m very new to Deep Learning and hence I’m still trying to figure things out.
I have completed the first lesson and would now like to try an image classification task of my own.
I have downloaded a dog breed classification dataset. The dataset contains 14 folders, one per breed (the folder name is the breed), and each folder contains images of that breed.

In the example Jeremy uses in lesson 1, all the images are in one folder and each image's file name identifies its label.

I have a couple of questions regarding my dataset. Firstly, can I work with multiple folders where each folder name is the class label? If yes, how do I do that?
Secondly, how do I split my data into training and validation sets?

To get the labels you can use parent_label. It takes the name of the parent directory of an image file and uses that as the label for that image.

You plug this into your DataBlock by passing the function as the get_y argument: get_y=parent_label.
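To make the idea concrete, here is a small plain-Python sketch of what a parent-folder labeller does. The helper name label_from_parent and the example path are made up for illustration; in fastai you would just pass parent_label itself.

```python
from pathlib import Path


def label_from_parent(image_path):
    """Return the name of the folder that directly contains the image.

    Hypothetical helper mirroring the idea behind fastai's parent_label.
    """
    return Path(image_path).parent.name


# The image sits in the "beagle" folder, so its label is "beagle".
print(label_from_parent("dog_breeds/beagle/img_001.jpg"))  # beagle
```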

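For splitting, note that GrandparentSplitter() (used as splitter=GrandparentSplitter()) only works when the images are already arranged in train and valid folders, each containing the breed folders, because it splits on the name of the grandparent folder. Since your dataset has only the breed folders, the simplest way is a random split: pass splitter=RandomSplitter(valid_pct=0.2) to the DataBlock to hold out 20% of the images for validation.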
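Putting the pieces together, a minimal sketch of such a DataBlock could look like the following. It assumes fastai v2 is installed and that the breed folders sit directly under one root folder with no train/valid subfolders, which is why it uses a random split rather than a folder-based one. The function name build_dog_dls, the path, the image size and the batch size are placeholders, not values from the lesson.

```python
def build_dog_dls(path, valid_pct=0.2, seed=42, bs=64):
    """Build DataLoaders from a root folder holding one subfolder per breed.

    Hypothetical helper; `path` is assumed to look like root/<breed>/<image>.
    """
    # Imported here so that fastai is only needed when the function is called.
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize,
        get_image_files, parent_label,
    )

    dblock = DataBlock(
        blocks=(ImageBlock, CategoryBlock),   # image in, single class out
        get_items=get_image_files,            # collect image files recursively
        get_y=parent_label,                   # folder name becomes the label
        splitter=RandomSplitter(valid_pct=valid_pct, seed=seed),
        item_tfms=Resize(224),                # same size so images can be batched
    )
    return dblock.dataloaders(path, bs=bs)


# Usage (placeholder path):
# dls = build_dog_dls("path/to/dog_breeds")
# dls.show_batch()
```

Fixing the seed keeps the validation set the same between runs, which makes results easier to compare.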

You can write your own functions for both of these tasks, but since you are just beginning to learn, I would suggest using these built-in functions.

You can also refer to this tutorial.
