Need help with: ImageDataLoaders.from_name_func

Hello,

I was working on this method:

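from fastai.vision.all import *

# Download the Oxford-IIIT Pet dataset and point at its images folder
path = untar_data(URLs.PETS)/'images'

# Label function: receives a file name, returns True for cats
def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))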

I have gone through the documentation, but I still have a few questions:

Question-1
Executing the following command
print(path)
gives the output:
/root/.fastai/data/oxford-iiit-pet/images

Is this folder on the Colab cloud? How can I access it? I need to see the data.

Question-2

dls = ImageDataLoaders.from_name_func(
        path, get_image_files(path), valid_pct=0.2, seed=42,
        label_func=is_cat, item_tfms=Resize(224))

According to the fastai documentation, this function generates labels from the file names using “label_func” (hence the name “from_name_func”), in addition to creating a validation set and resizing the images for standardization.
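
To make the labelling step concrete, here is a minimal sketch in plain Python (no fastai needed). It relies on the Pets convention that cat breeds have file names starting with an uppercase letter; the two sample file names are illustrative only and are not read from disk.

```python
from pathlib import Path

def is_cat(x):
    # True when the file name starts with an uppercase letter
    return x[0].isupper()

# Illustrative file names (assumed for this sketch, not read from disk)
fnames = [Path("images/Abyssinian_1.jpg"), Path("images/beagle_32.jpg")]

# The label function is applied to each file's name, not to its full path
labels = {f.name: is_cat(f.name) for f in fnames}
print(labels)  # {'Abyssinian_1.jpg': True, 'beagle_32.jpg': False}
```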

What is the significance/role of the “path” parameter (the first parameter) when we can get it from “get_image_files(path)” itself?

Thanks

If you are running this code on Colab, the folder /root/.fastai/data/oxford-iiit-pet/images is on the Google cloud server that your notebook is running on. To access it, you can open a terminal (a Colab Pro feature) or browse to it from the file sidebar.
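
If you have neither handy, a notebook cell can list the folder as well. The sketch below uses only the standard library; the helper name preview_folder and its limit argument are made up for illustration, and the path is the one printed in the question.

```python
import os
from pathlib import Path

def preview_folder(folder, limit=5):
    """Return the sorted names of up to `limit` entries in `folder`."""
    # os.path.isdir returns False (rather than raising) for missing folders
    if not os.path.isdir(folder):
        return []
    return sorted(p.name for p in Path(folder).iterdir())[:limit]

# Path taken from the question; it only exists on the machine running the notebook
print(preview_folder("/root/.fastai/data/oxford-iiit-pet/images"))
```

An empty list means the folder is not there, for example because the dataset has not been downloaded in the current session.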

When you look at the documentation (Vision data | fastai):

ImageDataLoaders.from_name_func(path, fnames, label_func, valid_pct=0.2, seed=None, item_tfms=None, batch_tfms=None, bs=64, val_bs=None, shuffle=True, device=None)

it says: "Create from the name attrs of fnames in paths with label_func", and here fnames is get_image_files(path).

Writing the function with only the first two parameters:

ImageDataLoaders.from_name_func(path, get_image_files(path))
My question was: the second parameter (get_image_files), which fetches the images recursively, already takes a “path” argument that points to the images. So what is the role of “path” as the first parameter of the function above? Why do we even need it?

Hello!

That is because the class methods called by this helper function default to path='.' (the current working directory; try printing dls.path), while the images are in a different directory. Passing path sets dls.path to the data directory instead, so that everything is kept on the same path. The second argument only collects the files in path, as @girijesh said. Remember that ImageDataLoaders is a basic wrapper around several mid-level APIs, there to make the common case easy. If you want more control over these details, it is better to work with DataBlock, DataLoaders, etc. directly.
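
One way to picture this is with a toy class. The sketch below is a hypothetical stand-in written for this thread, not fastai's implementation (ToyLoaders and the sample file names are made up). It only shows the idea that a list of files carries no base directory of its own, so the object falls back to '.' unless path is passed separately.

```python
from pathlib import Path

class ToyLoaders:
    """Hypothetical stand-in for illustration; NOT fastai's implementation."""
    def __init__(self, items, path="."):
        self.items = list(items)
        self.path = Path(path)  # base directory remembered by the object

    @classmethod
    def from_name_func(cls, path, fnames, label_func):
        # Label each file from its name, and remember `path` separately
        items = [(Path(f), label_func(Path(f).name)) for f in fnames]
        return cls(items, path=path)

def is_cat(x):
    return x[0].isupper()

# Illustrative file names (assumed, not read from disk)
fnames = ["data/images/Abyssinian_1.jpg", "data/images/beagle_32.jpg"]

# Built from the file list alone: the base directory stays at the default
without_path = ToyLoaders([(Path(f), is_cat(Path(f).name)) for f in fnames])
# Built through the helper: the base directory is the one we passed in
with_path = ToyLoaders.from_name_func("data/images", fnames, is_cat)

print(without_path.path)
print(with_path.path)
```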

To look at the data, you can call path.ls() to list the files, or index into path.ls() to display an image:

idx = 0
img = PILImage.create(path.ls()[idx])
img

Cheers :)