I’m trying to train a model on image data. The data is split into train, valid, and test folders, and each folder contains a bunch of image files. The label is in the name of each file.
I succeeded in making a dataloader from one of the folders using ImageDataLoaders.from_name_func, but I want the validation set to come from the valid folder instead of being some percentage of the train set.
Does anyone know how to approach this?
I need something similar to GrandparentSplitter I guess, but for the parent folder, not the grandparent.
Apparently, Jupyter notebooks don’t show the helper that GrandparentSplitter relies on, but I found it in the source code using PyCharm. Here is the missing piece:
def _grandparent_idxs(items, name):
    def _inner(items, name): return mask2idxs(Path(o).parent.parent.name == name for o in items)
    return [i for n in L(name) for i in _inner(items,n)]
So now I can probably make my ParentSplitter by removing one ‘parent’ call in the function above.
I don’t usually use ImageDataLoaders. I find the DataBlock API very easy and extremely versatile, and in this case you could do it with the DataBlock API. See muellerzr’s vision course for examples of how to use the ImageDataLoaders, DataBlock, and Dataset APIs.
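Something along these lines might work with the DataBlock API. This is only a sketch: it assumes file names look like `cat_001.jpg` (label before the first underscore), that `path` is the folder containing train and valid, and that your fastai version has `FuncSplitter` and the `folders` argument of `get_image_files`. `label_func`, `is_valid`, and `build_dls` are hypothetical names.

```python
from functools import partial
from pathlib import Path

def label_func(fname):
    # Assumed naming scheme: "<label>_<id>.<ext>", e.g. "cat_001.jpg" -> "cat".
    return Path(fname).stem.split("_")[0]

def is_valid(fname):
    # True for files sitting directly inside a folder named "valid".
    return Path(fname).parent.name == "valid"

def build_dls(path, bs=64):
    # fastai is imported here so the helpers above can be used without it.
    from fastai.vision.all import (DataBlock, ImageBlock, CategoryBlock,
                                   FuncSplitter, Resize, get_image_files)
    dblock = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        # Only collect images from train/ and valid/, skipping test/.
        get_items=partial(get_image_files, folders=["train", "valid"]),
        # FuncSplitter sends items where the function returns True to validation.
        splitter=FuncSplitter(is_valid),
        get_y=label_func,
        item_tfms=Resize(224),
    )
    return dblock.dataloaders(path, bs=bs)
```

Calling `build_dls(path)` should then give dataloaders whose validation set is exactly the contents of the valid folder. Swap `label_func` for whatever you were already passing to `from_name_func`.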