Fastai Datablock API splitting by textfile issue

AMusic · March 23, 2020, 7:38pm

Hello guys,
the 2019 version of the course teaches how to split the CamVid dataset into training and validation sets based on a text file (valid.txt).

Rather than splitting randomly (which would make the task way too easy), I would like to split the sets as proposed by the dataset creators.

The standard routine would be something like:

src = (SegmentationItemList.from_folder(path_img)
       .split_by_fname_file('…/valid.txt')
       .label_from_func(get_y_fn, classes=codes))

However, with the new DataBlock API I am not able to do this.

I tried:
camvid = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                   get_items=get_image_files,
                   get_y=label_func,
                   splitter=split_by_fname_file('…/valid.txt'),
                   batch_tfms=aug_transforms(size=(120,160)))

Which throws the error:
NameError: name 'split_by_fname_file' is not defined

Any idea how to split by valid.txt using the new DataBlock API?

Best regards

muellerzr · March 23, 2020, 7:39pm

You should look at the course notebooks here:

They cover all of the notebooks from the last part 1 lecture series (v3), ported to fastai2.

AMusic · March 23, 2020, 7:51pm

Wow, thank you for the fast reply!

I thought there must be a ready-to-use method.

The answer to my question is:

Use FileSplitter and pass it the path to your text file, as follows:

camvid = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                   get_items=get_image_files,
                   get_y=label_func,
                   splitter=FileSplitter(path/'valid.txt'),
                   batch_tfms=aug_transforms(size=(120,160)))
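
For anyone wondering what the splitter does with valid.txt: as far as I understand it, FileSplitter reads file names from the text file (one per line) and sends every item whose file name appears there to the validation set, with everything else going to training. Below is a minimal plain-Python sketch of that idea. It is not the fastai source; `file_splitter_sketch` and the toy file names are made up for illustration.

```python
from pathlib import Path
import tempfile

def file_splitter_sketch(fname):
    """Hypothetical stand-in for the idea behind FileSplitter (not fastai code).

    Returns a function that maps a list of item paths to
    (train_idxs, valid_idxs); items whose file name is listed in
    `fname` (one name per line) are treated as validation items.
    """
    lines = Path(fname).read_text().splitlines()
    valid_names = {line.strip() for line in lines if line.strip()}

    def _split(items):
        train_idxs, valid_idxs = [], []
        for i, item in enumerate(items):
            # Match on the bare file name, ignoring the directory part
            if Path(item).name in valid_names:
                valid_idxs.append(i)
            else:
                train_idxs.append(i)
        return train_idxs, valid_idxs

    return _split

# Toy demo: a temporary valid.txt and four made-up image paths
with tempfile.TemporaryDirectory() as tmp:
    valid_file = Path(tmp) / "valid.txt"
    valid_file.write_text("b.png\nd.png\n")
    items = [Path("images") / n for n in ["a.png", "b.png", "c.png", "d.png"]]
    train_idxs, valid_idxs = file_splitter_sketch(valid_file)(items)

print(train_idxs, valid_idxs)  # [0, 2] [1, 3]
```

Because matching is done on the bare file name, the names in valid.txt have to correspond to the names of the files returned by get_items, not to their full paths.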