[Solved] Question on `split_by_valid_func` (data block API)

Context:

lesson3-head-pose.ipynb

What I did:

I tried to run this cell line by line:

data = (PointsItemList.from_folder(path)
        .split_by_valid_func(lambda o: o.parent.name=='13')
        .label_from_func(get_ctr)
        .transform(get_transforms(), tfm_y=True, size=(120,160))
        .databunch().normalize(imagenet_stats)
       )

To see how .split_by_valid_func(lambda o: o.parent.name=='13') works, I dug into the source:

…and ran it in a separate cell:

a = PointsItemList.from_folder(path)

val = [(i,o) for i,o in enumerate(a.items) if lambda o: o.parent.name=='13']

val returns:

[(0, PosixPath('/home/paperspace/.fastai/data/biwi_head_pose/10/frame_00239_rgb.jpg')),
 (1, PosixPath('/home/paperspace/.fastai/data/biwi_head_pose/10/frame_00492_rgb.jpg')),
 (2, PosixPath('/home/paperspace/.fastai/data/biwi_head_pose/10/frame_00570_rgb.jpg')),
 (3, PosixPath('/home/paperspace/.fastai/data/biwi_head_pose/10/frame_00411_rgb.jpg')),
 (4, PosixPath('/home/paperspace/.fastai/data/biwi_head_pose/10/frame_00331_rgb.jpg')),
 (5, PosixPath('/home/paperspace/.fastai/data/biwi_head_pose/10/frame_00204_rgb.jpg')),
...
]

What I don’t understand:

The lambda function is supposed to keep only the items for which o.parent.name=='13' is True.

Why do I still get items from directories other than '13' in val?

In fact, len(val) is equal to len(a.items), so nothing was filtered out.

Where did I go wrong?

Figured it out.

To answer my own question, this is where it went wrong:

val = [ ... if lambda o: o.parent.name=='13']

Problem: the lambda function was never called. It was merely defined. A function object is always truthy in Python, so the if condition was True for every item, and the list comprehension behaved as if the condition didn’t exist.
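
A quick way to see this without fastai, as a minimal sketch in plain Python. The names `never`, `uncalled` and `called` are just ones I made up for the demo:

```python
# A lambda that would return False for every input, if it were called
never = lambda o: False

# Not called: the function object itself is tested, and it is always truthy
uncalled = [x for x in range(5) if never]

# Called: the return value is tested, so nothing passes the filter
called = [x for x in range(5) if never(x)]

print(bool(never))  # True
print(uncalled)     # [0, 1, 2, 3, 4]
print(called)       # []
```

So even a lambda that rejects everything lets every item through when it is only referenced and not called.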

To fix it, define the lambda function elsewhere, then call it on each item inside the list comprehension. Don’t forget to call the function! lol
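
Roughly what the fixed version looks like. This is only a sketch: the paths in `items` are made-up stand-ins for a.items so it runs without the dataset, and `is_valid` is a name I picked, not something from fastai.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins for a.items (the real list comes from PointsItemList.from_folder)
items = [
    PurePosixPath('biwi_head_pose/10/frame_00239_rgb.jpg'),
    PurePosixPath('biwi_head_pose/13/frame_00001_rgb.jpg'),
    PurePosixPath('biwi_head_pose/13/frame_00002_rgb.jpg'),
    PurePosixPath('biwi_head_pose/14/frame_00003_rgb.jpg'),
]

# Define the predicate once...
is_valid = lambda o: o.parent.name == '13'

# ...and actually call it on each item inside the comprehension
val = [(i, o) for i, o in enumerate(items) if is_valid(o)]

print([i for i, _ in val])  # [1, 2]
```

Now only the items whose parent directory is '13' end up in val.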
