How to use BCEWithLogitsLossFlat in lesson1-pets.ipynb

Is it possible, and does it make sense, to use BCEWithLogitsLossFlat in the lesson1-pets notebook to build a model that makes “reasonable” predictions on images that don’t contain any of the training classes? That is, a model which, when presented with a new breed of dog or cat, generates low “probabilities” for all classes, in the way Jeremy described in Lesson 9.
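For context, here is a minimal sketch in plain Python (no fastai or PyTorch) of why a per-class sigmoid, which is what a BCE-with-logits loss trains against, can report low scores for every class, while a softmax cannot. The logit values are made up for illustration and do not come from any trained model.

```python
import math

def softmax(logits):
    "Normalise logits so the outputs sum to 1 (single-label setting)."
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    "Score each class independently (multi-label setting)."
    return [1 / (1 + math.exp(-x)) for x in logits]

# Hypothetical logits for an image that matches none of three classes
logits = [-3.0, -2.5, -4.0]

soft = softmax(logits)  # forced to sum to 1, so one class still "wins"
sig = sigmoid(logits)   # each score stands alone, so all can stay low
```

With softmax, the scores must add up to 1, so an unknown breed still gets assigned to something. With independent sigmoids, every class can stay below a chosen threshold.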

I have tried using MultiCategoryBlock:

path = untar_data(URLs.PETS)
path_img = path/'images'
fnames = get_image_files(path_img)
pat = r'\\([^\\]+)_\d+\.jpg$'
label_func = RegexLabeller(pat)
seed = 2
item_tfms = RandomResizedCrop(460, min_scale=0.75)
b = 64
batch_tfms=[*aug_transforms(size=224, max_warp=0), Normalize(*imagenet_stats)]
dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock()),
                   splitter=RandomSplitter(valid_pct, seed=seed),
dbunch = DataBunch.from_dblock(dblock, fnames, path=path, item_tfms=item_tfms,b=b,num_workers=num_workers,

but the resulting labels are generated from the first letter of each class.

Do I have to use a dataframe?


MultiCategoryBlock expects a list of labels, so you have to change your label_func slightly to return something like ["cat"] (and not "cat").
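To illustrate why a bare string goes wrong, here is a small sketch in plain Python. `build_vocab` is a hypothetical stand-in, not fastai's actual code; it only assumes that a multi-label vocab is collected by iterating over each item's labels.

```python
def build_vocab(labels_per_item):
    "Collect the sorted set of labels by iterating over each item's labels."
    vocab = set()
    for labels in labels_per_item:
        vocab.update(labels)  # iterating over a str yields single characters
    return sorted(vocab)

bare = ["beagle", "pug"]             # label_func returns a plain string
wrapped = [[name] for name in bare]  # label_func returns a one-element list

chars_vocab = build_vocab(bare)      # single letters, not breed names
list_vocab = build_vocab(wrapped)    # the breed names themselves
```

Since a Python string is itself iterable, the bare-string version ends up with letters as classes, whereas the wrapped version keeps each breed name intact.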


Thank you, what an easy fix, working nicely with both RegexLabeller() and parent_label().

Could someone help me adapt parent_label() for this task?
I really don’t know how to get ['cat'] from the parent_label function.

One way would be to copy the definition of parent_label(), rename it (for example, to parent_labeler), and apply the change suggested by sgugger: return the label wrapped in a list instead of as a bare string.

 def parent_labeler(o, **kwargs):
     "Label `item` with the parent folder name, wrapped in a list."
     return [Path(o).parent.name]

and pass that to DataBlock

dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock()),
                   splitter=RandomSplitter(valid_pct, seed=seed),

A better way would be to do what muellerzr does in this notebook, using a Pipeline (repeated below).

def multi_l(l): return [l]

pets_multi = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                 get_y=Pipeline([RegexLabeller(pat=r'/([^/]+)_\d+\.jpg$'), multi_l]),
                 item_tfms=RandomResizedCrop(460, min_scale=0.75),
                 batch_tfms=[*aug_transforms(size=224, max_warp=0), Normalize.from_stats(*imagenet_stats)])
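The effect of chaining the two steps can be sketched in plain Python, without fastai. Here `regex_label` is a hypothetical stand-in for RegexLabeller, `get_y` composes the functions by hand the way the Pipeline does, and the file path is made up for illustration.

```python
import re

# Same pattern as above, with the dot before "jpg" escaped
pat = re.compile(r'/([^/]+)_\d+\.jpg$')

def regex_label(path):
    "Extract the breed name from a file path (stand-in for RegexLabeller)."
    match = pat.search(str(path))
    if match is None:
        raise ValueError(f"Could not extract a label from {path!r}")
    return match.group(1)

def multi_l(l):
    "Wrap a single label in a list, as a multi-label target expects."
    return [l]

def get_y(path):
    "Apply the steps in order: extract the label, then wrap it."
    return multi_l(regex_label(path))

# Hypothetical path, for illustration only
label = get_y("/data/oxford-iiit-pet/images/great_pyrenees_173.jpg")
```

Here `label` is `['great_pyrenees']`, so each image carries a one-element list of labels rather than a bare string.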

Secret’s out, you’re the reason I realized I could do this in the first place :wink: (I have this post bookmarked)
