Cannot read Pascal 2007 into a Fastai2 dataloader

I am trying to port a Pascal 2007-based object detection application to fastai2, but it fails when reading the Pascal data set into the dataloader.

I have created a dictionary of images and targets, a DataBlock, and a dataloader:

img_y_dict = dict(zip(tot_img_names, tot_truths))

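data = DataBlock(blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(),
                 # o.name is assumed as the key: the KeyError below reports a bare file name
                 getters=[noop, lambda o: img_y_dict[o.name][0], lambda o: img_y_dict[o.name][1]],
                 item_tfms=Resize(sz))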

dlrs = data.dataloaders(path)

It fails in the getters:

in <lambda>(o)
2 get_items=get_image_files,
3 splitter=RandomSplitter(),
----> 4 getters=[noop, lambda o: img_y_dict[o.name][0], lambda o: img_y_dict[o.name][1]],
5 #getters=[truth_data_func],
6 item_tfms=Resize(sz),

KeyError: '008673.jpg'

It fails with at least these images: "008673.jpg", "005939.jpg" and "004236.jpg" in test.json, and "008359.jpg" in valid.json. I know that "008359.jpg" has a header but is missing the "segmentation" entry. The other image/target tuples may have errors as well.

The error says KeyError: '008673.jpg', so I inspected the img_y_dict dictionary:

for k in img_y_dict.keys():
    print(k)

None of the failing images appear as keys in the dictionary. They have been filtered out (perhaps by get_annotations?).
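One way to confirm this, and to work around it, is to split the image paths by whether their file name is a key in the dictionary. The sketch below is only an illustration under stated assumptions: the dictionary is keyed by bare file name, and the paths, boxes and the helper name split_by_annotation are hypothetical stand-ins, not part of fastai.

```python
from pathlib import Path


def split_by_annotation(files, annotations):
    """Partition image paths by whether their file name is a key in `annotations`."""
    annotated = [f for f in files if f.name in annotations]
    missing = [f for f in files if f.name not in annotations]
    return annotated, missing


# Hypothetical stand-ins for the folder listing and for img_y_dict.
files = [Path("train/000012.jpg"), Path("train/008673.jpg"), Path("train/000017.jpg")]
img_y_dict = {
    "000012.jpg": ([[155, 96, 351, 270]], ["car"]),
    "000017.jpg": ([[184, 61, 279, 199]], ["person"]),
}

annotated, missing = split_by_annotation(files, img_y_dict)
print("no annotation for:", [f.name for f in missing])
```

If only the annotated list is handed to the DataBlock (for example from a custom get_items function), the getters never see a file name that is absent from the dictionary.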


This may help you. It's slightly outdated (DataBunch is now DataLoaders, etc.).

I'm going through this now and seeing this too, let me investigate @joseadolfo. The issue is that we are getting images from the folder rather than from our JSON document. Here is how I went about adjusting it:

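getters = [lambda o: path/'train'/o, lambda o: img2bbox[o][0], lambda o: img2bbox[o][1]]
def get_train_imgs(noop):
  # Ignore the source argument; return only the image names listed in the annotations
  return imgs
pascal = DataBlock(blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
                   get_items=get_train_imgs,
                   splitter=RandomSplitter(),
                   getters=getters,
                   n_inp=1)
dls = pascal.dataloaders(path/'train')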

I'm working on an actual notebook now and I'll upload it once it's done.

Thank you for the helpful information. I believe there is a bug particular to reading the Pascal dataset. When I switch to reading COCO, the DataBlock and dataloaders work perfectly.

It's not a bug; it has to do with how the folder structure is set up. If you look at the length of get_image_files's output for the train folder, you can see it's much larger than the number of train images listed in the annotations.
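A quick way to see that mismatch is to count the files on disk and the files listed in the annotation JSON. This is a minimal sketch that assumes a COCO-style JSON with an "images" list whose entries carry a "file_name" field; the helper name compare_folder_to_json and the demo files are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def compare_folder_to_json(img_dir, json_path):
    """Return (#files on disk, #files listed in the JSON, on-disk names absent from the JSON)."""
    on_disk = {p.name for p in Path(img_dir).iterdir() if p.suffix.lower() == ".jpg"}
    with open(json_path) as f:
        listed = {img["file_name"] for img in json.load(f)["images"]}
    return len(on_disk), len(listed), sorted(on_disk - listed)


# Tiny throwaway demo: three images on disk, only two of them annotated.
with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)
    img_dir = root / "train"
    img_dir.mkdir()
    for name in ["000012.jpg", "000017.jpg", "008673.jpg"]:
        (img_dir / name).touch()
    json_path = root / "train.json"
    json_path.write_text(json.dumps(
        {"images": [{"file_name": "000012.jpg"}, {"file_name": "000017.jpg"}]}
    ))
    n_disk, n_json, extra = compare_folder_to_json(img_dir, json_path)

print(n_disk, n_json, extra)
```

Any name that shows up in the third value is a file a folder-based get_items would yield but the annotation dictionary cannot look up.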

@joseadolfo if you need it, I have the start of a RetinaNet notebook (it's still training, but I wanted you to have something end to end).

Thanks, very kind of you. I wrote an end-to-end object detection notebook in Fastai1 that applies Google's AutoAugment data augmentation policy, and the results were very impressive. I am now porting the notebook to Fastai2 to benchmark performance. Aside from my problems reading Pascal, I am struggling to access the image/target tuple at the mini-batch level, something similar to the dl_tfms parameter of the Fastai1 databunch. If you are interested, the Fastai1 notebook is at: (


Absolutely! I don't have much time to work on it, but I can try to help. Whereabouts in the code is this (where you're trying to grab the mini-batch)?

I believe you're talking about applying transforms at the batch level. That is the batch_tfms parameter, or after_batch if you're adjusting it after the fact, @joseadolfo. Let me know if you need help from there (but I hope that gives a hint on where to start).


Thanks very much.

Hi Zach. Thank you for your great notebook. After training, when I call learn.show_results() it raises this error: TypeError: object of type 'int' has no len()

Yes, show_results and predict will not work. I'm not very familiar with object detection, though I am aware there is a thread on it for fastai v2 where they handled some of these issues.

Thank you!

Thank you for your great tutorials. Here’s the updated link for the object detection tutorial:
