Getting Images and Captions into a DataBunch

Hi all,
I’m trying to make an imaging caption model and am using the work here as a guide. However, I don’t think that the author used fastai as much as possible and it’s mostly using raw pytorch. The method used for getting it into a pytorch dataset is to create a custom dataset object (which returns a (image,caption) tuple when indexed) which is passed into a dataloader with a custom collate-pad function.
I’m pretty sure this could be put into a DataBunch using some kind of custom ItemList, like in the docs, but I have no idea how to get started for a case where the input is an image and the output is text (my rough guess at the shape of it is below). How should I go about this?
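
The closest thing I can picture, following the custom ItemList tutorial, is something like this (completely untested; `ImageCaptionList` and the DataFrame columns are just my guesses):

```python
from fastai.vision import ImageList
from fastai.text import TextList

class ImageCaptionList(ImageList):
    # images as the x's, and tell the data block API to label with text
    _label_cls = TextList

data = (ImageCaptionList.from_df(df, path, cols='image_path')
        .split_by_rand_pct(0.2)
        .label_from_df(cols='caption')
        .databunch(bs=32, collate_fn=pad_collate))  # pad_collate as above
```

I think `.databunch()` accepts a `collate_fn`, so the padding part might carry over, but I don’t know whether labeling with a TextList handles tokenization/numericalization the way I’d need, or whether a custom label class and processor are required.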
