Add suffix to DataBlock


For lesson 1 practice, I am trying to translate a higher-level API call into a DataBlock.

The higher-level API call is written like this:
dls = ImageDataLoaders.from_df(df, path=image_directory, valid_pct=0.2, seed=None, label_col='senior', folder=None, suff='.jpg', bs=64)

In the DataBlock, I couldn't find a place for the suffix (.jpg).

It currently looks like this:

block = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                get_x=ColReader('img_name', pref=str(image_directory) + os.path.sep),
                get_y=ColReader('senior', label_delim=' '))

dls = block.dataloaders(df)

Both approaches use a pandas DataFrame with a column of image names that lack the file suffix. Any hints on where to specify the suffix?

ColReader receives a suff argument. Take a look at the ImageDataLoaders.from_df source code:

    def from_df(cls, df, path='.', valid_pct=0.2, seed=None, fn_col=0, folder=None, suff='', label_col=1, label_delim=None,
                y_block=None, valid_col=None, item_tfms=None, batch_tfms=None, **kwargs):
        "Create from `df` using `fn_col` and `label_col`"
        pref = f'{Path(path) if folder is None else Path(path)/folder}{os.path.sep}'
        if y_block is None:
            is_multi = (is_listy(label_col) and len(label_col) > 1) or label_delim is not None
            y_block = MultiCategoryBlock if is_multi else CategoryBlock
        splitter = RandomSplitter(valid_pct, seed=seed) if valid_col is None else ColSplitter(valid_col)
        dblock = DataBlock(blocks=(ImageBlock, y_block),
                           get_x=ColReader(fn_col, pref=pref, suff=suff),
                           get_y=ColReader(label_col, label_delim=label_delim),
                           splitter=splitter,
                           item_tfms=item_tfms,
                           batch_tfms=batch_tfms)
        return cls.from_dblock(dblock, df, path=path, **kwargs)
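
In other words, the suffix belongs on the ColReader used for get_x, alongside pref. To make the effect concrete, here is a minimal plain-Python sketch of how the prefix and suffix end up wrapped around the column value. The read_col helper and the sample row are hypothetical stand-ins for illustration, not fastai's actual implementation, and the 'images/' prefix is an assumed directory:

```python
def read_col(row, col, pref="", suff=""):
    """Simplified stand-in for ColReader: wrap the value in `row[col]` with `pref` and `suff`."""
    return f"{pref}{row[col]}{suff}"

# Hypothetical row: the column names come from the question, the values are made up.
row = {"img_name": "selfie_001", "senior": "yes"}

# The image name in the dataframe has no extension, so the suffix is appended here.
path = read_col(row, "img_name", pref="images/", suff=".jpg")
print(path)  # images/selfie_001.jpg
```

In the DataBlock from the question, the equivalent change would be to pass suff='.jpg' to the ColReader given as get_x, as from_df does above.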


Thanks, I had already found this function, but I had put the argument in the wrong place.