Changing bit depth (bits per pixel) for images / masks

From your explanation here…

…it sounds like I could also do the mask processing in get_y in the DataBlock, since what get_y returns can be an np.array.

Case in point: right now I build a pandas DataFrame with columns containing:

  • paths to image files
  • lists of paths to mask files
  • dataset name (for stratification if I train on multiple datasets at once)
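
For illustration, a minimal sketch of that layout. The column names match the ones used in the DataBlock code in this post; the file paths and dataset names are made up:

```python
import pandas as pd

# Hypothetical rows: one image per row, one or more mask files per image.
src_df = pd.DataFrame(
    {
        "image": ["data/a/img_001.png", "data/b/img_002.png"],
        "mask": [
            ["data/a/mask_001.png"],
            ["data/b/mask_002_0.png", "data/b/mask_002_1.png"],
        ],
        "dataset": ["a", "b"],  # used only for stratified splitting
    }
)
```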

My DataBlock right now is:

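import random

from fastai.vision.all import *

# src_df and input_image_size are defined elsewhere.
mask_reader = ColReader("mask")


def make_mask(row):
    # The "mask" column holds a list of mask paths; pick one at random for now.
    # TODO: merge masks instead of random.choice()
    return random.choice(mask_reader(row))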


src_datablock = DataBlock(
    blocks=(ImageBlock, MaskBlock),
    getters=[ColReader("image"), make_mask],
    splitter=TrainTestSplitter(stratify=src_df["dataset"].to_list(), random_state=42),
    item_tfms=Resize(size=input_image_size, method="squish"),
    batch_tfms=aug_transforms(),
)

If I understand your explanations in both threads correctly, then I could have a custom function in get_y which:

  • receives a list of paths to mask files (usually just one item, occasionally several)
  • opens each file and converts it to an np.array
  • converts the array to np.uint8 if its dtype is anything else (the problem in this thread)
  • merges the masks if there are multiple files (the problem in the other thread)
  • returns the result to the DataBlock
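
A rough sketch of what I have in mind, not tested inside a DataBlock yet. The function names are my own, and it rests on assumptions that may not hold for every dataset: masks are single-channel and binary (any non-zero pixel is foreground), all masks for one image have the same size, and "merge" means a pixel-wise union:

```python
from pathlib import Path

import numpy as np
from PIL import Image


def to_uint8_mask(arr):
    """Binarize an array of any dtype into np.uint8 (0 = background, 1 = foreground).

    Assumption: the mask is binary, so every non-zero value means foreground.
    """
    return (np.asarray(arr) > 0).astype(np.uint8)


def merge_masks(masks):
    """Pixel-wise union of same-sized binary masks."""
    return np.maximum.reduce(list(masks))


def load_mask(path):
    """Open one mask file and return it as a uint8 array."""
    with Image.open(path) as im:
        return to_uint8_mask(np.asarray(im))


def load_merged_mask(paths):
    """Accept one path or a list of paths and return a single uint8 mask."""
    if isinstance(paths, (str, Path)):
        paths = [paths]
    return merge_masks([load_mask(p) for p in paths])
```

In the DataBlock this would take the place of make_mask, for example as a getter that calls load_merged_mask on the "mask" column of each row. Multi-class masks would need a real value-to-class mapping instead of the > 0 test.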

I will try something along these lines.