Do fastai data loaders require images and labels to be files on disk?

rbavery1 · March 24, 2022, 12:50am

Looking for an answer to this question if anyone has a moment: Image Segmentation on COCO dataset - summary, questions and suggestions - #5 by austinmw

“Any chance you’ve figured out how to use the json data directly instead of needing to create the intermediary mask images?”

I have mask annotations in COCO format, compressed with run-length encoding (RLE), i.e. stored as sparse arrays. They don’t need to be saved as PNG files because the sparse representation fits in JSON and takes up less space than many PNG files. I haven’t found examples or documentation indicating that fastai data loaders accept a get_y function that returns array data directly. Is there a way to create a fastai data loader whose get_y function returns the array data rather than a filename? Any help or pointers are much appreciated!

Some more detail:
I’ve mainly looked over this tutorial, and its example suggests that MaskBlock only expects filenames from the get_y function: Data block tutorial | fastai
I’d like to convert the instance segmentation annotations in my COCO dataset to semantic segmentation format in memory without intermediate saving of label files.
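
For readers following along, here is a minimal sketch of that in-memory conversion, assuming the annotations use COCO's uncompressed RLE (a dict with `size` as `[height, width]` and `counts` as a list of run lengths that alternate between 0s and 1s, starting with 0s, in column-major order). The helper names `decode_uncompressed_rle` and `semantic_mask`, and the `cat_to_code` mapping, are hypothetical and not part of fastai or pycocotools. If `counts` is a compressed string instead, `pycocotools.mask.decode` is likely the right tool.

```python
import numpy as np

def decode_uncompressed_rle(rle):
    """Decode an uncompressed COCO RLE dict into an (h, w) uint8 binary mask."""
    h, w = rle["size"]
    counts = np.asarray(rle["counts"], dtype=np.int64)
    # Runs alternate 0s, 1s, 0s, ... and always start with a run of 0s.
    values = np.arange(len(counts)) % 2
    flat = np.repeat(values, counts).astype(np.uint8)
    # COCO flattens masks in column-major (Fortran) order.
    return flat.reshape((h, w), order="F")

def semantic_mask(annotations, height, width, cat_to_code):
    """Merge instance annotations for one image into a single class-index mask.

    `cat_to_code` (hypothetical) maps COCO category ids to mask codes; 0 is background.
    Where instances overlap, later annotations overwrite earlier ones.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for ann in annotations:
        binary = decode_uncompressed_rle(ann["segmentation"])
        mask[binary == 1] = cat_to_code[ann["category_id"]]
    return mask

# Tiny illustrative example: one 2x3 instance of category 7, mapped to code 1.
anns = [{"category_id": 7, "segmentation": {"size": [2, 3], "counts": [1, 2, 3]}}]
demo = semantic_mask(anns, 2, 3, {7: 1})
```

Letting later annotations win on overlap is only one possible policy; sorting by area or by a class priority first may suit some datasets better.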

matdmiller · March 24, 2022, 5:21am

Have you tried writing a custom get_y function that returns a NumPy ndarray decoded from your RLE JSON files? The MaskBlock should take the result of the get_y function and pass it to the PILMask.create function, which accepts a number of types, including a NumPy ndarray.
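
A rough sketch of what that could look like, assuming the masks have already been decoded into 2-D arrays and are keyed by image filename. The names `make_get_y`, `masks_by_filename`, and `build_datablock` are hypothetical. The `DataBlock` wiring uses the usual fastai data block arguments but is untested against any particular fastai version; fastai is imported lazily so the lookup helper can be tried on its own.

```python
from pathlib import Path

import numpy as np

def make_get_y(masks_by_filename):
    """Build a get_y that returns an in-memory mask instead of a file path.

    `masks_by_filename` (hypothetical) maps an image filename to a 2-D array
    of class indices.
    """
    def get_y(image_path):
        mask = masks_by_filename[Path(image_path).name]
        # PILMask.create hands ndarrays to PIL's Image.fromarray,
        # so a 2-D uint8 array is the safest thing to return.
        return np.asarray(mask, dtype=np.uint8)
    return get_y

def build_datablock(get_y, codes):
    """Wire the custom get_y into a segmentation DataBlock (requires fastai)."""
    from fastai.vision.all import (
        DataBlock, ImageBlock, MaskBlock, RandomSplitter, get_image_files,
    )
    return DataBlock(
        blocks=(ImageBlock, MaskBlock(codes)),
        get_items=get_image_files,   # items are still image paths on disk
        get_y=get_y,                 # labels come from memory
        splitter=RandomSplitter(seed=42),
    )
```

With something like this, only the images need to exist on disk: `build_datablock(make_get_y(masks), codes).dataloaders(path_to_images)` would look up each label by filename at load time.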

It’s been a while since I worked with RLE, but from what I remember, RLE decoding was slower than opening a PNG file.

Hopefully this helps!

github.com

fastai/fastai/blob/62e608e9350858e8a10292b997482ee5d9c072fa/fastai/vision/data.py#L69

      
        
    for i in range(2):
        ctxs[i::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs[i::2],range(max_n))]
    return ctxs


# Cell
def ImageBlock(cls=PILImage):
    "A `TransformBlock` for images of `cls`"
    return TransformBlock(type_tfms=cls.create, batch_tfms=IntToFloatTensor)


# Cell
def MaskBlock(codes=None):
    "A `TransformBlock` for segmentation masks, potentially with `codes`"
    return TransformBlock(type_tfms=PILMask.create, item_tfms=AddMaskCodes(codes=codes), batch_tfms=IntToFloatTensor)


# Cell
PointBlock = TransformBlock(type_tfms=TensorPoint.create, item_tfms=PointScaler)
BBoxBlock = TransformBlock(type_tfms=TensorBBox.create, item_tfms=PointScaler, dls_kwargs = {'before_batch': bb_pad})

PointBlock.__doc__ = "A `TransformBlock` for points in an image"
BBoxBlock.__doc__  = "A `TransformBlock` for bounding boxes in an image"

github.com

fastai/fastai/blob/ab154927696338741e59e0ffc4774777c4a9781c/fastai/vision/core.py#L98

      
        
    return im.convert(mode) if mode else im


# Cell
def image2tensor(img):
    "Transform image to byte tensor in `c*h*w` dim order."
    res = tensor(img)
    if res.dim()==2: res = res.unsqueeze(-1)
    return res.permute(2,0,1)


# Cell
class PILBase(Image.Image, metaclass=BypassNewMeta):
    _bypass_type=Image.Image
    _show_args = {'cmap':'viridis'}
    _open_args = {'mode': 'RGB'}
    @classmethod
    def create(cls, fn:(Path,str,Tensor,ndarray,bytes), **kwargs)->None:
        "Open an `Image` from path `fn`"
        if isinstance(fn,TensorImage): fn = fn.permute(1,2,0).type(torch.uint8)
        if isinstance(fn, TensorMask): fn = fn.type(torch.uint8)
        if isinstance(fn,Tensor): fn = fn.numpy()
        if isinstance(fn,ndarray): return cls(Image.fromarray(fn))

rbavery1 · March 24, 2022, 10:03pm

Ah, thanks for this! I didn’t think to check the PILMask.create function; I just assumed it only took paths. I’ll give this a go. Thanks a bunch for the suggestion.