How to load images as grayscale

Hi, the get_transforms method in vision returns a collection of train transforms and valid transforms. How can I add a transform like converting images to grayscale or to RGB for a dataset that has mixed types of images?

Hi !

You can use the contrast transform with a contrast of 0 to turn images to grayscale. I’m not really sure if this answers your question though, could you expand a bit on what you’re looking for?


I have the omniglot dataset, and when I load the data with ImageDataBunch.from_folder the images are loaded with 3 channels by default, but I want them to have a single channel. How do I do this? Also, as in PyTorch, is there a way to write custom transforms like:

def custom_transform(img):
    # some transform on img
    return img

transform = transforms.Compose([custom_transform, transforms.ToTensor()])
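Under the hood, a `Compose` like the one above is just a chain of callables, each receiving the previous one's output. A minimal pure-Python sketch of the idea (the `invert` and `threshold` transforms and the nested-list "image" are hypothetical stand-ins, not torchvision code):

```python
def compose(*transforms):
    """Chain callables: each transform receives the previous one's output."""
    def pipeline(img):
        for t in transforms:
            img = t(img)
        return img
    return pipeline

# Hypothetical transforms operating on a nested-list "image" of 0-255 pixels
def invert(img):
    return [[255 - px for px in row] for row in img]

def threshold(img, cutoff=128):
    return [[255 if px >= cutoff else 0 for px in row] for row in img]

transform = compose(invert, threshold)
print(transform([[0, 200], [120, 255]]))  # → [[255, 0], [255, 0]]
```

Any plain function with the right signature can slot into such a pipeline, which is why custom transforms in PyTorch are usually just ordinary callables.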

The default behavior is due to the open_image function using ‘RGB’ as the default convert mode. You can pass a different convert mode when creating your dataset.
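For reference, these convert modes follow PIL semantics: per Pillow's documentation, converting to `'L'` applies the ITU-R 601-2 luma transform, L = R × 299/1000 + G × 587/1000 + B × 114/1000. A pure-Python sketch of that per-pixel formula:

```python
def rgb_to_gray(r, g, b):
    """ITU-R 601-2 luma transform, as used by PIL's 'L' convert mode."""
    return (r * 299 + g * 587 + b * 114) // 1000

# A "grayscale RGB" pixel (all three channels equal) keeps its value:
print(rgb_to_gray(77, 77, 77))   # → 77
# A pure-red pixel maps to its luma weight:
print(rgb_to_gray(255, 0, 0))    # → 76
```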


Hi, I am able to pass the convert_mode argument directly to ImageItemList but not to ImageDataBunch, but then the output image size is (64, 1, 105, 105). How do I make it size 56, i.e. (64, 1, 56, 56)?

data = ImageItemList.from_folder(path, convert_mode='L')

Also, the fastai library has enough transformations to use directly, but in some cases where I want to write a custom transformation, how can I do that?

I tried adding transformations and then noticed that the image size changed.
In short:

data = ImageDataBunch.from_folder(path, size=56)

gives input shape as (64, 3, 105, 105)

data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), size=56)

gives input shape as (64, 3, 56, 56)

Hi guys,

In order to have grayscale images, in addition to ImageItemList.from_folder(path, convert_mode='L'), you also have to change defaults.cmap from 'viridis', which is the default, to defaults.cmap = 'binary'.

Try this, and it will change the number of channels from 3 to 1.


That’s because that is what the size keyword argument does. When you pass size as an argument when creating an ImageDataBunch, it will resize all of your images to a square with side length size.

I think he was asking why if he passes just size nothing happens, while if he passes the transforms along with size, the images get properly resized. I’m curious too…

Oh, I see. Hmm…

I think that size is a parameter for the transformer: note that if you use the data block API you get to pass it directly as you call the transformer. Probably, when you call ImageDataBunch with size, it passes size on to the (internally called) transformer.

Here’s what I found:

class ImageDataBunch(DataBunch):
    "DataBunch suitable for computer vision."
    _square_show = True

    @classmethod
    def create_from_ll(cls, lls:LabelLists, bs:int=64, val_bs:int=None, ds_tfms:Optional[TfmList]=None,
            num_workers:int=defaults.cpus, dl_tfms:Optional[Collection[Callable]]=None, device:torch.device=None,
            test:Optional[PathOrStr]=None, collate_fn:Callable=data_collate, size:int=None, no_check:bool=False,
            resize_method:ResizeMethod=None, mult:int=None, padding_mode:str='reflection',
            mode:str='bilinear')->'ImageDataBunch':
        "Create an `ImageDataBunch` from `LabelLists` `lls` with potential `ds_tfms`."
        ds_tfms, resize_method = _prep_tfm_kwargs(ds_tfms, size, resize_method=resize_method)
        lls = lls.transform(tfms=ds_tfms, size=size, resize_method=resize_method, mult=mult, padding_mode=padding_mode, mode=mode)
        if test is not None: lls.add_test_folder(test)
        return lls.databunch(bs=bs, val_bs=val_bs, dl_tfms=dl_tfms, num_workers=num_workers, collate_fn=collate_fn,
                         device=device, no_check=no_check)

Any of the from_ methods called on ImageDataBunch will call this class method, create_from_ll. Even though size=56 is being passed to ImageDataBunch, it’s actually internally being passed on to ds_tfms. The default for ds_tfms is actually None, so if no transforms are being passed into the ImageDataBunch, no transforms will happen, and thus the size parameter will actually go nowhere – it won’t be applied, as far as I can tell.

I heard Jeremy say in one of the lessons that kwargs in the fastai library are meant to be passed on. This would be an example, I think.
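That pass-through behavior can be sketched in plain Python (these are toy stand-ins, not fastai's actual code): a wrapper forwards its kwargs to an inner step, and when the consumer of a kwarg is None, the kwarg is silently ignored.

```python
def apply_tfms(tfms, size=None):
    """Toy stand-in for the internal transform step: size only matters if tfms exist."""
    if tfms is None:
        return None            # no transforms -> size is silently ignored
    return [(t, size) for t in tfms]

def databunch(ds_tfms=None, **kwargs):
    """Toy stand-in for an ImageDataBunch factory: kwargs are passed straight through."""
    return apply_tfms(ds_tfms, **kwargs)

print(databunch(size=56))                    # → None (size went nowhere)
print(databunch(ds_tfms=["crop"], size=56))  # → [('crop', 56)]
```

This matches the observed behavior above: size=56 alone does nothing, while ds_tfms=get_transforms() plus size=56 produces resized batches.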

As I imagined. I think this should be fixed. I tried passing None as the transforms (with just the size), but the actual size remained unchanged.

I was having trouble getting my ImageDataBunch to return single-channel grayscale images as well. What I ended up doing is creating the ImageDataBunch and then setting the convert_mode on the underlying ImageItemList of the train, valid, fix, and test DataLoaders afterwards (if present):

data = ImageDataBunch.from_df(path, df, valid_pct=0.2, bs=81)
for dl_name in ["train_dl", "valid_dl", "fix_dl", "test_dl"]:
    dl = getattr(data, dl_name)
    if dl: dl.x.convert_mode = "L"

And that gives the single-channel tensor I was looking for.


Edit: Realized setting x directly on the DataBunch only affected the train ImageItemList, so I changed the code above to enumerate through them instead.


@yeldarb Thanks for your answer. I needed something like this and it worked like a charm.

Hello all,

Had a follow-up question. If we want to load data as grayscale (using type ‘L’) so we have a single channel, how do we use a pre-trained model that was trained on RGB images (3 channels)? I can start a new thread as well.

Note: Grayscale RGB images are such that each channel is identical, so we aren’t giving the model more information.
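One common workaround (a general technique, not a fastai API): because convolution is linear in its input, when all three input channels hold the same values, w_r·x + w_g·x + w_b·x = (w_r + w_g + w_b)·x, so you can collapse a pretrained 3-channel first layer into a single-channel one by summing its weights over the input-channel axis. A toy sketch with made-up integer weights for one spatial tap:

```python
def conv3(weights, pixel3):
    """Dot product of per-channel weights with one 3-channel pixel value."""
    return sum(w * x for w, x in zip(weights, pixel3))

w_rgb = [2, 5, 3]   # hypothetical pretrained first-layer weights (one spatial tap)
x = 10              # a grayscale intensity, replicated across 3 channels

print(conv3(w_rgb, [x, x, x]))   # → 100
print(sum(w_rgb) * x)            # → 100: identical output from a 1-channel weight
```

The alternative, replicating the grayscale channel three times to feed the pretrained model unchanged, gives the same result at the cost of redundant computation, which is exactly the "no extra information" point in the note above.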

Unfortunately, this is not working for me. Whatever I do, one_batch() always returns a tensor with 3 channel dimensions.

Where do I have to do it? Please explain.

When you do:

from fastai.vision import *

Among others, it will import a Python object called defaults.

Just change it as follows (at least as far as I remember, since it’s been a long time since I used it):

defaults.cmap = 'binary'
defaults.cmap  # to check it

Let me know if it works or not!

@NimaC I had the same problem, and it turns out it’s the .normalize(imagenet_stats) call which converts it back to 3 channels. Remove that and @yeldarb’s wonderful hack should work.

This side-effect is discussed here: .normalize(imagenet_stats) will silently add channels
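The silent expansion can be sketched with a toy stand-in for channelwise normalization (not fastai's actual implementation): given three per-channel stats, a single input channel gets broadcast across all three, so three channels come out even though one went in.

```python
def normalize(img_channels, means, stds):
    """Toy channelwise normalize: one output channel per stat. A single input
    channel is broadcast across all three stats -- the silent channel expansion."""
    if len(img_channels) == 1 and len(means) == 3:
        img_channels = img_channels * 3   # broadcast: 1 channel -> 3
    return [[(px - m) / s for px in ch]
            for ch, m, s in zip(img_channels, means, stds)]

# A 1-channel "image" normalized with hypothetical 3-channel stats:
out = normalize([[10, 20]], means=[10, 10, 10], stds=[1, 2, 5])
print(len(out))  # → 3 channels out, even though only 1 went in
```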