ImageDataBunch.normalize does not work

I created image_data with ImageDataBunch.from_folder(…).normalize(), but when I inspected image_data.train_ds[0][0].data, the values did not appear to be normalized. Specifically:
image_data_0 = ImageDataBunch.from_folder(…)
image_data_1 = ImageDataBunch.from_folder(…).normalize(imagenet_stats)
image_data_2 = ImageDataBunch.from_folder(…).normalize(([0.5,0.5,0.5],[0.5,0.5,0.5]))

All three produce the same data: the train_ds data of image_data_0, image_data_1 and image_data_2 are identical. Why? Does the normalize() function not work? fastai.version is 1.0.51.

Normalization is applied at the batch level (it’s quicker to do it on all the images at once), so it’s expected that your dataset stays the same.
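To make the point above concrete, here is a minimal framework-free sketch of batch-level normalization. It is not fastai's actual code: the helper names normalize_batch and make_batches are hypothetical, and images are plain nested lists (image, channel, pixel) instead of tensors. It only illustrates that the stored items are left alone while each batch is normalized as it is produced.

```python
# Illustrative sketch only; normalize_batch and make_batches are hypothetical
# helpers, not part of the fastai API.

def normalize_batch(batch, mean, std):
    """Return a new batch where every value is mapped to (v - mean) / std per channel."""
    return [
        [[(v - m) / s for v in channel] for channel, m, s in zip(image, mean, std)]
        for image in batch
    ]

def make_batches(dataset, bs, mean, std):
    """Yield normalized batches; the dataset itself is never modified."""
    for i in range(0, len(dataset), bs):
        yield normalize_batch(dataset[i:i + bs], mean, std)

# Two tiny "images", each with 3 channels of 2 pixels, values in [0, 1].
dataset = [
    [[0.0, 1.0], [0.5, 0.5], [1.0, 1.0]],
    [[1.0, 0.0], [0.25, 0.75], [0.0, 0.0]],
]
mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]

first_batch = next(make_batches(dataset, bs=2, mean=mean, std=std))

print(dataset[0][0])      # stored item, still in the original range
print(first_batch[0][0])  # same channel as seen inside the batch
```

Indexing the dataset directly therefore shows the raw values regardless of which stats were passed to normalize(); only what comes out of the batch iterator differs.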

Thanks so much.

Hi, apologies for digging this up.
I am having a similar problem, even though I do use .one_batch() to pull out the tensor.

Basically, my databunch looks like this:

data = (ObjectItemList.from_folder(PATH, include=['train'])
        .split_by_valid_func(get_valid)
        .label_from_func(labelling_func)
        .transform(get_transforms(do_flip=False, max_rotate=None), size=224)
        .databunch(bs=64, collate_fn=bb_pad_collate)
        .normalize()
       )

I pull a batch and check the per-channel mean over the samples like this:

x,y = data.one_batch()
x.mean((0,2,3))

This does not yield values close to (0,0,0). Moreover, if I remove the .normalize() from the databunch building block, I get the same result.
What am I missing?

---------------------------[SOLVED] -------------------------

So several things:

  • .one_batch() has the keyword arg denorm set to True by default. Setting it to False finally yields the expected result. It seems a bit misleading (at least in my case), but I guess it serves some purpose.

  • Initially I thought that the learner used the .one_batch() from the databunch. Actually, it directly calls the train_dl iterator of the databunch, which also yields the expected result. You can reproduce that like this:

    x,y = next(iter(data.train_dl)); x.mean((0,2,3))
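The denorm behaviour described above can be sketched without fastai. This is a minimal illustration under stated assumptions, not the library's implementation: the one_batch function here is a hypothetical stand-in that only mimics the denorm flag, the stats are fixed at 0.5 for every channel, and batches are plain nested lists (image, channel, pixel).

```python
# Illustrative sketch only; these helpers are hypothetical and merely mimic
# the effect of a denorm flag, they are not fastai's code.

def normalize(batch, mean, std):
    """Map every value to (v - mean) / std, channel by channel."""
    return [
        [[(v - m) / s for v in ch] for ch, m, s in zip(img, mean, std)]
        for img in batch
    ]

def denormalize(batch, mean, std):
    """Inverse of normalize: map every value back to v * std + mean."""
    return [
        [[v * s + m for v in ch] for ch, m, s in zip(img, mean, std)]
        for img in batch
    ]

def channel_means(batch):
    """Per-channel mean over all images and pixels (like x.mean((0,2,3)))."""
    means = []
    for c in range(len(batch[0])):
        values = [v for img in batch for v in img[c]]
        means.append(sum(values) / len(values))
    return means

def one_batch(raw_batch, mean, std, denorm=True):
    """Normalize a batch, then optionally undo the normalization before returning it."""
    x = normalize(raw_batch, mean, std)
    return denormalize(x, mean, std) if denorm else x

raw = [
    [[0.0, 1.0], [0.5, 0.5], [1.0, 1.0]],
    [[1.0, 0.0], [0.25, 0.75], [0.0, 0.0]],
]
mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]

print(channel_means(one_batch(raw, mean, std)))                # denormalized view
print(channel_means(one_batch(raw, mean, std, denorm=False)))  # normalized view
```

With the default flag the means match the raw data, which is why adding or removing the normalization step appears to make no difference; with denorm=False the per-channel means come out at zero for this toy batch.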
