I noticed that if we pass to a Learner object a DataLoader with no proper normalization (i.e. if we are using a pre-trained resnet we should use imagenet stats normalization), the learner apparently modify (I say apparently because I didn’t find it on the library code but I ran an example to check it) the proccessing behaviour of the data loader.
from fastbook import *
from fastai.vision.all import *
path = untar_data(URLs.PETS)
pets = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=using_attr(lambda x : 'cat' if x[0].isupper() else 'dog', 'name'),
item_tfms=Resize(224),
batch_tfms=[])
dls = pets.dataloaders(path/"images")
def gen_sample():
t = tensor( Image.open('./cat.jpg').resize((224,224)) )
return dls.test_dl([t]).one_batch()[0].squeeze(0).cpu()
l1 = gen_sample()
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(1)
l2 = gen_sample()
After running the code above
torch.allclose(l1,l2)
Returns ‘False’
normalize = Normalize.from_stats(*imagenet_stats)
torch.allclose(normalize(l1.cuda()).cpu(),l2)
Returns ‘True’
Even when it’s nice that the library prevent us doing something wrong (as using a pre-trained resnet without imagenet normalization), I think it would be better to throw an error and explain that fact instead of silently correct the mistake.
This behavious is also weird becuase when I pass a DataLoader object to a Learner I understand is for ‘using’ it, not for eventually modifying its behaviour.
What do you think?