Custom batches of specific images

Hey there,

I just finished fastai part 1 and wanted to try Kaggle competitions. I am now working on an image classification problem using DenseNet. I have an idea but am not sure how to implement it.

For training, I want to apply a custom normalization to each batch, where each batch contains specific images (defined by filename). In essence, it should be like training with batch_size=1 and applying a custom normalization to each image.

One way to do this would be to have a dict or list that defines the batches and which images can go into each batch. I would then sample indices from a batch's entry to access its images.
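A minimal sketch of this dict-based idea, assuming a plain PyTorch DataLoader is used (its batch_sampler argument accepts any iterable that yields lists of indices). The class name GroupBatchSampler, the group keys, and the filenames are hypothetical and only for illustration:

```python
import random

class GroupBatchSampler:
    """Yields one list of dataset indices per predefined group of filenames."""

    def __init__(self, filenames, groups, shuffle=True, seed=None):
        # Map each filename to its position in the dataset.
        index_of = {name: i for i, name in enumerate(filenames)}
        # Each batch is the list of dataset indices belonging to one group.
        self.batches = [[index_of[name] for name in names]
                        for names in groups.values()]
        self.shuffle = shuffle
        self.rng = random.Random(seed)

    def __iter__(self):
        # Shuffle the order of the batches, not their contents.
        order = list(range(len(self.batches)))
        if self.shuffle:
            self.rng.shuffle(order)
        for i in order:
            yield self.batches[i]

    def __len__(self):
        return len(self.batches)

# Hypothetical dataset order and grouping.
filenames = ["a1.png", "a2.png", "b1.png", "b2.png", "b3.png"]
groups = {"group_a": ["a1.png", "a2.png"],
          "group_b": ["b1.png", "b2.png", "b3.png"]}

sampler = GroupBatchSampler(filenames, groups, shuffle=False)
batches = list(sampler)
```

An object like this could then be passed as DataLoader(dataset, batch_sampler=sampler), provided the dataset's indexing follows the same order as filenames. Batches may have different sizes, which is fine as long as the images within one batch share a shape.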

Another approach is to loop through all images (train and test), normalize them, and save the results. I would prefer not to use this approach.

I would appreciate it if someone could point me to the right approach.

I believe the easiest way would be to use a custom collate function. You can set collate_fn when creating the databunch. It should be a function that receives a batch of whatever items you are giving it, after the image transforms have been applied, and returns the tensor data to feed the network.

The default simply returns the data attribute from the images.
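As a rough sketch of such a collate function, here is a plain PyTorch version that normalizes every batch with its own per-channel statistics. It assumes each sample is already an (image tensor, integer label) pair; with fastai items you would first pull out the underlying tensors (the data attribute mentioned above). The function name and the choice of mean/std normalization are illustrative assumptions, not something fastai provides:

```python
import torch

def normalize_per_batch_collate(samples, eps=1e-6):
    """Stack (image, label) samples and normalize with this batch's own statistics."""
    # Stack the images into one tensor of shape (N, C, H, W).
    images = torch.stack([img for img, _ in samples]).float()
    labels = torch.tensor([label for _, label in samples])
    # Per-channel mean and std computed over this batch only.
    mean = images.mean(dim=(0, 2, 3), keepdim=True)
    std = images.std(dim=(0, 2, 3), keepdim=True)
    return (images - mean) / (std + eps), labels

# Four random stand-in "images" with labels 0, 1, 0, 1.
samples = [(torch.rand(3, 8, 8) * 255, i % 2) for i in range(4)]
xb, yb = normalize_per_batch_collate(samples)
```

With a plain PyTorch loader this would be passed as DataLoader(dataset, collate_fn=normalize_per_batch_collate). Combined with batches that contain only specific images, each group ends up normalized by its own statistics.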

Another way is to write a custom transform for a whole batch, which you can set with the dl_tfms argument. I don't remember exactly, but I think the difference is that dl_tfms are applied to the already collated batch, after the per-image transforms, so if your case allows it, the normalization could also be done there.
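A batch-level transform of this kind could look like the sketch below. It assumes the function receives the whole batch as an (inputs, targets) tuple of tensors and returns a tuple of the same form; the function name is hypothetical, and here every image is normalized with its own mean and std, which matches the "batch_size=1" behaviour described in the question:

```python
import torch

def per_image_normalize(batch, eps=1e-6):
    """Normalize every image in the batch with its own mean and std."""
    xb, yb = batch
    xb = xb.float()
    # Statistics over channels, height and width, separately for each image.
    mean = xb.mean(dim=(1, 2, 3), keepdim=True)
    std = xb.std(dim=(1, 2, 3), keepdim=True)
    return (xb - mean) / (std + eps), yb

# Random stand-in batch of four 3-channel images.
xb = torch.rand(4, 3, 8, 8) * 255
yb = torch.tensor([0, 1, 0, 1])
out_x, out_y = per_image_normalize((xb, yb))
```

Because the statistics are computed per image, this version does not depend on which images happen to share a batch.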

Thanks for the reply. I will look into collate_fn; it seems like what I need.