Item transforms and batch transforms

Hi guys,

I wanted to know a few things about item transforms and batch transforms, and couldn’t see this documented anywhere. If it is documented somewhere and I’ve missed it, I apologise in advance.

1) Is there a good reason to use item transforms over batch transforms (or vice versa) in the context of wanting to produce more, varied training data? i.e., horizontally flipping images, playing with brightness and warp, etc.

2) When using transforms, does the training set effectively multiply in size? That is, will the network be trained on both the original versions of all of the training data (A), and versions with transforms applied (B), so a total training set of A+B?

3) This sort of relates to 2) above, and I believe I know the answer already from looking at the code, but I want to be certain: regardless of whether we're using item transforms or batch transforms, are these transforms fixed once applied, so that when we train for X epochs the training data is the same through all of those epochs? In other words, the transforms aren't re-applied to the training data randomly each epoch (e.g., file X.png has brightness 0.7 in one epoch, then brightness 0.3 in the next, etc.)?

Thanks for taking the time to answer my questions.


Item transforms are for collating and preparing items for a batch, and they're run on the CPU. Batch transforms are applied after everything is resized and batched up, and they're done on the GPU.
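
To make that concrete, here's a minimal fastai sketch (the folder path and transform choices are just placeholders, not your setup):

```python
from fastai.vision.all import *

path = Path('images')  # placeholder: one subfolder per class

dls = ImageDataLoaders.from_folder(
    path, valid_pct=0.2,
    # item_tfms run per image on the CPU, before collation --
    # typically to get every image to a common size so they can be batched.
    item_tfms=Resize(224),
    # batch_tfms run on whole batches on the GPU, after collation --
    # aug_transforms() bundles flips, rotation, zoom, warp, lighting, etc.
    batch_tfms=aug_transforms(),
)
```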

No, the set is still the same size; the input images are just transformed on the fly. Nothing additional is made.
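
You can check that directly if you like; a self-contained sketch (placeholder path again):

```python
from fastai.vision.all import *

path = Path('images')  # placeholder: one subfolder per class

files = get_image_files(path)
dls = ImageDataLoaders.from_folder(path, valid_pct=0.2,
                                   item_tfms=Resize(224),
                                   batch_tfms=aug_transforms())

# Transforms don't add samples: train + valid together still cover
# exactly the original files, augmented on the fly as they're drawn.
assert len(dls.train_ds) + len(dls.valid_ds) == len(files)
```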

Yes, that should be the case (unless you have, say, RandomResizedCrop, which would still be random each time it's called, since the transforms are run each time you fetch a batch).
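
If you want to see the "random each time" behaviour for yourself, here's a quick self-contained sketch (placeholder path; `new(shuffle=False)` just keeps the same images coming back):

```python
from fastai.vision.all import *

path = Path('images')  # placeholder: one subfolder per class

dls = ImageDataLoaders.from_folder(path, valid_pct=0.2,
                                   item_tfms=Resize(224),
                                   batch_tfms=aug_transforms())

# Draw the first training batch twice with shuffling off, so the same
# images come back each time. Random batch_tfms are re-drawn on every
# fetch, so the augmented pixels differ between the two draws.
dl = dls.train.new(shuffle=False)
xb1, _ = dl.one_batch()
xb2, _ = dl.one_batch()
print((xb1 - xb2).abs().mean())  # > 0 while random augmentations are active
```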


Thank you for getting back to me!

This was basically my understanding already (thanks for clarifying), but what I meant was: in the context of my use case, would batch transforms be essentially the same as item transforms, just run on the GPU? And therefore I could use either, but batch is preferred for speed?

My images are already all the same size.
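
In other words, since everything is pre-sized, something like this hypothetical setup, with all the augmentation on the batch/GPU side (path is a placeholder):

```python
from fastai.vision.all import *

path = Path('images')  # placeholder: images here are already one size

# No item-level Resize needed; the augmentation lives entirely in
# batch_tfms and runs on the GPU.
dls = ImageDataLoaders.from_folder(path, valid_pct=0.2,
                                   batch_tfms=aug_transforms())
```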

Got it!

Follow-up question:

Is there currently no way to achieve this (creating new versions of the data that are transformed, along with the originals), apart from RandomResizedCrop and the like? Or is there no advantage to doing this?

i.e., a training dataset of size x, plus n transformed copies of that same dataset, so the final training set is x·(1+n) items and therefore contains more data to train on.

(Any reason not to do this? Besides it potentially not having been implemented yet.)
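
For what it's worth, here's the kind of thing I have in mind, a rough sketch that pre-generates fixed transformed copies on disk with Pillow (the paths, copy count, and transform choices are all made up):

```python
from pathlib import Path
from PIL import Image, ImageEnhance, ImageOps
import random

src, dst = Path('images'), Path('images_augmented')  # placeholder paths
dst.mkdir(exist_ok=True)
n = 3  # transformed copies per original -> final set is x * (1 + n)

for img_path in src.glob('*.png'):
    img = Image.open(img_path)
    img.save(dst / img_path.name)  # keep the original (the "A" set)
    for i in range(n):             # plus n fixed transformed copies ("B")
        out = ImageOps.mirror(img) if random.random() < 0.5 else img
        out = ImageEnhance.Brightness(out).enhance(random.uniform(0.7, 1.3))
        out.save(dst / f'{img_path.stem}_aug{i}{img_path.suffix}')
```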