Item_tfms vs batch_tfms

Hi all,
What is the difference between item_tfms and batch_tfms? I am confused: are all images first cropped (say, randomly) for every epoch using item_tfms, and then batch_tfms applied to a batch of such cropped images? Or is a batch chosen first, transformed using batch_tfms, and then all images cropped using item_tfms? What is the correct order of operations?

In general, what is the idea behind having batch transforms and item transforms as two separate processes?

4 Likes

See my last two slides here:

Let me know if it’s still unclear

11 Likes

My understanding is as follows.
item_tfms happens first, followed by batch_tfms. This lets most of the transform computation happen on the GPU, which saves time.
In the first step, item_tfms resizes all the images to the same size (this happens on the CPU); then batch_tfms runs on the GPU for the entire batch of images.
If it didn’t happen in two steps, all these calculations would have to be done on the CPU, which is slower.
Would that be a correct explanation, @muellerzr?
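
In code, that understanding looks something like this (a minimal sketch; the folder path and 224px size are just placeholders):

```python
from fastai.vision.all import *

path = Path('images')  # hypothetical folder with train/valid subfolders
dls = ImageDataLoaders.from_folder(
    path,
    item_tfms=Resize(224),        # per-item, CPU: every image gets the same size
    batch_tfms=aug_transforms(),  # per-batch, GPU: random augmentations
)
```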

10 Likes

Yup, exactly 🙂 Basically item transforms prep everything to the same size, then we can push everything else to the GPU for the batch!

3 Likes

You can also add augmentation operations that need to occur on the CPU to item_tfms, for example the ones from Albumentations!
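
A sketch of one common wrapping pattern (along the lines of the fastai albumentations tutorial); the specific augmentation here is just an example:

```python
import numpy as np
import albumentations as A
from fastai.vision.all import *

class AlbumentationsTransform(Transform):
    "Run an albumentations augmentation per-item on the CPU"
    def __init__(self, aug): self.aug = aug
    def encodes(self, img: PILImage):
        aug_img = self.aug(image=np.array(img))['image']
        return PILImage.create(aug_img)

# pass it alongside Resize in item_tfms
item_tfms = [Resize(224), AlbumentationsTransform(A.HueSaturationValue(p=0.5))]
```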

3 Likes

But why not have a single function to do this for us, as in fastai v1? I mean, how is this better? Why are two separate functions required?

1 Like

Because we can separate how each one works and where it’s applied. In fastai v1, everything was CPU-bound as well.

Again, two separate pipelines are needed because the item transforms are all done on the CPU and the batch transforms are all done on the GPU.

And we need these CPU-bound item transforms so everything can be collated into a batch and then pushed to the GPU.
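
A quick illustration of why that same-size requirement exists (plain PyTorch, hypothetical shapes):

```python
import torch

# Collating a batch stacks the per-item tensors into one (bs, c, h, w) tensor.
a = torch.rand(3, 200, 300)
b = torch.rand(3, 150, 150)
# torch.stack([a, b])  # RuntimeError: stack expects each tensor to be equal size
c = torch.rand(3, 200, 300)
print(torch.stack([a, c]).shape)  # works: torch.Size([2, 3, 200, 300])
```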

2 Likes

So, is the advantage in terms of speed, flexibility, or something else? And how is this advantageous over fastai v1? What can fastai v2 do that fastai v1 couldn’t (in terms of transforms)?

1 Like

It’s much faster: you’re utilizing the GPU, so it’s many times faster. Along with this, our Pipelines also let us control when and how certain transforms are applied (at your discretion).
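
As an example of that control, here’s a hedged sketch of a custom batch transform; ScaleShift is hypothetical, but Transform, its order attribute (fastai sorts a Pipeline’s transforms by it), and TensorImage type dispatch are standard fastai v2:

```python
from fastai.vision.all import *

class ScaleShift(Transform):
    order = 99  # run after the built-in batch transforms
    def encodes(self, x: TensorImage): return (x - 0.5) / 0.5

dls = ImageDataLoaders.from_folder(
    Path('images'),  # hypothetical dataset
    item_tfms=Resize(224),
    batch_tfms=[*aug_transforms(), ScaleShift()],
)
```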

2 Likes


I hope a visual will help.

5 Likes

Just to clarify: are both item_tfms and batch_tfms applied each time an item or a batch is loaded?

So, for example, if I RandomCrop an image to a size smaller than the original, will I likely get different parts of the image at different steps during training?

Can anyone answer this question? It seems reasonable that batch_tfms is applied in each epoch. Is that also the case for item_tfms, or is item_tfms a one-time thing?

I agree that the distinction between item_tfms and batch_tfms, beyond the former running on the CPU and the latter running on batches on the GPU, is unclear. One might naively ask why all the transforms can’t be run on the GPU (my understanding is that item_tfms mainly makes sure the input data is of the same size so that it can be collated into batches, but this isn’t clear in the documentation).

Fastai is as much (if not more) about simplicity as efficiency, and v1 was much more straightforward with respect to data augmentations, so this change needs further explanation. For example, there is a class FlipItem (for item_tfms) and a class Flip (for batch_tfms). This is confusing.
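
For reference, a sketch of where each of those goes (the dataset path is hypothetical):

```python
from fastai.vision.all import *

dls = ImageDataLoaders.from_folder(
    Path('images'),                            # hypothetical dataset
    item_tfms=[Resize(224), FlipItem(p=0.5)],  # flips individual PIL images on the CPU
    batch_tfms=[Flip(p=0.5)],                  # flips whole tensor batches on the GPU
)
```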

7 Likes

Thank you, this helps me a lot.

1 Like

So far I understand that item_tfms happens on the CPU and batch_tfms on the GPU. My question is: in the case of images, does this increase the number of different images used for training? For example, if I use aug_transforms(max_rotate=180), an image is rotated by a random angle between -180 and 180 degrees. How can I control the number of rotations for one image? For example, if I want every image to be used with 5 different rotations, how can that be achieved? Or do I just repeat paths in my training text file with random rotations?

1 Like

I’m learning this myself, but it seems like batch_tfms is called once for each batch, when the batch of data is loaded by the dataloader.

I believe there’s some randomness already built into aug_transforms, so if you want to train on 5 different rotations/augmentations of each image, simply train the model for 5 epochs.
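
A hedged sketch (the dataset path is hypothetical): the random parameters are re-drawn every time a batch is formed, so every epoch sees fresh rotations:

```python
from fastai.vision.all import *

dls = ImageDataLoaders.from_folder(
    Path('images'),  # hypothetical dataset
    item_tfms=Resize(224),
    batch_tfms=aug_transforms(max_rotate=180.0),
)
xb1, _ = dls.one_batch()
xb2, _ = dls.one_batch()
# xb1 and xb2 use freshly drawn rotation angles, so over 5 epochs each
# image is (almost surely) seen with 5 different rotations.
```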

Late reply, but it looks like that’s correct! According to the solution to Q17 of Lesson 2 (Deployment) of the 2022 course release:

item_tfms are transformations applied to a single data sample x on the CPU. Resize() is a common transform because the mini-batch of input images to a CNN must have the same dimensions. Assuming the images are RGB with 3 channels, then Resize() as item_tfms will make sure the images have the same width and height.
batch_tfms are applied to batched data samples (aka individual samples that have been collated into a mini-batch) on the GPU. They are faster and more efficient than item_tfms. A good example of these are the ones provided by aug_transforms(). Inside are several batch-level augmentations that help many models.
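
A quick way to see the collation effect (assuming a dls built with item_tfms=Resize(224), as in the examples above):

```python
xb, yb = dls.one_batch()
print(xb.shape)  # e.g. torch.Size([64, 3, 224, 224]); fixed size, so collation works
```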

2 Likes

Slightly related to your question, although not entirely: where are the docs for these concerns? I’ve found this link, vision.transform | fastai, which talks about the transforms; however, it says it is a deprecated version, and the latest version doesn’t seem to have any docs on transforms. Or have I missed it?

1 Like