"Replace" item in batch as augmentation

Hi All,

Does anybody know how one could “replace” an item in the batch with a different item as an augmentation?
I am working in the remote-sensing domain, where it is important to ensure that the model is robust against seasonal changes.
One simple augmentation technique is to consider the exact same location over multiple timesteps as augmented views of the first image.
Say we have two pictures of a bridge: one in winter and one in summer. I would only sample from the winter images, but would like to randomly “augment” the winter image to the summer “view”.

From a usability perspective, it would be nice if this were a “normal” transform, so that one can quickly add or remove the augmentation wherever the other ones are used. The transformation should sit at the very beginning of the pipeline, since any transformation applied before it would be “overwritten” by the replacement operation.
But how would you actually implement the transformation?
Or would you go a different route, to reduce unnecessary data loading?

Would you try to infer the index of the current item with val2idx or from the dataloader?
Or get the hash of the current item and then “look-up” files from different time steps?
This would require the transform to also include the function that does the type transform.
I am not sure if I am missing something and if there is a “nicer” way to do it.
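
To make the look-up idea concrete, here is a minimal sketch that works on file paths, before anything is decoded, so no unnecessary image is loaded. It assumes a hypothetical naming scheme `<location>_<timestep>.<ext>`; the helper names `location_key`, `build_timestep_index` and `swap_timestep` are made up for illustration and are not part of any library.

```python
import random
from collections import defaultdict
from pathlib import Path


def location_key(path):
    # Assumed (hypothetical) naming scheme: "<location>_<timestep>.<ext>",
    # e.g. "bridge42_winter.tif" -> "bridge42"
    return Path(path).stem.rsplit("_", 1)[0]


def build_timestep_index(paths):
    """Group all file paths by location so siblings can be looked up quickly."""
    index = defaultdict(list)
    for p in paths:
        index[location_key(p)].append(Path(p))
    return index


def swap_timestep(path, index, p=0.5, rng=random):
    """With probability p, return the same location at a different timestep."""
    path = Path(path)
    siblings = [s for s in index.get(location_key(path), []) if s != path]
    if not siblings or rng.random() >= p:
        return path  # no other timestep available, or augmentation not applied
    return rng.choice(siblings)


files = ["bridge42_winter.tif", "bridge42_summer.tif", "field07_winter.tif"]
index = build_timestep_index(files)
swapped = swap_timestep("bridge42_winter.tif", index, p=1.0)    # has a summer sibling
unchanged = swap_timestep("field07_winter.tif", index, p=1.0)   # has no sibling
```

Because the swap happens on the path, whatever normally opens the file can run afterwards unchanged, and the step can be added or removed without touching the rest of the pipeline.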

I would appreciate any input. :slight_smile:

EDIT: A different approach would be to subclass TensorImage, or to monkey-patch some metadata about the current item (the name/path, for example) into the tensor.

Hi, have you figured out a way to do this? My first thought went to something like you suggested, add the path information to the item and use that for the transforms…

Hi, sadly not really.
It will take some time until I need this augmentation technique, as I have some other things to implement first, but when I get to it and find a good solution, I will post an update :slight_smile:

I also haven’t had any new ideas, though.
The issue with the monkey-patch idea is that the metadata is lost after the batch collation step.
So there is no way to use that augmentation style in the batch-augmentation step/phase.
That’s why I would first look into the index-based approach.
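
A rough sketch of what the index-based approach could look like, written as a plain map-style dataset (only `__len__` and `__getitem__`). The class name `MultiTimestepDataset`, the `groups` layout and the `load_fn` argument are assumptions for illustration, not an existing API. The swap is decided per item, before collation, so no metadata has to survive into the batch and only one file per item is loaded.

```python
import random


class MultiTimestepDataset:
    """Map-style dataset: one index per location, several timesteps per location."""

    def __init__(self, groups, load_fn, anchor=0, p=0.5, seed=None):
        self.groups = groups      # groups[i] lists all files of location i
        self.load_fn = load_fn    # hypothetical loader, e.g. a function that opens an image
        self.anchor = anchor      # position of the default ("winter") timestep in each group
        self.p = p                # probability of replacing the default timestep
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.groups)

    def __getitem__(self, idx):
        group = self.groups[idx]
        choice = self.anchor
        others = [i for i in range(len(group)) if i != self.anchor]
        # Replace the default timestep with a random other one with probability p
        if others and self.rng.random() < self.p:
            choice = self.rng.choice(others)
        return self.load_fn(group[choice])


groups = [["bridge_winter.tif", "bridge_summer.tif"], ["field_winter.tif"]]
# The identity function stands in for a real image loader here
ds = MultiTimestepDataset(groups, load_fn=lambda path: path, p=1.0, seed=0)
```

The downside is that the augmentation lives in the dataset rather than in the transform list, so it is less easy to toggle alongside the other augmentations.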