Unified Data Augmentation Techniques for Semantic Segmentation


Computer Vision techniques such as Image Classification, Object Detection and Semantic Segmentation are used to solve real-world problems.

In particular, Semantic Segmentation is a technique whose aim is to classify every pixel in an image into a semantic class, such as “person”, “plant” or “background”.

However, techniques of this kind need large amounts of data. Researchers noticed this and investigated different ways to obtain more data. At first, they applied geometric transforms (such as flips or rotations) to the images already in their datasets; more recently, Image Mixing Techniques have been used to generate new images.

Unfortunately, each existing technique was developed to solve a particular problem. As a result, the implementations are scattered across the web and, since they were written by different people, each one is used differently.

In this project, we aim to collect all those techniques and unify their use in an easy and intuitive way.

Semantic Segmentation Augmentations (SSA)

There are two main techniques for mixing our images: CutOut and CutMix.

CutOut techniques select a region inside the image and drop the information within it. CutMix techniques, instead of dropping the information, replace it with a region obtained from another image.
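The two ideas can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the package's implementation: the function names, the (top, left, height, width) box format and the choice of assigning erased pixels to class 0 are assumptions made for the example. Note that in segmentation the mask has to be edited together with the image.

```python
import numpy as np

def cutout(image, mask, box, fill_value=0, fill_class=0):
    """Drop the information inside box = (top, left, height, width)."""
    top, left, h, w = box
    image, mask = image.copy(), mask.copy()
    image[top:top + h, left:left + w] = fill_value  # erase the pixels
    mask[top:top + h, left:left + w] = fill_class   # assumption: erased pixels become class 0
    return image, mask

def cutmix(image, mask, other_image, other_mask, box):
    """Replace the region with the same region taken from another image."""
    top, left, h, w = box
    image, mask = image.copy(), mask.copy()
    image[top:top + h, left:left + w] = other_image[top:top + h, left:left + w]
    mask[top:top + h, left:left + w] = other_mask[top:top + h, left:left + w]
    return image, mask

# Toy 4x4 example: image A belongs entirely to class 1, image B to class 2.
image_a = np.ones((4, 4, 3), dtype=np.uint8)
mask_a = np.ones((4, 4), dtype=np.int64)
image_b = np.full((4, 4, 3), 2, dtype=np.uint8)
mask_b = np.full((4, 4), 2, dtype=np.int64)

dropped_image, dropped_mask = cutout(image_a, mask_a, box=(1, 1, 2, 2))
mixed_image, mixed_mask = cutmix(image_a, mask_a, image_b, mask_b, box=(1, 1, 2, 2))
```

Both functions work on copies, so the original image and mask remain available for further augmentations.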

To generalise these techniques, we can split them into two main steps:

  1. Select the region to be used.
  2. Replace the information within the region in some way.

We have defined a module that allows us to define new components for each of these two steps and combine them easily. For example, we can combine a technique that selects the region at random with the original CutOut technique. Alternatively, we can use a technique that selects a region constrained to lie inside the image boundaries and then replace that region with a randomly selected one from another image.
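As a rough illustration of this two-step design, the sketch below separates region selection from region filling and then combines them. All names here (select_region_inside, fill_from_random_region, compose) are hypothetical and chosen only for the example; they are not the module's actual classes.

```python
import numpy as np

def select_region_inside(image_shape, region_hw, rng):
    """Step 1: pick a random box that lies fully inside the image."""
    height, width = image_shape[:2]
    h, w = region_hw
    if h > height or w > width:
        raise ValueError("region is larger than the image")
    top = int(rng.integers(0, height - h + 1))   # upper bound is exclusive
    left = int(rng.integers(0, width - w + 1))
    return top, left, h, w

def fill_from_random_region(image, mask, box, other_image, other_mask, rng):
    """Step 2: overwrite the box with a same-sized random region of another image."""
    top, left, h, w = box
    src_top, src_left, _, _ = select_region_inside(other_image.shape, (h, w), rng)
    image, mask = image.copy(), mask.copy()
    image[top:top + h, left:left + w] = other_image[src_top:src_top + h, src_left:src_left + w]
    mask[top:top + h, left:left + w] = other_mask[src_top:src_top + h, src_left:src_left + w]
    return image, mask

def compose(select, fill):
    """Combine a selection step and a filling step into one augmentation."""
    def augment(image, mask, other_image, other_mask, region_hw, rng):
        box = select(image.shape, region_hw, rng)
        return fill(image, mask, box, other_image, other_mask, rng)
    return augment

# Usage: a boundary-constrained selector combined with a "copy from another image" filler.
rng = np.random.default_rng(0)
image = np.zeros((8, 8, 3), dtype=np.uint8)
mask = np.zeros((8, 8), dtype=np.int64)
other_image = np.ones((8, 8, 3), dtype=np.uint8)
other_mask = np.ones((8, 8), dtype=np.int64)

augment = compose(select_region_inside, fill_from_random_region)
out_image, out_mask = augment(image, mask, other_image, other_mask, (3, 4), rng)
```

Because each step is an independent function, swapping the selector or the filler yields a different augmentation without touching the other step.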

Furthermore, we have adapted and implemented ten different techniques, previously used in Image Classification, Object Detection or Binary Image Segmentation problems, so that they can be used here in potentially multiclass tasks.


Using this module is simple:

  1. Install the semantic-segmentation-augmentations package from PyPI and import it.
  2. Import the technique that you are going to use, for example CarveMix.
  3. Define your model and add the technique as a FastAI Callback.
  4. Train your model as usual.

Usage example

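pip install semantic_segmentation_augmentations   # run this line in a shell, not in Python
from fastai.vision.all import *   # provides unet_learner and resnet18
from semantic_segmentation_augmentations.holesfilling import CarveMix
# dls is assumed to be an existing FastAI segmentation DataLoaders object
learner = unet_learner(dls, resnet18, cbs=CarveMix()).to_fp16()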

These techniques can be customized, for example through the number of selected regions or the probability of applying the technique. You can find more information about how to customize them in the documentation.
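To give an idea of what these two knobs control, here is a small standalone sketch of a CutOut-style augmentation with a configurable number of regions and application probability. The parameter names (n_regions, p, region_hw) are hypothetical and do not reflect the package's real arguments; please refer to the documentation for those.

```python
import numpy as np

def erase_random_regions(image, mask, n_regions=1, p=0.5, region_hw=(2, 2), rng=None):
    """Erase n_regions random boxes with probability p (hypothetical parameters)."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() >= p:  # skip the augmentation with probability 1 - p
        return image, mask
    image, mask = image.copy(), mask.copy()
    height, width = mask.shape
    h, w = region_hw
    for _ in range(n_regions):
        top = int(rng.integers(0, height - h + 1))
        left = int(rng.integers(0, width - w + 1))
        image[top:top + h, left:left + w] = 0
        mask[top:top + h, left:left + w] = 0  # assumption: erased pixels become class 0
    return image, mask

rng = np.random.default_rng(42)
image = np.ones((8, 8, 3), dtype=np.uint8)
mask = np.ones((8, 8), dtype=np.int64)

# p=0.0 never applies the augmentation; p=1.0 always applies it.
same_image, same_mask = erase_random_regions(image, mask, n_regions=3, p=0.0, rng=rng)
new_image, new_mask = erase_random_regions(image, mask, n_regions=3, p=1.0, rng=rng)
```

Since the regions are drawn independently, they may overlap, so the erased area is at most n_regions times the area of one region.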

Furthermore, you can extend these general techniques to define your own.


Data Augmentation Techniques make it possible to train models without giant datasets. In this project, we have developed a tool that makes this process as easy as possible for Semantic Segmentation tasks.

Please feel free to take a look at the GitHub repository for more information. You can also consult the documentation of the tool for more detailed examples and explanations.

Finally, you can find a deeper explanation and all the references for this work in the published paper.