Wiki: Lesson 2

daveluo · January 15, 2018, 4:44am

You’re correct that the transforms_side_on flips the image left and right.

However, transforms_top_down is more than just vertical flipping. It’s vertical flips + horizontal flips + every possible 90-degree rotation.

I believe the naming comes from the idea that some images you would capture from the side (like taking a photo of a cat or dog) vs some you take top-down (like satellite images, or food photos on instagram…). In the side-on case, realistic data augmentations would be flipping horizontally (except in the occasional case of the sidewise or upside-down hanging cat/dog…). In top-down imaging like with satellites, you can rotate and flip the image in every direction and it could still look like a plausible training image.

Here are some examples generated using the transform functions with cat/dog lesson1 images:

original cat image:

orig_cat2

transforms_side_on, 12 examples:

side_on_cat

transforms_top_down, 12 examples (note the mirror images + rotations):

top_down_cat

Here’s a look at transforms.py:

transforms_basic    = [RandomRotateXY(10), RandomLightingXY(0.05, 0.05)]
transforms_side_on  = transforms_basic + [RandomFlipXY()]
transforms_top_down = transforms_basic + [RandomDihedralXY()]

class RandomDihedralXY(CoordTransform):
  def set_state(self):
    self.rot_times = random.randint(0,4)
    self.do_flip = random.random()<0.5

  def do_transform(self, x):
    x = np.rot90(x, self.rot_times)
    return np.fliplr(x).copy() if self.do_flip else x

class RandomFlipXY(CoordTransform):
  def set_state(self):
    self.do_flip = random.random()<0.5

  def do_transform(self, x):
    return np.fliplr(x).copy() if self.do_flip else x

Note that with both settings, there’s a bit of slight rotation and brightness adjustments included by default as well.