You’re correct that the transforms_side_on
flips the image left and right.
However, transforms_top_down
is more than just vertical flipping. It’s vertical flips + horizontal flips + every possible 90-degree rotation.
I believe the naming comes from the idea that some images you would capture from the side (like taking a photo of a cat or dog) vs some you take top-down (like satellite images, or food photos on instagram…). In the side-on case, realistic data augmentations would be flipping horizontally (except in the occasional case of the sidewise or upside-down hanging cat/dog…). In top-down imaging like with satellites, you can rotate and flip the image in every direction and it could still look like a plausible training image.
Here are some examples generated using the transform functions with cat/dog lesson1 images:
original cat image:
transforms_side_on, 12 examples:
transforms_top_down, 12 examples (note the mirror images + rotations):
Here’s a look at transforms.py:
transforms_basic = [RandomRotateXY(10), RandomLightingXY(0.05, 0.05)]
transforms_side_on = transforms_basic + [RandomFlipXY()]
transforms_top_down = transforms_basic + [RandomDihedralXY()]
class RandomDihedralXY(CoordTransform):
def set_state(self):
self.rot_times = random.randint(0,4)
self.do_flip = random.random()<0.5
def do_transform(self, x):
x = np.rot90(x, self.rot_times)
return np.fliplr(x).copy() if self.do_flip else x
class RandomFlipXY(CoordTransform):
def set_state(self):
self.do_flip = random.random()<0.5
def do_transform(self, x):
return np.fliplr(x).copy() if self.do_flip else x
Note that with both settings, there’s a bit of slight rotation and brightness adjustments included by default as well.