Taking the iceberg Kaggle comp as an example (75x75 images), we often encounter datasets whose images are smaller than ideal, which makes it difficult to take advantage of pre-trained nets etc. So I had an idea for creating a dataset of bigger images without using interpolation or any other resizing that might affect the original data.
The idea is actually really simple: create new images by concatenating multiple, slightly different copies of each original training image, i.e. a series of augmented images pieced back together into a “mosaic” or “jacquard” of one image. For example, if you had a 75x75 image you could stitch together 4 blocks of that image in a 2x2 grid (with a transformation applied to each) to create one image that is 150x150 in size. As long as you follow the same process with the test images when running predictions, I believe it should work.
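To make the stitching concrete, here is a minimal sketch in NumPy. The function name make_mosaic and the choice of random 90-degree rotations and horizontal flips as the per-block transformations are my own assumptions for illustration, and it assumes square images (height, width, optionally channels) so that rotated blocks still line up.

```python
import numpy as np

def make_mosaic(image, rng=None):
    """Tile four randomly transformed copies of a square image into a 2x2 grid.

    Hypothetical helper: the transformations used here (rotations by
    multiples of 90 degrees, horizontal flips) are illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    tiles = []
    for _ in range(4):
        # Rotate by 0, 90, 180 or 270 degrees; no pixel values are interpolated.
        tile = np.rot90(image, k=int(rng.integers(0, 4)))
        # Flip left-right half of the time.
        if rng.random() < 0.5:
            tile = np.fliplr(tile)
        tiles.append(tile)
    top = np.concatenate(tiles[:2], axis=1)     # two blocks side by side
    bottom = np.concatenate(tiles[2:], axis=1)  # the other two blocks
    return np.concatenate([top, bottom], axis=0)

# Example: a 75x75 two-channel image becomes 150x150 with two channels.
image = np.random.default_rng(0).random((75, 75, 2))
mosaic = make_mosaic(image, rng=np.random.default_rng(1))
print(mosaic.shape)
```

Because only rotations and flips are used, every block contains exactly the original pixel values, just rearranged.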
Does this idea sound reasonable at all? Or is it not really how things “should be done” in terms of training convolutional neural nets?
Would that possibly duplicate the thing you are trying to predict? So instead of having one iceberg in a picture, your model would also need to figure out what happens with a few fractured images of an iceberg.
This is sort of what happens when the edge of an image is mirrored to pad it, though, so I don’t think the idea is bad. There just might have to be some thought put into how the copies are stitched together and what area each of them contains.
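For comparison, mirroring the edges can be sketched with NumPy's np.pad in reflect mode. The helper name reflect_pad_to and the 150-pixel target are assumptions made for this example; the original image stays untouched in the centre and mirrored copies of its borders fill the rest.

```python
import numpy as np

def reflect_pad_to(image, target):
    """Grow an image to target x target by mirroring its borders.

    Hypothetical helper; assumes target is at least the image size and
    that each pad width is smaller than the corresponding image dimension,
    which reflect mode needs in order to mirror in a single pass.
    """
    h, w = image.shape[:2]
    pad_h, pad_w = target - h, target - w
    pads = [(pad_h // 2, pad_h - pad_h // 2),
            (pad_w // 2, pad_w - pad_w // 2)]
    # Leave any trailing axes (e.g. channels) unpadded.
    pads += [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pads, mode="reflect")

image = np.random.default_rng(0).random((75, 75))
padded = reflect_pad_to(image, 150)
print(padded.shape)
```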
Yes, it would duplicate the item that the model is being trained to classify (which I think is a good thing in this case lol). Whether it’s binary or multi-class, the idea would apply the same way. I think the effect should actually be similar to data augmentation, because the model would be able to see multiple variations of the same image “in one go”.
Once again, I think the same process would need to be applied to the test set for it to work.
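One way to keep the test-time process identical is to make the layout deterministic instead of random. This is only a sketch under that assumption; the name fixed_mosaic and the particular arrangement (original, left-right flip, up-down flip, 180-degree rotation) are illustrative choices, not something settled in this thread.

```python
import numpy as np

def fixed_mosaic(image):
    """Build a 2x2 mosaic with the same fixed transformation in each slot.

    Hypothetical helper: top-left is the original, top-right a left-right
    flip, bottom-left an up-down flip, bottom-right a 180-degree rotation.
    """
    top = np.concatenate([image, np.fliplr(image)], axis=1)
    bottom = np.concatenate([np.flipud(image), np.rot90(image, 2)], axis=1)
    return np.concatenate([top, bottom], axis=0)

# The same function is applied to training and test images alike.
train_image = np.random.default_rng(0).random((75, 75))
test_image = np.random.default_rng(1).random((75, 75))
print(fixed_mosaic(train_image).shape, fixed_mosaic(test_image).shape)
```

Since there is no randomness, a given image always maps to the same mosaic, so train and test inputs have the same structure.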
Keep in mind the “duplication” isn’t creating a larger dataset of duplicate images; it’s just enlarging each individual image with variations of that same image.