With a resize zoom transform, what if you lose the object of interest in the process, effectively changing the label?
I'm not sure I understand what you mean. There are different classes for different kinds of transformations (TfmPixel, TfmCrop), and those classes have an order between them that is fixed in fastai? What about different instances of TfmPixel transforms: how are they ordered among themselves?
If thatās such a common transformation, why not bake some of these transformations into the network architecture?
Doesn't cropping the fish out of the image (his top-left example) invalidate the appropriateness of the 'tench' label?
You have to be careful not to apply too much zoom. That said, RandomResizedCrop is the most commonly used technique on ImageNet, with great success, and it often loses the object of interest.
The model ends up learning that a middle-aged person with a smile must be holding a fish.
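For what it's worth, torchvision's version of this transform exposes a `scale` argument that controls how aggressive the zoom can get. A minimal sketch (the values here are illustrative, not what's used in the lesson):

```python
from torchvision import transforms

# The default is scale=(0.08, 1.0): the crop may keep as little as 8%
# of the original area, which is how the fish can get cropped out
# entirely. Raising the lower bound keeps more of the image per crop.
tfm = transforms.RandomResizedCrop(224, scale=(0.35, 1.0))
```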
They all have the same order, because we never had the need to take care of that. I'm not sure what example you have in mind of one kind of data augmentation that needs to run before or after another.
Typically the operations you want to bake into your architecture are ones for which you want to compute gradients. So there's no point in spending time keeping track of gradients we might not need.
Another reason is that you might not want to apply these transformations at test/inference time. You might want to train with the augmentations but not necessarily use them once you've deployed your model to production.
Another reason is that the transformations might depend on the domain. We can use a ResNet on both ImageNet and MNIST. However, while we can flip ImageNet images horizontally, we probably don't want to flip the digits in MNIST.
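To make the second point concrete, here's a minimal sketch with torchvision (not fastai's API) of keeping augmentation out of the inference path: the random flip only lives in the training pipeline.

```python
from torchvision import transforms

# Training pipeline: includes the augmentation.
train_tfms = transforms.Compose([
    transforms.RandomHorizontalFlip(),   # augmentation, train time only
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Inference pipeline: same preprocessing, no augmentation.
eval_tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```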
Didn't know about torch.solve
It's two weeks old, so it's understandable.
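For anyone else who hadn't seen it: it solves a linear system A @ x = b. A tiny sketch (at the time of this thread the call was `torch.solve(b, A)`; in later PyTorch releases it moved to `torch.linalg.solve`):

```python
import torch

A = torch.tensor([[3., 1.],
                  [1., 2.]])
b = torch.tensor([[9.],
                  [8.]])

# Solves A @ x = b; here x = [[2.], [3.]].
x = torch.linalg.solve(A, b)
```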
Couldn't it also be used to clean the data? I'm thinking about the Nature Conservancy Fisheries Monitoring Kaggle competition.
Right, not at inference. But I'm just wondering out loud whether the intuition behind this data augmentation is to make your training data go further, sort of artificially 'take more photos' (because 'more data always wins').
Has someone written a tutorial or example of using the TensorBoard integration? I've found the code but very little documentation of this functionality.
What about the black pixels in these transforms where you change the perspective, or tilt the picture in some way, so that it no longer fills up the original rectangle?
The motivation was explained at length during part 1; that's why Jeremy skipped it tonight. It's to artificially have more training data, yes.
We usually use reflection padding to fill the black pixels.
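If you want to see the effect outside fastai, plain PyTorch exposes the same idea through `grid_sample`'s `padding_mode`; a rough sketch (not fastai's actual implementation):

```python
import math
import torch
import torch.nn.functional as F

img = torch.rand(1, 3, 64, 64)           # one NCHW image
a = 0.3                                   # rotation angle in radians
theta = torch.tensor([[[math.cos(a), -math.sin(a), 0.],
                       [math.sin(a),  math.cos(a), 0.]]])

grid = F.affine_grid(theta, img.shape, align_corners=False)

# padding_mode='zeros' would leave black corners after the rotation;
# 'reflection' mirrors the image content into the empty area instead.
out = F.grid_sample(img, grid, padding_mode='reflection',
                    align_corners=False)
```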
Is perspective-shift augmentation fast enough, given the ImageNet constraints discussed above?
If I want to use a particular transformation offered by OpenCV, how do I go about it? Is it better to convert that particular transformation to PIL in all cases?
Right, I don't have anything specific in mind for transforms. It's just that we've been using `_order` a few times now (for callbacks too, for example) and I was wondering how transparent and usable it is. But if it's implemented differently in fastai from what's shown here, and/or the order doesn't actually matter that much, I may be overthinking it.
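For context, this is roughly how the `_order` trick works in the course notebooks: transforms (and callbacks) each carry an `_order` attribute, and the pipeline just sorts by it before applying. A simplified sketch (class bodies are illustrative):

```python
class Transform:
    _order = 0

class MakeRGB(Transform):
    _order = 0                      # runs first
    def __call__(self, x):
        return x.convert('RGB')     # assumes a PIL image

class ResizeFixed(Transform):
    _order = 10                     # runs after lower-_order transforms
    def __init__(self, size):
        self.size = size
    def __call__(self, x):
        return x.resize((self.size, self.size))

def compose(x, tfms):
    # Python's sort is stable, so transforms sharing an _order value
    # keep the order in which they were listed.
    for t in sorted(tfms, key=lambda t: t._order):
        x = t(x)
    return x
```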
No. The person who added the functionality doesn't have the time to document it right now, so if you want to volunteer to do that, it would be much appreciated.