Lesson 11 discussion and wiki

With a resize zoom transform, what if you lose the object of interest in the process, effectively changing the label?

I'm not sure I understand what you mean. There are different classes for different kinds of transformation (TfmPixel, TfmCrop), and those classes have an order between them that is fixed in fastai? What about different instances of TfmPixel transforms, how are they ordered among themselves?

If that's such a common transformation, why not bake some of these transformations into the network architecture?

Doesn't cropping the fish out of the image (his top-left example) invalidate the appropriateness of the "tench" label?

You have to be careful not to apply too much zoom. Yet this RandomResizeCrop is the most commonly used technique on ImageNet, with great success, and it often loses the object of interest.
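For reference, here is a minimal sketch of this kind of transform using torchvision's RandomResizedCrop (the image path is hypothetical). The default scale range can keep as little as 8% of the image area, which is exactly why the object of interest can end up outside the crop:

```python
from PIL import Image
from torchvision import transforms

# Default parameters: crop an area covering 8%-100% of the image,
# with an aspect ratio between 3/4 and 4/3, then resize to 224x224.
tfm = transforms.RandomResizedCrop(size=224, scale=(0.08, 1.0), ratio=(3/4, 4/3))

img = Image.open("tench.jpg")   # hypothetical image file
cropped = tfm(img)              # the fish may or may not survive this crop
```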

The model ends up learning that a middle-aged person with a smile must be holding a fish.

They all have the same order, because we never had the need to take care of that. I'm not sure what example you have in mind of a kind of data augmentation that needs to run before or after another.
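For anyone curious, here is an illustrative sketch of the _order idea from the course notebooks (the class names are made up for this example). Transforms get sorted by _order, and because Python's sort is stable, transforms that share the same _order simply keep the order you passed them in:

```python
class Transform:
    _order = 0

class MakeRGB(Transform):
    _order = 0    # pixel-level prep, runs first

class ResizeFixed(Transform):
    _order = 10   # resizing runs after that

class ToByteTensor(Transform):
    _order = 20   # conversion to tensor runs last

def sort_tfms(tfms):
    # sorted() is stable: transforms with equal _order keep their given order
    return sorted(tfms, key=lambda t: t._order)

tfms = sort_tfms([ToByteTensor(), MakeRGB(), ResizeFixed()])
print([type(t).__name__ for t in tfms])
# ['MakeRGB', 'ResizeFixed', 'ToByteTensor']
```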

Typically the operations you want to bake into your architecture are ones for which you want to compute gradients. So there's no point in spending time keeping track of gradients we might not need.

Another reason is that you might not want to apply these transformations at test/inference time. You might want to train with the augmentations, but not necessarily use them when you've deployed your model to production.

Another reason is that the transformations might depend on the domain. We can use a ResNet on both ImageNet and MNIST. However, while we can flip ImageNet images horizontally, we probably don't want to flip the digits in MNIST.
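A quick sketch of that point using torchvision (not the fastai API from the lesson): the random augmentation lives in the training pipeline only, and for something like MNIST you would simply leave the flip out:

```python
from torchvision import transforms

train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),   # fine for ImageNet, wrong for MNIST digits
    transforms.ToTensor(),
])

valid_tfms = transforms.Compose([        # no random augmentation at inference time
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
```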

Didn't know about torch.solve :smile:

It's two weeks old, so it's understandable.
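For anyone who wants to try it: torch.solve(B, A) solved the batched linear system A X = B and returned the solution together with the LU factorization. It has since been deprecated in favour of torch.linalg.solve, which is what this little sketch uses:

```python
import torch

A = torch.randn(3, 3)
B = torch.randn(3, 2)

# torch.solve(B, A) returned (solution, LU); the current spelling is:
X = torch.linalg.solve(A, B)                 # solves A @ X = B
print(torch.allclose(A @ X, B, atol=1e-5))   # True, up to numerical error
```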

Couldn't it also be used to clean the data? I'm thinking about the Nature Conservancy Fisheries Monitoring Kaggle competition.

Right, not at inference. But I'm just kind of wondering out loud whether the intuition behind doing this data augmentation is to make your training data go further, sort of to artificially "take more photos" (because "more data always wins").

Has someone written a tutorial or example of using the TensorBoard integration? I've found the code but very little documentation of this functionality.

What about the black pixels in these transforms where you change the perspective, or tilt the picture in some way so that it no longer fills up the original rectangle?

The motivation has been explained at length during part 1; that's why Jeremy skipped it tonight. It's to artificially have more training data, yes.

We usually use reflection padding to fill the black pixels.
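To make that concrete, here is a small sketch with torch.nn.functional.grid_sample, whose padding_mode argument controls how pixels sampled from outside the original image get filled: 'zeros' gives the black borders, 'reflection' mirrors the image content into them instead:

```python
import torch
import torch.nn.functional as F

img = torch.rand(1, 3, 64, 64)               # dummy batch of one RGB image

# A zoomed-out sampling grid, so some coordinates fall outside the image
# and the choice of padding becomes visible.
theta = torch.tensor([[[1.3, 0.0, 0.0],
                       [0.0, 1.3, 0.0]]])
grid = F.affine_grid(theta, img.shape, align_corners=False)

black_borders = F.grid_sample(img, grid, padding_mode='zeros', align_corners=False)
reflected     = F.grid_sample(img, grid, padding_mode='reflection', align_corners=False)
```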

Is perspective-shift augmentation fast enough given the ImageNet constraints discussed above?

If I want to use a particular transformation offered by OpenCV, how do I go about it? Is it better to convert that particular transformation to PIL in all cases?

Right, I don't have anything specific in mind for transforms. It's just that we've been using _order a few times now (for callbacks too, for example) and I was wondering how transparent and usable it is. But if it's implemented differently in fastai from what's shown here, and/or the order actually doesn't matter that much, I may be thinking too much about it :wink:

No. The person who added the functionality doesn't have the time to document it right now, so if you want to volunteer to do that, it would be much appreciated.