Chapter 7 - Further Research: Implementing TTA

The further research questions at the end of Chapter 7 include one where you are to use the fastai documentation to build a function that crops an image to a square in each of the four corners, then implement a TTA method that averages the predictions on a center crop and those four crops.

I've built the function to get the crops (below), but I'm stuck on implementing the TTA method. How do I change the already-defined transforms in the DataBlock to use my TTA function? I was thinking of defining the function as a transform, but the problem is that its output is five different crops of the image instead of one, unlike the normal image augmentation transforms.

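The attached function isn't reproduced here, but a minimal sketch of a corner-crop helper along these lines might look like this (the name corner_crops and the PIL-based cropping are illustrative, not necessarily the original code):

from fastai.vision.all import *

def corner_crops(img: PILImage, k: int):
    "Return the four k x k corner crops of `img`, plus the centre crop."
    w, h = img.size
    boxes = [
        (0,   0,   k, k),                          # top left
        (w-k, 0,   w, k),                          # top right
        (0,   h-k, k, h),                          # bottom left
        (w-k, h-k, w, h),                          # bottom right
        ((w-k)//2, (h-k)//2, (w+k)//2, (h+k)//2),  # centre
    ]
    return [PILImage(img.crop(b)) for b in boxes]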


Hey man, I'm at pretty much this exact point right now: I have the cropping function, but I'm trying to figure out how to implement the TTA method. Is there any direction you can point me in that would help?

Hey @nuggimane,

I ended up implementing the crops as transforms and modifying the batch transforms of the validation dataloader. You can find it here.
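The linked notebook isn't reproduced here, but the idea is roughly this (a sketch, assuming `learn` is your trained Learner; the tensor-level crop and the exact transform list are my guesses at the details):

def crop_tl(x: TensorImage, k=224): return x[..., :k, :k]  # hypothetical top-left crop

# swap the crop into the validation dataloader's batch transforms,
# keeping float conversion and normalisation in the pipeline
dl = learn.dls.valid.new(after_batch=[IntToFloatTensor(), crop_tl,
                                      Normalize.from_stats(*imagenet_stats)])
preds, targs = learn.get_preds(dl=dl)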

Hope it helps.


Thanks! I ended up doing something similar at the time, but used item transforms instead :+1:


Hey, I wanted to ask whether you used this method on Imagenette? I tried my method and the accuracy dropped heavily.


Hey @Arpan, I also remember a drop in accuracy with my implementation. I guess the default approach in fastai is the best one to use in practice.


Thanks @kofi for posting your solution. I was finally able to implement TTA based on it.

But I am curious: how can I make use of the shipped augmentation transforms while using the tta method? You can provide item_tfms and batch_tfms, but I get the error "batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'fastai.vision.core.PILImage'>" when I run the following code:

learn.tta(n=5, batch_tfms=RandomResizedCrop(size=123))

The issue here is that RandomResizedCrop is meant for PILImages, not TensorImages (which is what our data are, as tensor subclasses, by the time the batch transforms run), as we can see in RRC's encodes.

So the solution in your case would be either to include it as an item_tfm (which is what it is) or to use the GPU equivalent, RandomResizedCropGPU.


Thanks a lot for your hint @muellerzr

I've tried both approaches using fastai version 2.1.8:
learn.tta(n=5, batch_tfms=RandomResizedCropGPU(size=123))
learn.tta(n=5, item_tfms=RandomResizedCrop(size=123))

But I got TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'fastai.vision.core.PILImage'> in both cases.

From what I can tell from the code, you would need to pass all the item transforms you want, as it overrides them. So, e.g.:

item_tfms = [RandomResizedCrop(mysize), ToTensor()]

Or

batch_tfms = [RandomResizedCropGPU(mysize), IntToFloatTensor(), Normalize.from_stats(*imagenet_stats)]
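Putting that together, a call that keeps the essential pipeline intact might look like this (sizes and normalisation stats are illustrative):

preds, targs = learn.tta(
    n=5,
    item_tfms=[RandomResizedCrop(224), ToTensor()],
    batch_tfms=[IntToFloatTensor(), Normalize.from_stats(*imagenet_stats)],
)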

Thanks a lot @muellerzr. I was able to add RandomResizedCrop based on your help. But now I have some issues using the Zoom transform. Can you give me a hint on how you would try to find the root cause?

You can see the full code here (and the error in the last cell)

It seems that:

  • the images still have the same size, so Zoom(size=224, …) didn't do the trick
  • Zoom is somehow creating an extra dimension (a grid with sizes [3, 224, 224, 2])

Zoom is listed under "Affine and coord tfm on the GPU"; is it not possible to use it in item_tfms, then? I'd like to use it there, because I believe I would get the most out of this TTA approach that way (even a larger RandomResizedCrop in item_tfms would cut away some information).

I checked the source code:

  • I am not sure if affine_mat is the place where the grid is changed to [3, 224, 224, 2]
  • And if it is the root cause, I am not sure how to change it to the shape that I need
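For reference, here is roughly how I'd expect the batch-level call to look, based on the advice above (an untested sketch; all parameters are illustrative):

preds, targs = learn.tta(
    n=5,
    batch_tfms=[IntToFloatTensor(),
                Zoom(min_zoom=1., max_zoom=1.5, p=1.),
                Normalize.from_stats(*imagenet_stats)],
)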

Is this snippet equivalent to the TTA implementation in fastai?

import torch
from fastai.vision.all import *

def ttaify(x: PILImage, k: int) -> TensorImage:
    "Crop the four k x k corners plus the centre of `x` and average them pixel-wise."
    x = ToTensor().encodes(x)  # PILImage -> TensorImage of shape (c, h, w)
    c, h, w = x.shape

    top_left     = x[:, :k,                :k]
    top_right    = x[:, :k,                -k:]
    bottom_left  = x[:, -k:,               :k]
    bottom_right = x[:, -k:,               -k:]
    centre       = x[:, (h-k)//2:(h+k)//2, (w-k)//2:(w+k)//2]

    crops = [top_left, top_right, bottom_left, bottom_right, centre]
    # note: this averages the five crops into a single image, rather than
    # averaging the model's predictions on each crop
    avg_img = torch.stack(crops).double().mean(0)

    return avg_img

This looks pretty good! But as far as I know, the "four corners and center" TTA is not implemented in fastai. It was a task for us, to see if we could build our own TTA method.

If you want to do that, I would suggest that you refactor your code as follows (see the sketch after the list):

  • Run learn.get_preds on the valid set for each variation of "four corners and center"
  • Get the metrics (e.g. accuracy) of each run
  • Calculate an average over all runs
  • Run Learner.tta and compare your TTA result to the fastai TTA result
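A sketch of that refactoring, assuming a hypothetical CornerCrop item transform and reusing dl.new (the same mechanism Learner.tta uses internally to swap transforms); it also assumes every validation image is at least k pixels on each side:

from fastai.vision.all import *

class CornerCrop(Transform):
    "Hypothetical item transform: crop a k x k square from one corner (or the centre)."
    def __init__(self, k, corner): self.k, self.corner = k, corner
    def encodes(self, img: PILImage):
        w, h = img.size; k = self.k
        boxes = {'tl': (0, 0, k, k),        'tr': (w-k, 0, w, k),
                 'bl': (0, h-k, k, h),      'br': (w-k, h-k, w, h),
                 'c':  ((w-k)//2, (h-k)//2, (w+k)//2, (h+k)//2)}
        return PILImage(img.crop(boxes[self.corner]))

def five_crop_tta(learn, k=224):
    "Average get_preds over the four corner crops plus the centre crop."
    all_preds, targs = [], None
    for corner in ('tl', 'tr', 'bl', 'br', 'c'):
        # dl.new replaces the item transforms wholesale, so ToTensor must be re-added
        dl = learn.dls.valid.new(after_item=[CornerCrop(k, corner), ToTensor()])
        preds, targs = learn.get_preds(dl=dl)
        print(f"{corner} accuracy: {accuracy(preds, targs).item():.4f}")
        all_preds.append(preds)
    avg_preds = torch.stack(all_preds).mean(0)
    print(f"five-crop average accuracy: {accuracy(avg_preds, targs).item():.4f}")
    return avg_preds, targs

# then compare against fastai's built-in method:
# preds, targs = learn.tta()
# print(accuracy(preds, targs))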

Hello,

Chapter 7 uses CrossEntropyLossFlat as the loss function instead of nn.CrossEntropyLoss.

When do we use the ‘Flat’ version of the loss function? Is it related to the architecture of the model?
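For concreteness, a minimal comparison of the two (a sketch; the shapes are illustrative):

import torch
import torch.nn as nn
from fastai.vision.all import *

preds = torch.randn(4, 10)            # (batch, classes) logits
targs = torch.randint(0, 10, (4,))

# on plain classification logits the two agree
print(nn.CrossEntropyLoss()(preds, targs))
print(CrossEntropyLossFlat()(preds, targs))

# the Flat version flattens the extra axes first, so outputs with more
# dimensions (e.g. segmentation logits) work with the class axis anywhere
seg_preds = torch.randn(4, 10, 8, 8)  # (batch, classes, h, w)
seg_targs = torch.randint(0, 10, (4, 8, 8))
print(CrossEntropyLossFlat(axis=1)(seg_preds, seg_targs))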
