Fastai v2 vision

Please do not at-mention me when there are other people who can also look and help you.

In this case no one (including me) can because you only posted one line of the error message without the full stack trace. Also include the things you tried to debug this so that anyone reading can understand the issue a little bit better. This is the way to efficiently get help on the forum, not at-mentioning the administrators.

2 Likes

Really sorry for that. Will post all the debugging steps I performed and complete stack trace here.

So I made some progress with this issue. I’m trying to process multiple columns from a DataFrame:

class TensorContinuous(TensorBase): pass
class RegSetup(Transform):
    "Transform that floatifies targets"
    def encodes(self, o): return TensorContinuous(o).float()
    def decodes(self, o:TensorContinuous): return TitledStr(o.item())

pipe = Pipeline([RegSetup])
temp = df[['age', 'parity']]
p = pipe(temp); p

output:

TensorContinuous([[43.,  1.],
        [43.,  1.],
        [43.,  1.],
        ...,
        [49.,  9.],
        [49.,  9.],
        [49.,  9.]])

Now, the next thing I want to do is normalize these columns with their respective stats

class Norm(Transform):
    "Normalize/denorm batch of `TensorImage`"
    order=99
    def __init__(self, mean=None, std=None, axes=(0,2,3)): self.mean,self.std,self.axes = mean,std,axes

    @classmethod
    def from_stats(cls, mean, std, dim=1, ndim=4, cuda=True): return cls(*broadcast_vec(dim, ndim, mean, std, cuda=cuda))

    def setups(self, dl:DataLoader):
        if self.mean is None or self.std is None:
            x = dl.one_batch()
            self.mean,self.std = x.mean(self.axes, keepdim=True),x.std(self.axes, keepdim=True)+1e-7
            print(self.mean, self.std)

    def encodes(self, x:TensorContinuous): return (x-self.mean) / self.std
    def decodes(self, x):
        f = to_cpu if x.device.type=='cpu' else noop
        return (x*f(self.std) + f(self.mean))

tl = TfmdLists(temp, pipe)
dl = tl.dataloaders(bs=8, after_batch=[Norm(axes=0)])

Output

TensorContinuous([[41.7500,  2.2500]], device='cuda:0') TensorContinuous([[12.7811,  1.2817]], device='cuda:0')
dl.one_batch()

Output

TensorContinuous([[-0.1369, -0.1950],
        [-0.8411,  0.5851],
        [-1.5452, -0.9752],
        [-0.2152, -0.1950],
        [ 0.8020, -0.9752],
        [-0.6064, -0.1950],
        [-0.6064,  1.3653],
        [ 1.0367,  1.3653]], device='cuda:0')

Now, this worked fine because I was dealing with only these DataFrame columns. But in my actual pipeline, I have an ImageBlock along with these RegressionBlocks.

For that, I’m using getters, and thanks to my SemanticTensors, that too works fine for me (but only for a single block of data :weary:)

def get_x(x): return f'{path}/{x.image_path}'
def get_age(x): return x.age
def get_parity(x): return x.parity
def get_y(x): return x.category

getters = [get_x, get_age, get_y]

Currently, I’m only working with age (I’ll explain the reason), and the pipeline seems to work fine:

def RegressionFBlock():
  return TransformBlock(type_tfms=[RegSetup()], batch_tfms=[NormalizeTfm(axes=0)])

dblock = DataBlock(blocks=(ImageBlock, RegressionFBlock, CategoryBlock),
                   getters=getters,
                   splitter=ColSplitter('is_val'),
                   item_tfms=Resize(size),
                   batch_tfms = [*aug_transforms(max_zoom=0, flip_vert=True)])

The normalize transform I’m using has to know which element of the batch tuple it is working on (which is a huge disadvantage for me):

class NormalizeTfm(Transform):
    "Normalize/denorm batch of `TensorImage`"
    order=99
    def __init__(self, mean=None, std=None, axes=(0,2,3)): self.mean,self.std,self.axes = mean,std,axes

    @classmethod
    def from_stats(cls, mean, std, dim=1, ndim=4, cuda=True): return cls(*broadcast_vec(dim, ndim, mean, std, cuda=cuda))

    def setups(self, dl:DataLoader):
        if self.mean is None or self.std is None:
            _,x,_ = dl.one_batch()
            self.mean,self.std = x.mean(self.axes, keepdim=True),x.std(self.axes, keepdim=True)+1e-7

    def encodes(self, x:TensorContinuous): return (x-self.mean) / self.std
    def decodes(self, x:TensorContinuous):
        f = to_cpu if x.device.type=='cpu' else noop
        return (x*f(self.std) + f(self.mean))

Check the setups method of NormalizeTfm. Nevertheless, it’s working, and the end result of dblock.summary(df) is as follows:

Applying batch_tfms to the batch built
  Pipeline: IntToFloatTensor -> AffineCoordTfm -> LightingTfm -> NormalizeTfm
    starting from
      (TensorImage of size 4x3x224x224, TensorContinuous([43., 43., 43., 43.], device='cuda:0'), TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying IntToFloatTensor gives
      (TensorImage of size 4x3x224x224, TensorContinuous([43., 43., 43., 43.], device='cuda:0'), TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying AffineCoordTfm gives
      (TensorImage of size 4x3x224x224, TensorContinuous([43., 43., 43., 43.], device='cuda:0'), TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying LightingTfm gives
      (TensorImage of size 4x3x224x224, TensorContinuous([43., 43., 43., 43.], device='cuda:0'), TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying NormalizeTfm gives
      (TensorImage of size 4x3x224x224, TensorContinuous([0.0307, 0.0307, 0.0307, 0.0307], device='cuda:0'), TensorCategory([2, 2, 2, 2], device='cuda:0'))

Now the real problem is: how can I deal with multiple columns in the same pipeline, as I was able to do using Pipeline and TfmdLists above?

I tried modifying getters to accept tuple/array as:

getters = [get_x, lambda x: (x.age, x.parity), get_y]

But this results in two TensorContinuous objects and sets up a dedicated pipeline for each (which is not what I want in my scenario).

TL;DR

  1. How can I adapt NormalizeTfm to work with my data?
  2. How can I pass multiple items from getters to a single TransformBlock?
  3. Even if I need to create a separate block for each column, how can I normalize each with its respective stats? Otherwise, the only option I have is to write yet another NormalizeTfm with a modified setups method for the respective column.

@kshitijpatil09 just a thought: why not keep a dictionary of their values and what they mean (i.e. age: 0)? This way you can easily look up their stats via said dictionary, which we store in the class somewhere (or an array, some look-up system based on position).
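
Something along these lines, maybe (just a rough sketch; the column names and batch layout are made up):

import torch

col_idx = {'age': 0, 'parity': 1}   # column name -> position in the continuous tensor

def column_stats(batch, col_idx):
    "Per-column (mean, std) keyed by column name; `batch` is a (bs, n_cols) tensor."
    return {name: (batch[:, i].mean(), batch[:, i].std() + 1e-7)
            for name, i in col_idx.items()}

stats = column_stats(torch.randn(8, 2), col_idx)
stats['age']   # (mean, std) for the age column

A setups method could build that dictionary once, and encodes/decodes could then look stats up by name instead of hard-coding positions.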

1 Like

One side note first: I see you have copied the Normalize transform and just added a new encodes/decodes for a new type. You can do this without writing a new class by using a decorator:

@Normalize
def encodes(self, x:TensorContinuous): return (x-self.mean) / self.std

@Normalize
def decodes(self, x:TensorContinuous):
    f = to_cpu if x.device.type=='cpu' else noop
    return (x*f(self.std) + f(self.mean))

Now back to your question. To pass multiple items to a single TransformBlock you need to use a list/L/dictionary, but not a tuple. Tuples are special for transforms, which will try to apply to each part of the tuple instead of taking the tuple as a whole. If you really need tuples, use ItemTransforms in the pipeline that receives them.

With that, you can then write a custom Transform that normalizes each input with the different stats, and define a block that has it as default if you want to use the data block API.
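
For example, a toy illustration of that tuple behaviour (made-up transforms, not from fastai; with the usual fastai star-import these classes are already available):

from fastcore.transform import Transform, ItemTransform

class AddOne(Transform):
    def encodes(self, x:int): return x + 1

class AddPair(ItemTransform):
    def encodes(self, x): return x[0] + x[1]

AddOne()((1, 2))   # (2, 3) -- a tuple is mapped over, element by element
AddOne()([1, 2])   # [1, 2] -- a list is handled as a whole (no encodes matches list, so it passes through)
AddPair()((1, 2))  # 3      -- an ItemTransform receives the whole tuple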

6 Likes

That’s a great idea! Will try to do that.

Thanks! Big help.

So I should just write these methods with the @Normalize decorator and there’s no need to include it in the pipeline? Or should I include the Normalize transform and it’ll work with the custom type?

I guess I’ve tried using Python lists but not the others. Will try them and report back to you.

One more doubt: when we pass a DataFrame to DataBlock, the x we get in the getters is one row of it (a pd.Series), right?

You’d write it out with the decorators, then just use Normalize in your transforms. It’ll use TypeDispatch to figure out what to use.

1 Like

Okay. got it.

Experiment :wink: (try making a getter that just prints out ‘x’)
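
e.g. a throwaway getter just for poking at the data (hypothetical, swap it in for get_age temporarily):

def peek(x):
    print(type(x), x)   # shows exactly what each getter receives
    return x.age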

I did. It’s a pd.Series (just wanted to confirm)

1 Like

Yes, it is :slight_smile:

I did as mentioned; however, the next problem is with the tensor shape. The original Normalize was designed to work with images and hence uses axes (0,2,3). In my case, what I have is a 1D tensor per item, and it needs dim=0. For instance,

x = torch.randn(8,2)
feat = TensorContinuous(x)

Here 8 is the batch size and 2 is the number of columns. The mean and std are supposed to be computed as

feat.mean(0)
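
As a quick sanity check with plain tensors (shapes only, values are random):

x = torch.randn(8, 2)
x.mean(0, keepdim=True).shape   # torch.Size([1, 2]) -- one mean per column
x.std(0, keepdim=True).shape    # torch.Size([1, 2])
# x.mean((0, 2, 3)) would raise an error here: a 2D tensor has no dims 2 and 3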

To make it work with the default axes (0,2,3), I tried adding extra dimensions, but it didn’t work in the actual DataBlock:

# TensorContinuous(x)[...,None,None] loses the original class 
# and falls back to Tensor
feat = TensorContinuous(x).unsqueeze(-1).unsqueeze(-1).float()

Still, it throws the following error:

Applying batch_tfms to the batch built
  Pipeline: IntToFloatTensor -> AffineCoordTfm -> LightingTfm -> Normalize
    starting from
      (TensorImage of size 4x3x224x224, TensorContinuous of size 4x2x1x1, TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying IntToFloatTensor gives
      (TensorImage of size 4x3x224x224, TensorContinuous of size 4x2x1x1, TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying AffineCoordTfm gives
      (TensorImage of size 4x3x224x224, TensorContinuous of size 4x2x1x1, TensorCategory([2, 2, 2, 2], device='cuda:0'))
    applying LightingTfm gives
      (TensorImage of size 4x3x224x224, TensorContinuous of size 4x2x1x1, TensorCategory([2, 2, 2, 2], device='cuda:0'))
torch.Size([4, 2, 1, 1])
    applying Normalize failed.

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1

Finally, I guess I’ve abused the decorators :sweat_smile: to somehow get things working:

@Normalize
def setups(self, dl:DataLoader):
    if self.mean is None or self.std is None:
        x,x2,_ = dl.one_batch()
        self.mean,self.std = x.mean(self.axes, keepdim=True),x.std(self.axes, keepdim=True)+1e-7
        self.mean_cont, self.std_cont = x2.mean(0, keepdim=True), x2.std(0, keepdim=True)+1e-7

I used mean_cont and std_cont to normalize and it worked. Output of a simple print statement in the setups method:

mean: TensorContinuous([[40.3750,  3.2500]])

Using these stats, I got normalized tensors

w,x,y = dls.one_batch(); x[0]

output:

tensor([-1.1967, -0.7906], device='cuda:0')

Suggest a better way of doing this.

Since you don’t have a subclass of DataLoader specific to your problem, this will end up making Normalize not work on other problems. If you really have custom behavior, don’t hesitate to write your own class; I was just pointing out what was possible :wink:

1 Like

So I’m using a slightly modified Norm transform with a different variable, mean_cnt, instead of mean itself, since the latter might override the stats of TensorImage:

class Norm(Transform):
    "Normalize/denorm batch of `TensorImage`"
    order=99
    def __init__(self, mean_cnt=None, std_cnt=None, axes=(0,2,3)): store_attr(self,'mean_cnt,std_cnt,axes')

    def setups(self, dl:DataLoader):
        if self.mean_cnt is None or self.std_cnt is None:
            _,x,_ = dl.one_batch()    
            self.mean_cnt,self.std_cnt = x.mean(self.axes, keepdim=True),x.std(self.axes, keepdim=True)+1e-7            

    def encodes(self, x:TensorContinuous): return (x-self.mean_cnt) / self.std_cnt
    def decodes(self, x:TensorContinuous):
        f = to_cpu if x.device.type=='cpu' else noop
        return (x*f(self.std_cnt) + f(self.mean_cnt))

This trick worked for me.
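
For reference, plugging this into the earlier block definition would look roughly like the sketch below (same RegSetup and getters as before):

def RegressionFBlock():
    return TransformBlock(type_tfms=[RegSetup()], batch_tfms=[Norm(axes=0)])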

1 Like

I am still struggling here… So when we create our SiamesePair (per the tutorial), we’ve essentially already generated the dataset. From the tutorial, what further needs to be done is combining the two images together and then one-hot encoding the output. I can do this by first changing SiamesePair to give us the y we expect:

class SiamesePair(Transform):
    def __init__(self,items,labels):
        self.items,self.labels,self.assoc = items,labels,self
        sortlbl = sorted(enumerate(labels), key=itemgetter(1))
        # dict of (each unique label) -- (list of indices with that label)
        self.clsmap = {k:L(v).itemgot(0) for k,v in itertools.groupby(sortlbl, key=itemgetter(1))}
        self.idxs = range_of(self.items)
        
    def encodes(self,i):
        "x: tuple of `i`th image and a random image from same or different class; y: True if same class"
        othercls = self.clsmap[self.labels[i]] if random.random()>0.5 else self.idxs
        otherit = random.choice(othercls)
        lbl = self.labels[otherit]==self.labels[i]
        lbl = tensor([0]) if lbl else tensor([1])
        return SiameseImage(self.items[i], self.items[otherit], lbl)

And then make a separate transform, combine_images, which concatenates the two tensors:

def combine_images(items:Tuple):
  "Combines two images"
  return (torch.cat([items[0],items[1]], dim=2), items[2])

So now the pipeline (which will prepare a full dataset) looks like so:

OpenAndResize = Transform(resized_image)
labeller = RegexLabeller(pat = r'/([^/]+)_\d+.jpg$')
sp = SiamesePair(items, items.map(labeller))
pipe = Pipeline([sp, OpenAndResize, combine_images])

Where/how do I delegate building everything? I can’t just pass SiamesePair into it, because it expects the items already, and once those items are passed through, its encodes already generates a dataset pair based on the label of the current i (which is an index into the dataset, not a path as a normal encodes would receive).

Sorry I’m asking so many questions; the tutorial just left it at that, and I am really struggling to put the given Pipeline into some form of a DataLoader.

Or better: What fundamental am I missing here?

You just create a TfmdLists from it: tls = TfmdLists(range_of(items), pipe) (note that you can replace pipe with your list of transforms), then you can do tls.dataloaders(...).
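
For instance, a minimal sketch of that flow (the batch size here is arbitrary):

tls = TfmdLists(range_of(items), pipe)   # or pass [sp, OpenAndResize, combine_images] directly
dls = tls.dataloaders(bs=8)
b = dls.one_batch()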

1 Like

THANK YOU so much! I feel so stupid, the answer is so obvious there. :man_facepalming: That got it right away :slight_smile:

(Also disregard the whole combine_images thing, just realized what we’re actually trying to do here)

2 Likes

I’m also along somewhat similar lines as @muellerzr. I have to merge 3 single-channel images into one and treat them as masks, so should I go for something like PILImage.merge, or do that later in the pipeline using torch.stack?

I’d say do a torch.stack unless generating them as a PILImage would make it better to read.
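
e.g. roughly like this (made-up shapes, just to show the stacking):

m1, m2, m3 = [torch.zeros(224, 224) for _ in range(3)]   # three single-channel masks
merged = torch.stack([m1, m2, m3], dim=0)                # shape (3, 224, 224), channels-first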