Fastai v2 vision

@sgugger quick bug report (which I know you’ll get to later):

Doing the following with a list of file names on an image regression learner gives me the error below:

dl = learn.dls.test_dl(imgs[:10])
preds = learn.get_preds(dl=dl)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-65-9a993e47d6c2> in <module>()
----> 1 preds = learn.get_preds(dl=dl)

12 frames
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, **kwargs)
    319             self(_before_epoch)
    320             self._do_epoch_validate(dl=dl)
--> 321             self(_after_epoch)
    322             if act is None: act = getattr(self.loss_func, 'activation', noop)
    323             res = cb.all_tensors()

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
    226     def ordered_cbs(self, cb_func:str): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
    227 
--> 228     def __call__(self, event_name): L(event_name).map(self._call_one)
    229     def _call_one(self, event_name):
    230         assert hasattr(event, event_name)

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363 
    364     def filter(self, f, negate=False, **kwargs):

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40 
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244 

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209 
    210 # Cell

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _call_one(self, event_name)
    229     def _call_one(self, event_name):
    230         assert hasattr(event, event_name)
--> 231         [cb(event_name) for cb in sort_by_run(self.cbs)]
    232 
    233     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in <listcomp>(.0)
    229     def _call_one(self, event_name):
    230         assert hasattr(event, event_name)
--> 231         [cb(event_name) for cb in sort_by_run(self.cbs)]
    232 
    233     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
     23         _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
     24                (self.run_valid and not getattr(self, 'training', False)))
---> 25         if self.run and _run: getattr(self, event_name, noop)()
     26 
     27     def __setattr__(self, name, value):

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in after_fit(self)
     91         "Concatenate all recorded tensors"
     92         if self.with_input:     self.inputs  = detuplify(to_concat(self.inputs, dim=self.concat_dim))
---> 93         if not self.save_preds: self.preds   = detuplify(to_concat(self.preds, dim=self.concat_dim))
     94         if not self.save_targs: self.targets = detuplify(to_concat(self.targets, dim=self.concat_dim))
     95         if self.with_loss:      self.losses  = to_concat(self.losses)

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __getattr__(self, k)
    221             attr = getattr(self,self._default,None)
    222             if attr is not None: return getattr(attr, k)
--> 223         raise AttributeError(k)
    224     def __dir__(self): return custom_dir(self, self._dir() if self._xtra is None else self._dir())
    225 #     def __getstate__(self): return self.__dict__

AttributeError: preds

Calling show_batch on the DataLoader works, though.

(Also, I’m using the git versions of fastai2 and fastcore.)

(By the way, for these issues would it be easier to open them on GitHub, since you are all on a book push?)

That’s a good idea.


Got it :slight_smile: Focus on that book :wink:

You’ve added a “normalize” parameter to cnn_learner that makes it easy to include the arch-specific ImageNet normalisation without adding it to batch_tfms in DataBlock.dataloaders.
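
For reference, a rough sketch of what that saves (assuming the flag behaves as described here; resnet34 is just an example arch):

learn = cnn_learner(dls, resnet34, normalize=True)
# ...instead of specifying the stats by hand in the DataBlock, e.g.:
# batch_tfms=[*aug_transforms(), Normalize.from_stats(*imagenet_stats)]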

Would there be a reason not to add it to unet_learner too? @sgugger

I did but forgot to push. It should be there now.


Thanks!

@sgugger (when you have time later) I’m running into a bit of a conundrum. My transform (which runs at the tail end of absolutely everything else) needs the size of the image we transformed into. How do I access this if my transform has an order of 99? I.e. we have the following:

#export
class TensorMaps(Transform):
  "Convert points to heatmaps"
  order = 99
  def __init__(self, sigma=1): self.sigma=sigma

  def encodes(self, x:TensorPoint):
    x = (x+1)*224/2  # rescale points from [-1,1] to pixel coordinates
    return _generate_maps(x, self.sigma, (224,224))

The issue is that 224 is the size of my transformed image. What would be a better way to get the transformed image’s size, so I don’t need to hard-code it? Should I make the incoming x both a TensorPoint and a TensorImage, so that if it finds a pair of them it can extract the transformed dimensions? Thanks!
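
One possibility, as a minimal sketch: fastai2’s PointScaler stashes the image size on the TensorPoint as img_size metadata, so if that metadata survives to order 99 in your pipeline (an assumption worth verifying in your install), you could read it instead of hard-coding 224. _generate_maps is your helper from above:

from fastai2.vision.all import *

class TensorMaps(Transform):
  "Convert points to heatmaps"
  order = 99
  def __init__(self, sigma=1): self.sigma=sigma

  def encodes(self, x:TensorPoint):
    w,h = x.get_meta('img_size')         # (w,h) stashed by PointScaler -- assumption, check your version
    x = (x+1) * tensor([w,h]).float()/2  # rescale from [-1,1] to pixel coordinates
    return _generate_maps(x, self.sigma, (h,w))  # (h,w) order assumed; your example was square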


I’m trying to modify the DataBlock API (or at least a part of it) so that I can have a 4-channel image input (now that unet_learner allows multichannel inputs).

The main thing I’m doing here is creating a new sort of PILImage named “MultiChannelImage”.
When calling:

b = dls.one_batch()
b[0].shape, b[0].dtype, b[1].shape, b[1].dtype

I get:

(torch.Size([4, 4, 512, 512]), torch.uint8, torch.Size([4, 512, 512]), torch.int64)

It seems that the IntToFloatTensor transform from the ImageBlock is not applied, so I get a uint8 batch for the x input.

Can you tell what I’m missing, and more generally whether I’ve overcomplicated things? @sgugger

dim = 512

def _cam_lbl(x):
    return get_mask(x)

def _cam_fn(x):
    mask_embedding = get_mask_embedding(x)

    img = np.array(PIL.Image.open(x).convert('RGB'))

    # construct the 4 channel image: RGB plus the mask embedding as a 4th channel
    new_img = np.zeros(img.shape[:2] + (4,), dtype=img.dtype)
    new_img[..., :3] = img
    new_img[..., 3] = mask_embedding

    return new_img


class MultiChannelImage(PILBase):
    def show(self, ctx=None, **kwargs):
        img = image2tensor(self)[:3]  # only display the RGB channels
        return show_image(img, ctx=ctx, **merge(self._show_args, kwargs))


class TensorMultiChannelImage(TensorImageBase):
    def show(self, ctx=None, **kwargs):
        return show_image(self[0], ctx=ctx, **{**self._show_args, **kwargs})


MultiChannelImage._tensor_cls = TensorMultiChannelImage
MultiChannelImage.create = Transform(MultiChannelImage.create)

@ToTensor
def encodes(self, o:MultiChannelImage): return o._tensor_cls(image2tensor(o))

camvid = DataBlock(blocks=(ImageBlock(cls=MultiChannelImage), MaskBlock),
                   item_tfms=[Resize(dim)],
                   get_x=_cam_fn,
                   get_y=_cam_lbl,
                   splitter=RandomSplitter())

dls = camvid.dataloaders(source=items, bs=4, num_workers=16, device=default_device())

I don’t see you defining anywhere that IntToFloatTensor is applied. Can you try including it in batch_tfms and then doing a camvid.summary() to see if it’s applied at all? (If it shows up twice it’s being applied; if not, you need to include it.) To use the summary you’ll need the dev install, and you pass in the path.
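
Something like this (a sketch of the suggestion above, reusing the names from your snippet):

camvid = DataBlock(blocks=(ImageBlock(cls=MultiChannelImage), MaskBlock),
                   item_tfms=[Resize(dim)],
                   batch_tfms=[IntToFloatTensor()],  # added explicitly
                   get_x=_cam_fn,
                   get_y=_cam_lbl,
                   splitter=RandomSplitter())
camvid.summary(items)  # prints each pipeline step on one sample and one batch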

The definition of ImageBlock is the following:

def ImageBlock(cls=PILImage): return TransformBlock(type_tfms=cls.create, batch_tfms=IntToFloatTensor)

and I assumed that ImageBlock(cls=MultiChannelImage) would still benefit from “batch_tfms=IntToFloatTensor”.

I’ll try what you said anyway and report back right away.

But more generally, isn’t there an easier way to have a multichannel input with DataBlocks?


Possibly. Consider this: PILImage, PILImageBW, and PILMask all do essentially the same thing. Their difference:
PILImage - 3-channel RGB
PILImageBW - 1-channel greyscale
PILMask - 1-channel mask

What changes is their create functions.
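
For reference, this is roughly how the built-in types differ (paraphrased from memory of fastai2.vision.core; check the source for the exact definitions):

class PILImage(PILBase): pass  # opened in 'RGB' mode (3 channels)
class PILImageBW(PILImage):
    _show_args,_open_args = {'cmap':'Greys'},{'mode':'L'}  # 1 channel
class PILMask(PILBase):
    _open_args,_show_args = {'mode':'L'},{'alpha':0.5,'cmap':'tab20'}  # 1 channel

So a 4-channel type mostly needs its open mode changed (e.g. _open_args = {'mode':'RGBA'}), which your create-from-ndarray path already achieves, as your summary’s mode=RGBA shows.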

Should I include IntToFloatTensor in the batch_tfms parameter of the DataBlock constructor, or of the camvid.dataloaders method? Or is it the same?

In any case, here is the summary:
Setting-up type transforms pipelines
Collecting items from [‘1446367-1_ind_1_box_103_568_1516_1663.jpg’]
Found 1 items
2 datasets of sizes 1,0
Setting up Pipeline: _cam_fn -> PILBase.create
Setting up Pipeline: _cam_lbl -> PILBase.create

Building one sample
  Pipeline: _cam_fn -> PILBase.create
    starting from
      1446367-1_ind_1_box_103_568_1516_1663.jpg
    applying _cam_fn gives
      [[[255 255 255   0]
  [255 255 255   0]
  [255 255 255   0]
  ...
  [255 255 255   0]
  [255 255 255   0]
  [255 255 255   0]]

 
  ...
  [255 255 255   0]
  [255 255 255   0]
  [255 255 255   0]]]
    applying PILBase.create gives
      <__main__.MultiChannelImage image mode=RGBA size=1413x1095 at 0x7F27706302B0>
  Pipeline: _cam_lbl -> PILBase.create
    starting from
      1446367-1_ind_1_box_103_568_1516_1663.jpg
    applying _cam_lbl gives
      [[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
    applying PILBase.create gives
      <fastai2.vision.core.PILMask image mode=L size=1413x1095 at 0x7F2770630320>

Final sample: (<__main__.MultiChannelImage image mode=RGBA size=1413x1095 at 0x7F2770630908>, <fastai2.vision.core.PILMask image mode=L size=1413x1095 at 0x7F27706308D0>)


Setting up after_item: Pipeline: AddMaskCodes -> Resize -> ToTensor
Setting up before_batch: Pipeline: 
Setting up after_batch: Pipeline: IntToFloatTensor

Building one batch
Applying item_tfms to the first sample:
  Pipeline: AddMaskCodes -> Resize -> ToTensor
    starting from
      (<__main__.MultiChannelImage image mode=RGBA size=1413x1095 at 0x7F2770630588>, <fastai2.vision.core.PILMask image mode=L size=1413x1095 at 0x7F2770630668>)
    applying AddMaskCodes gives
      (<__main__.MultiChannelImage image mode=RGBA size=1413x1095 at 0x7F2770630588>, <fastai2.vision.core.PILMask image mode=L size=1413x1095 at 0x7F2770630668>)
    applying Resize gives
      (<__main__.MultiChannelImage image mode=RGBA size=512x512 at 0x7F2770630898>, <fastai2.vision.core.PILMask image mode=L size=512x512 at 0x7F2770630438>)
    applying ToTensor gives
      (TensorMultiChannelImage of size 4x512x512, TensorMask of size 512x512)

Adding the next 3 samples

The only way to make it work seems to be adding “.float()” to this:

@ToTensor
def encodes(self, o:MultiChannelImage): return o._tensor_cls(image2tensor(o).float())

But I still don’t understand why I should have to do this.

I’d add it to the DataBlock's batch_tfms. But we can also see that IntToFloatTensor is being applied here (look at the after_batch pipeline in your summary).

Yes, I’ve seen it. But if I remove “.float()” from the encodes definition it doesn’t work anymore and I get a uint8 tensor.

I think I found a plausible reason: IntToFloatTensor’s encodes is registered only for o:TensorImage and o:TensorMask, and not for my TensorMultiChannelImage implementation…


TypeDispatch! That would do it! :slight_smile:

Help! :slight_smile:

Would type dispatch help to add a new method:
def encodes(self, o:TensorMultiChannelImage): return BLAH
to IntToFloatTensor?

Make it inherit from TensorImage?
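
In sketch form, either option (assuming the class-decorator pattern used for ToTensor earlier in the thread also works for IntToFloatTensor, and that self.div is its stored divisor; verify against your install):

# Option 1: make the tensor type a TensorImage subclass, so the existing
# encodes(self, o:TensorImage) in IntToFloatTensor already dispatches to it.
class TensorMultiChannelImage(TensorImage):
    def show(self, ctx=None, **kwargs):
        return show_image(self[0], ctx=ctx, **{**self._show_args, **kwargs})

# Option 2: register a new encodes for the custom type, the same way an
# encodes was added to ToTensor above.
@IntToFloatTensor
def encodes(self, o:TensorMultiChannelImage): return o.float().div_(self.div)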