Fastai v2 vision

In fact I’ve just realised that I don’t want IntToFloatTensor applied to my TensorMultiChannelImage, because the first 3 channels correspond to an RGB image (0 <-> 255) and the 4th to a mask (0 <-> number of labels). So I’m defining a custom IntToFloatTensor.

@ToTensor
def encodes(self, o:MultiChannelImage):
    t = image2tensor(o).float()
    t[:3] = t[:3].div_(255)
    t[3:] = t[3:].div_(3)
    return o._tensor_cls(t)

Question: Is it possible to integrate this into the TensorMultiChannelImage or MultiChannelImage class?

I’m wondering if and how Normalize.from_stats(*imagenet_stats) is applied. It should be applied to the first 3 channels only.

Should I do this normalisation in my custom encodes too?

Or should I simply override the IntToFloatTensor and Normalize transforms and add handling for x:TensorMultiChannelImage?

Things are a bit of a mess in my head right now, but I think overriding the existing transforms would be better.

I ran into the same situation and just created methods for my personal use case.

@IntToFloatTensor
def encodes…
def decodes…

For each type, just inherit from the most similar one, but be careful about which transforms will be applied to it. The best approach is to register project-specific encodes/decodes methods on each transform, along the lines of the sketch below.
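
For the 4-channel case above, a rough sketch of that approach could look like the following (untested; it assumes IntToFloatTensor and Normalize run as batch transforms, so tensors are (bs, channels, h, w), and that channel 3 is a mask with 3 labels):

@IntToFloatTensor
def encodes(self, o:TensorMultiChannelImage):
    t = o.float()
    t[:, :3] = t[:, :3] / 255.   # RGB channels scaled to [0, 1]
    t[:, 3:] = t[:, 3:] / 3.     # mask channel scaled by its number of labels
    return type(o)(t)

@Normalize
def encodes(self, x:TensorMultiChannelImage):
    # normalise only the RGB channels with the stats given to Normalize;
    # the mask channel passes through untouched
    x[:, :3] = (x[:, :3] - self.mean) / self.std
    return x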

2 Likes

I finally implemented custom IntToFloatTensor and Normalize transforms that inherit from the base classes and suit my input well, overriding the encodes/decodes methods. They are fairly generic, so I might open a pull request: you provide a delegate function in the constructor that generates the image and mask(s), and the transforms concatenate them automatically.

The next experiment will be to replace the mask with a pixel-embedding mask (a sort of mixed input: image + a text/word embedding corresponding to each pixel; I do a lot of document semantic segmentation) and see if the same trick works without modifying the actual unet_learner.

I have been stuck on this for several days, so I thought I’d post it here.

There seems to be something odd with memory management after finishing an epoch. I tried to run the fastai U-Net on the Synthia dataset, and this is the error I get despite a batch size of 1 with a crop to 360 on an 11 GB K80. I must be missing something, because CamVid runs fine (it also does not seem to happen if I only load 10 examples into the dataset):

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-637f3e7802b9> in <module>()
----> 1 learn.fit_one_cycle(10,slice(1e-6,1e-3), cbs=WandbCallback())

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
     88     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
     89               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
---> 90     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
     91 
     92 # Cell

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    290                         self._do_epoch_validate()
    291                     except CancelEpochException:   self('after_cancel_epoch')
--> 292                     finally:                       self('after_epoch')
    293 
    294             except CancelFitException:             self('after_cancel_fit')

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/learner.py in __call__(self, event_name)
    221     def ordered_cbs(self, cb_func:str): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
    222 
--> 223     def __call__(self, event_name): L(event_name).map(self._call_one)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363 
    364     def filter(self, f, negate=False, **kwargs):

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40 
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209 
    210 # Cell

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/learner.py in _call_one(self, event_name)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)
--> 226         [cb(event_name) for cb in sort_by_run(self.cbs)]
    227 
    228     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/learner.py in <listcomp>(.0)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)
--> 226         [cb(event_name) for cb in sort_by_run(self.cbs)]
    227 
    228     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/learner.py in __call__(self, event_name)
     23         _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
     24                (self.run_valid and not getattr(self, 'training', False)))
---> 25         if self.run and _run: getattr(self, event_name, noop)()
     26 
     27     @property

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/callback/wandb.py in after_epoch(self)
     64         if self.log_preds:
     65             b = self.valid_dl.one_batch()
---> 66             self.learn.one_batch(0, b)
     67             preds = getattr(self.loss_func, 'activation', noop)(self.pred)
     68             out = getattr(self.loss_func, 'decodes', noop)(preds)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
    246         try:
    247             self._split(b);                                  self('begin_batch')
--> 248             self.pred = self.model(*self.xb);                self('after_pred')
    249             if len(self.yb) == 0: return
    250             self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/layers.py in forward(self, x)
    415         for l in self.layers:
    416             res.orig = x
--> 417             nres = l(res)
    418             # We have to remove res.orig to avoid hanging refs and therefore memory leaks
    419             res.orig = None

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/vision/models/unet.py in forward(self, up_in)
     38         if ssh != up_out.shape[-2:]:
     39             up_out = F.interpolate(up_out, s.shape[-2:], mode='nearest')
---> 40         cat_x = self.relu(torch.cat([up_out, self.bn(s)], dim=1))
     41         return self.conv2(self.conv1(cat_x))
     42 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/activation.py in forward(self, input)
     92 
     93     def forward(self, input):
---> 94         return F.relu(input, inplace=self.inplace)
     95 
     96     def extra_repr(self):

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/functional.py in relu(input, inplace)
    911         result = torch.relu_(input)
    912     else:
--> 913         result = torch.relu(input)
    914     return result
    915 

RuntimeError: CUDA out of memory. Tried to allocate 508.00 MiB (GPU 0; 11.17 GiB total capacity; 10.43 GiB already allocated; 4.81 MiB free; 419.63 MiB cached)

So far I have been unable to replicate this on CamVid, but I will keep trying. The odd thing is that on CamVid (essentially the same kind of dataset), with bs=8 and a much larger image size, GPU memory usage is only ~50%.

Further interesting aspects:

  • This is despite the fact that Synthia takes ~80% of GPU memory with a much smaller batch size and resolution.
  • One potential culprit I am investigating now is the number of classes: Synthia does not seem to run with fewer than 24 for n_out, although theoretically the dataset should have 15 classes.
  • WandbCallback() seems to have a substantial GPU memory footprint. I had noticed it before, but now the impact is substantial (this is a wandb problem though, as it happens even without fastai, from the CLI).

Can you try without WandbCallback, and also with WandbCallback(log_preds=False)?

I had the same issue where the callback uses too much memory, partly referenced here, though it also mentions an issue specifically related to TfmdLists: https://github.com/fastai/fastai2/issues/70

Oh, it has the old stack trace. I am now running without WandbCallback, but I’m not yet 100% sure whether that will work.

EDIT: Ok, update: it seems to be working without the WandbCallback.

Augmentation issue. I’m creating the following dataloader:

tfms = [[_cam_fn, PILImage.create], 
        [_cam_lbl, PILMask.create]]

dsets = Datasets(items, tfms, splits=RandomSplitter(seed=2020)(items))
dls = dsets.dataloaders(bs=4, source=items, num_workers=16, 
                        after_item = [Resize(dim), ToTensor()],
                        after_batch=[IntToFloatTensor(), Brightness(), Contrast()])

dls.show_batch(max_n=1, vmin=0, vmax=10, figsize=(12, 12))

Is it normal to have rotate/crop augmentations when showing a batch?

Yes, show_batch shows you the augmented versions so you can see if you did too little/too much.

I’ve just found the issue. It was related to the default Resize method (which is crop). I’m doing document segmentation, and I prefer squish resizing so that the prediction mask corresponds to the entire document and not just to a part of it; something like the sketch below.
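
A minimal sketch of that change, assuming the same after_item list as in the dataloader above:

after_item = [Resize(dim, method=ResizeMethod.Squish), ToTensor()]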

I’m trying to get the transformed size of my image in a TensorPoint object. This is my transform:

#export
class TensorMaps(Transform):
  "Convert points into heatmaps"
  order = 99
  def __init__(self, sig=1): self.sig=sig

  def encodes(self, x:TensorPoint):
    size = np.array(x.get_meta('img_size'))   # image size stored on the TensorPoint by PointScaler
    pts = (x + 1) * size / 2                  # rescale points from [-1, 1] back to pixel coordinates
    return _generate_targs(pts, self.sig, size)

However, when I try to use that meta on a dbunch, get_meta('img_size') returns None, so the rescaling step in encodes fails with a NoneType error. I have it set up as an item_tfms in my transform block, which should ideally run just before everything gets collated into a databunch. Is there a different way I should be grabbing it? (Also, the block itself has TensorPoint type transforms and a PointScaler item transform, so the size should be encoded in, yes?)

I also tried doing it as a batch transform instead, but get_meta still returned None.

I think I found the solution. It needs to be a before_batch similar to bbox_pad, so we have:

def tensormap(samples):
  def _f(img, pnt):
    size = np.array(img.size())               # (c, h, w) of the already-resized image tensor
    hmp = generate_targs(pnt, 1, size[1:])    # build the heatmap at the image's (h, w)
    return img, hmp
  return [_f(*s) for s in samples]
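
It can then be plugged in through before_batch, e.g. (a sketch, assuming a Datasets object like the one in the earlier dataloader example):

dls = dsets.dataloaders(bs=4, before_batch=[tensormap],
                        after_item=[Resize(dim), ToTensor()],
                        after_batch=[IntToFloatTensor()])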

@sgugger by the way: is it useful to do crop/resize (with the crop method) augmentations when doing segmentation? How can we later use the model to predict a mask for a full image? My inputs have various sizes and aspect ratios, and I’ve always done squish resizing (256x256, 512x512, etc.) for segmentation without asking myself whether this is the best thing to do.

Any ideas or suggestions?

1 Like

Of course. I was just surprised to see crops in the augmentations and didn’t know where they came from… from the resizing, as it turns out.

Did you check that, with just the PointScaler, the TensorPoint you get does have that metadata?
Otherwise, the before_batch solution works, it’s just not as elegant :wink:

1 Like

Will check for that, but I have a related question. If I want to use a pipeline on the GPU, like so:

pipe = Pipeline([PILImage.create, ToTensor(), Normalize.from_stats(*imagenet_stats, cuda=True)])

How do I ensure that it uses the GPU? I get a runtime error because a CPU tensor was expected instead, and I can’t see where to pass in a device. Or should I change the device of the input image after it’s passed into the pipeline?

Also, I’ll have to rethink my approach, as the model isn’t quite training the way I expect. I’m going to try loading their PyTorch dataloaders and working down from there.

Normalize is a batch transform, ToTensor is an item transform, and PILImage.create is a type transform, so you can’t really apply this pipeline as is.
The device you want to use is specified in your DataLoader init.
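
For example (a sketch; dsets here stands for whatever Datasets/DataBlock pipeline you already have):

dls = dsets.dataloaders(bs=8, device=torch.device('cuda'),
                        after_item=[ToTensor()],
                        after_batch=[IntToFloatTensor(), Normalize.from_stats(*imagenet_stats, cuda=True)])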

2 Likes

So a better way to do this (since I want to apply it to a single image with zero resizing) would be to make either a DataBlock or a DataLoader from scratch and pass in the single image? (What I am currently doing is running it raw, i.e.

t_img = pipe('filename')
with torch.no_grad():
   res = learn.model(t_img.cuda())
1 Like

A DataBlock could work as well, since you can give it any source you like. That source can be a list of one filename.
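
For a single image, a sketch along these lines (reusing the learner’s existing transforms via test_dl) may also work:

dl = learn.dls.test_dl(['filename'])   # the source is just a one-element list
preds, _ = learn.get_preds(dl=dl)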

Continuing on the multichannel image topic, I work with both multispectral satellite images (4-13 channels) and hyperspectral aerial images (300+ channels). One of the main challenges here is that we can’t use Pillow, because it doesn’t support the GeoTIFF (.tif) format or more than 4 channels (unless we use sequences of Image objects). Also, hyperspectral images are often processed with 3D CNNs. In fastai v1, I did it like this.

I’m guessing something like the following is a good start for rewriting the previous version to work with fastai2. I’m also toying with the idea of adding something like Sentinel2Image (for Sentinel-2 images, obviously) and even Sentinel1Image (dual-pol SAR) functionality, so if anyone else has the same ideas I’d be happy to be involved.

import rasterio as rio
from fastai2.basics import *
from fastai2.vision.all import *

def open_npy(fn, dims=2, chans=None):
    im = torch.from_numpy(np.load(fn))
    if dims == 3: im = im[None]
    if chans is not None: im = im[chans]
    return im

def open_geotiff(fn, dims=2, chans=None):
    with rio.open(fn) as f:
        data = f.read()
        data = data.astype(np.float32)
    im = torch.from_numpy(data)
    if chans is not None: im = im[chans]
    if dims == 3: im = im[None]
    return im

class MultiChannelTensorImage(TensorImage):
    _show_args = ArrayImageBase._show_args
    def show(self, channels, ctx=None, **kwargs):
        # forward ctx so this plays nicely with show_batch grids
        return show_composite(self, channels=channels, ax=ctx, **{**self._show_args, **kwargs})

def show_composite(img, channels, ax=None, figsize=(3,3), 
                   hide_axis=False, **kwargs)->plt.Axes:
    "Show three channel composite so that channels correspond to R, G and B"
    if ax is None: fig, ax = plt.subplots(figsize=figsize)    
    r, g, b = channels
    tempim = img.data.cpu().numpy()
    if len(tempim.shape) == 4: tempim = tempim[0]
    im = np.zeros((tempim.shape[1], tempim.shape[2], 3))
    im[...,0] = tempim[r]
    im[...,1] = tempim[g]
    im[...,2] = tempim[b]
    ax.imshow(norm(im))
    if hide_axis: ax.axis('off')
    return ax

def norm(vals):
    # Just to show clearer images
    return (vals - vals.min())/(vals.max()-vals.min())
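
One way to wire this into the data block API might be something like the following sketch (open_ms_image and MultiChannelImageBlock are hypothetical helper names, not existing fastai2 symbols):

def open_ms_image(fn, chans=None):
    "Open a geotiff and wrap it in the custom tensor type"
    return MultiChannelTensorImage(open_geotiff(fn, chans=chans))

def MultiChannelImageBlock(chans=None):
    return TransformBlock(type_tfms=partial(open_ms_image, chans=chans))
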
6 Likes

How should I deal with multimodal data? I have to pass image and text data as input and want to build a model based on both inputs. Since I have two continuous values in the dataframe, and the image path is also there as the first column, I’m trying to do something like this:

def _multimodal_items(x):
  print(x)
  return (f'{path}/TR/{x[0]}', x[1], x[2], x[3])

multimodal = DataBlock(blocks=[ImageBlock, RegressionBlock, RegressionBlock, CategoryBlock],
                       get_items=_multimodal_items,
                       n_inp=3,
                       item_tfms=Resize(size),
                       batch_tfms=[*aug_transforms(max_zoom=0, flip_vert=True)])

My dataframe looks like this:

Looking forward to a quick response. Thanks!

2 Likes