Using albumentations with fastai v2

Nice, thank you!!! I will try it later and let you know if there is any kind of problem!

I have tried the code, @sgugger. summary is working, but show_batch raises the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-3028df996ec0> in <module>
     11 dataset1.summary(path_images)
     12 dls = dataset1.dataloaders(path_images,bs=bs)
---> 13 dls.show_batch(vmin=0,vmax=1,figsize=(12, 9))

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/core.py in show_batch(self, b, max_n, ctxs, show, unique, **kwargs)
     95             old_get_idxs = self.get_idxs
     96             self.get_idxs = lambda: Inf.zeros
---> 97         if b is None: b = self.one_batch()
     98         if not show: return self._pre_show_batch(b, max_n=max_n)
     99         show_batch(*self._pre_show_batch(b, max_n=max_n), ctxs=ctxs, max_n=max_n, **kwargs)

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in one_batch(self)
    130     def one_batch(self):
    131         if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
--> 132         with self.fake_l.no_multiproc(): res = first(self)
    133         if hasattr(self, 'it'): delattr(self, 'it')
    134         return res

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastcore/utils.py in first(x)
    175 def first(x):
    176     "First element of `x`, or None if missing"
--> 177     try: return next(iter(x))
    178     except StopIteration: return None
    179 

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in __iter__(self)
     96         self.randomize()
     97         self.before_iter()
---> 98         for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
     99             if self.device is not None: b = to_device(b, self.device)
    100             yield self.after_batch(b)

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    343 
    344     def __next__(self):
--> 345         data = self._next_data()
    346         self._num_yielded += 1
    347         if self._dataset_kind == _DatasetKind.Iterable and \

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_data(self)
    383     def _next_data(self):
    384         index = self._next_index()  # may raise StopIteration
--> 385         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    386         if self._pin_memory:
    387             data = _utils.pin_memory.pin_memory(data)

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
     32                 raise StopIteration
     33         else:
---> 34             data = next(self.dataset_iter)
     35         return self.collate_fn(data)
     36 

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in create_batches(self, samps)
    105         self.it = iter(self.dataset) if self.dataset is not None else None
    106         res = filter(lambda o:o is not None, map(self.do_item, samps))
--> 107         yield from map(self.do_batch, self.chunkify(res))
    108 
    109     def new(self, dataset=None, cls=None, **kwargs):

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in do_batch(self, b)
    126     def create_item(self, s):  return next(self.it) if s is None else self.dataset[s]
    127     def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b)
--> 128     def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b)
    129     def to(self, device): self.device = device
    130     def one_batch(self):

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in create_batch(self, b)
    125     def retain(self, res, b):  return retain_types(res, b[0] if is_listy(b) else b)
    126     def create_item(self, s):  return next(self.it) if s is None else self.dataset[s]
--> 127     def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b)
    128     def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b)
    129     def to(self, device): self.device = device

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in fa_collate(t)
     44     b = t[0]
     45     return (default_collate(t) if isinstance(b, _collate_types)
---> 46             else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence)
     47             else default_collate(t))
     48 

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in <listcomp>(.0)
     44     b = t[0]
     45     return (default_collate(t) if isinstance(b, _collate_types)
---> 46             else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence)
     47             else default_collate(t))
     48 

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/load.py in fa_collate(t)
     45     return (default_collate(t) if isinstance(b, _collate_types)
     46             else type(t[0])([fa_collate(s) for s in zip(*t)]) if isinstance(b, Sequence)
---> 47             else default_collate(t))
     48 
     49 # Cell

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     79         return [default_collate(samples) for samples in transposed]
     80 
---> 81     raise TypeError(default_collate_err_msg_format.format(elem_type))

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'fastai2.vision.core.PILImage'>

Creating a learner with that Dataset gives the exact same error!

It looks like the validation set is not getting the AddMasks, Resize, ToTensor and Normalize transforms!
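A quick way to verify that (a sketch only, assuming the fastai2 TfmdDL API where the item- and batch-level pipelines are exposed as after_item and after_batch) is to print the pipelines of each DataLoader built above:

# `dls` is the DataLoaders built above with dataset1.dataloaders(...)
print(dls.train.after_item)   # item transforms applied to the training set (e.g. Resize, ToTensor)
print(dls.valid.after_item)   # if transforms are missing here, items reach collate as raw PILImage
print(dls.valid.after_batch)  # batch-level transforms such as Normalize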

Please stop that behavior where you reply, re-reply, tag, re-tag in an effort to attract attention to your post. I have millions of things to do, two kids at home to homeschool, and I don't have time for those distractions. The only thing you got with this is that I now ignore your messages, and if you keep going I will just block you.

Be patient, try to debug it yourself, and wait for people in the community to reply. Never tag someone in particular unless that person is the only one who can help you. That is not the case here.

I tagged you because you updated the tutorial and said that the error in fastcore's ItemTransform was solved.

There is no documentation of how split_idx works and no explanation in fastbook, so it is difficult to understand.

Sorry for trying to get attention; I am used to seeing people do that in forums that don't have the option to ask for attention. If it is bad behaviour in this forum, I'll stop doing it!

If you look, split_idx is from fastcore. The first hint is that it wasn't in the fastai2 documentation; it was only referenced. The two libraries go hand in hand. From the fastcore documentation:

Filtering based on the dataset type - By setting the split_idx flag you can make the transform be used only in a specific DataSource subset like in training, but not validation.

Further down:

If the transform has split_idx then it's only applied if the split_idx param matches.

With this example:

f.split_idx = 1
test_eq(f(1, split_idx=1),2)
test_eq_type(f(1, split_idx=0), 1)

f is then defined as follows (from the documentation):

def func(x): return Int(x+1)
def dec (x): return x-1
f = Transform(func,dec)
t = f(1)

So we can see that if we call it with a split_idx of 1 it'll be applied (and the result is 2), and when it is 0 it's not (and the result is 1).

This was taken from under "Main Transform features":

http://fastcore.fast.ai/transform#Transform

You replied first and I had seen your reply. It's just that I will look into it when I have time. No need to add the mention in an edit later.

I was not trying to bother you, far from it; take a look whenever you have time.

If I have bothered you, I apologize :disappointed_relieved:

Okay.

So, training has index 0 and validation has index 1?

Correct! :slight_smile:
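
As a minimal sketch of that behaviour (using only fastcore's Transform, as in the documentation example quoted above), a transform with split_idx = 0 only runs on the training set:

from fastcore.transform import Transform

class TrainOnly(Transform):
    "Illustrative transform restricted to the training split"
    split_idx = 0                      # 0 = training set, 1 = validation set
    def encodes(self, x): return x + 1

f = TrainOnly()
print(f(1, split_idx=0))  # 2 -> applied, the split_idx matches
print(f(1, split_idx=1))  # 1 -> skipped on the validation split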

:thinking: Interesting

So, I don't understand why I am getting an error if I use split_idx=0 on AlbumentationsSegmentationWrapper. :joy:

I have introduced a bug in fastcore while fixing another. Working on a fix now, and adding more robust tests.

Good to know!

Fixed now, and added enough tests to avoid regression.
There is also an example at the end of the pets tutorial.
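For anyone reading later who cannot open the tutorial, the pattern it demonstrates is roughly the following (a sketch only, not the tutorial's exact code; the class name and the albumentations pipeline are made up here for illustration):

import numpy as np
import albumentations as A
from fastcore.transform import ItemTransform
from fastai2.vision.all import PILImage, PILMask

class SegmentationAlbumentationsTransform(ItemTransform):
    "Apply an albumentations augmentation to image and mask together"
    split_idx = 0                                  # train-only, as discussed above
    def __init__(self, aug): self.aug = aug
    def encodes(self, x):
        img, mask = x
        aug = self.aug(image=np.array(img), mask=np.array(mask))
        return PILImage.create(aug['image']), PILMask.create(aug['mask'])

# example albumentations pipeline to wrap
aug_pipeline = A.Compose([A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.2)])
tfm = SegmentationAlbumentationsTransform(aug_pipeline)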

Thank you very much!! :blush:

I'll look into the example.

Sorry for my bad behaviour before. :pensive:

No problem, just remember that one reply is enough, and no tagging is necessary :wink:

Does it work with the DataBlock API, or only via after_item in Datasets.dataloaders()?

It should work with the data block API normally.
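For example, both wirings below should be possible (a rough sketch reusing the wrapper from the earlier sketch; codes, get_msk and path_images are placeholders for your own class list, mask-getter and image path):

from fastai2.vision.all import *   # fastai2-era import; on current fastai use fastai.vision.all

# 1) DataBlock API: the wrapper goes in item_tfms (DataBlock adds ToTensor/IntToFloatTensor itself)
dblock = DataBlock(
    blocks=(ImageBlock, MaskBlock(codes)),            # `codes`: placeholder list of class names
    get_items=get_image_files,
    get_y=get_msk,                                     # `get_msk`: placeholder image-path -> mask fn
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    item_tfms=[Resize(256), SegmentationAlbumentationsTransform(aug_pipeline)],
    batch_tfms=[Normalize.from_stats(*imagenet_stats)])
dls = dblock.dataloaders(path_images, bs=8)

# 2) Datasets API: pass the wrapper through after_item when building the dataloaders
# dls = dataset1.dataloaders(bs=bs,
#           after_item=[SegmentationAlbumentationsTransform(aug_pipeline), ToTensor()])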

I have trained some models and I can confirm that it is working correctly now. Thank you!!!
