Binary segmentation - loss function and metrics to get a CamVid-style unet to work?

I’m trying to get the CamVid-style unet segmentation to work for a binary segmentation and it’s proving very, very difficult…
I can run the camvid example with no problem, so clearly something is amiss with my loss function or something similar.

1 - I am able to show a batch and confirm that my x and y shapes match via one_batch():
```
x, y = dls.one_batch()

x.shape
torch.Size([2, 3, 171, 228])

y.shape
torch.Size([2, 171, 228])
```
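
For good measure, the mask values can be sanity-checked the same way. A quick probe (a sketch assuming a standard MaskBlock/codes setup - `dls.vocab` may or may not be populated depending on how the DataBlock was built):

```python
y.unique()        # binary masks should contain exactly the two codes, e.g. tensor([0, 1])
x.min(), x.max()  # images should be normalized floats, not raw 0-255 ints
dls.vocab         # should list exactly two codes (if the DataBlock exposes it)
```

Stray values in the mask (anything outside 0..n_classes-1) are a classic cause of the “CUDA device-side assert triggered” errors that come up later in this thread.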

Yet when I actually try to fit, I either:

A - get a CUDA out-of-memory error if I use the camvid accuracy metric (acc_camvid), adjusted to remove void since I don’t have a void class… this is with bs=2 on a V100 16GB that runs the camvid examples with no issue, and my images are smaller…

or

B - if I change to just metrics=accuracy, get an assertion error that target and input don’t match. The target size seems correct (bs=2 × 171 × 228 = 77,976), yet the input somehow has only 684 elements??

So, does anyone know the correct metric and/or loss function to use to adapt the camvid example for binary segmentation?
Thanks!

@LessW2020 what is your batch size? (my mind is foggy right now on what I’m trying to figure out by reading this :slight_smile: )

bs=2 just to try and get it to work … :slight_smile:


Let me post my notebook… I spent way too long today fighting just to get a simple binary segmentation to work :slight_smile:


Can you post the full message it gives you? (the stack trace)

Sure - here it is. Not sure why the ‘’’ enclosure doesn’t force it into code mode?

```
epoch     train_loss  valid_loss  accuracy  time
0         0.663077    0.484067    None      00:03

AssertionError                            Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 learn.fit_one_cycle(5, slice(lr), pct_start=0.9, wd=1e-2)

~/fastai2/fastai2/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    110     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    111               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 112     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    113
    114 # Cell

~/fastai2/fastai2/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    175                 self.epoch=epoch; self('begin_epoch')
    176                 self._do_epoch_train()
--> 177                 self._do_epoch_validate()
    178             except CancelEpochException: self('after_cancel_epoch')
    179             finally: self('after_epoch')

~/fastai2/fastai2/fastai2/learner.py in _do_epoch_validate(self, ds_idx, dl)
    157             dl,old,has = change_attrs(dl, names, [False,False])
    158             self.dl = dl; self('begin_validate')
--> 159             with torch.no_grad(): self.all_batches()
    160         except CancelValidException: self('after_cancel_validate')
    161         finally:

~/fastai2/fastai2/fastai2/learner.py in all_batches(self)
    125     def all_batches(self):
    126         self.n_iter = len(self.dl)
--> 127         for o in enumerate(self.dl): self.one_batch(*o)
    128
    129     def one_batch(self, i, b):

~/fastai2/fastai2/fastai2/learner.py in one_batch(self, i, b)
    139             self.opt.zero_grad()
    140         except CancelBatchException: self('after_cancel_batch')
--> 141         finally: self('after_batch')
    142
    143     def _do_begin_fit(self, n_epoch):

~/fastai2/fastai2/fastai2/learner.py in __call__(self, event_name)
    106     def ordered_cbs(self, cb_func): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
    107
--> 108     def __call__(self, event_name): L(event_name).map(self._call_one)
    109     def _call_one(self, event_name):
    110         assert hasattr(event, event_name)

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363
    364     def filter(self, f, negate=False, **kwargs):

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206         if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209
    210 # Cell

~/fastai2/fastai2/fastai2/learner.py in _call_one(self, event_name)
    109     def _call_one(self, event_name):
    110         assert hasattr(event, event_name)
--> 111         [cb(event_name) for cb in sort_by_run(self.cbs)]
    112
    113     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/fastai2/fastai2/fastai2/learner.py in <listcomp>(.0)
    109     def _call_one(self, event_name):
    110         assert hasattr(event, event_name)
--> 111         [cb(event_name) for cb in sort_by_run(self.cbs)]
    112
    113     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/fastai2/fastai2/fastai2/callback/core.py in __call__(self, event_name)
     21         _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
     22                (self.run_valid and not getattr(self, 'training', False)))
---> 23         if self.run and _run: getattr(self, event_name, noop)()
     24         if event_name=='after_fit': self.run=True #Reset self.run to True at each end of fit
     25

~/fastai2/fastai2/fastai2/learner.py in after_batch(self)
    387         if len(self.yb) == 0: return
    388         mets = self._train_mets if self.training else self._valid_mets
--> 389         for met in mets: met.accumulate(self.learn)
    390         if not self.training: return
    391         self.lrs.append(self.opt.hypers[-1]['lr'])

~/fastai2/fastai2/fastai2/learner.py in accumulate(self, learn)
    323     def accumulate(self, learn):
    324         bs = find_bs(learn.yb)
--> 325         self.total += to_detach(self.func(learn.pred, *learn.yb))*bs
    326         self.count += bs
    327     @property

~/fastai2/fastai2/fastai2/metrics.py in accuracy(inp, targ, axis)
     73 def accuracy(inp, targ, axis=-1):
     74     "Compute accuracy with targ when pred is bs * n_classes"
---> 75     pred,targ = flatten_check(inp.argmax(dim=axis), targ)
     76     return (pred == targ).float().mean()
     77

~/fastai2/fastai2/fastai2/torch_core.py in flatten_check(inp, targ)
    763     "Check that out and targ have the same number of elements and flatten them."
    764     inp,targ = inp.contiguous().view(-1),targ.contiguous().view(-1)
--> 765     test_eq(len(inp), len(targ))
    766     return inp,targ

~/anaconda3/lib/python3.7/site-packages/fastcore/test.py in test_eq(a, b)
     30 def test_eq(a,b):
     31     "test that a==b"
---> 32     test(a,b,equals, '==')
     33
     34 # Cell

~/anaconda3/lib/python3.7/site-packages/fastcore/test.py in test(a, b, cmp, cname)
     20     "assert that cmp(a,b); display inputs and cname or cmp.__name__ if it fails"
     21     if cname is None: cname=cmp.__name__
---> 22     assert cmp(a,b),f"{cname}:\n{a}\n{b}"
     23
     24 # Cell

AssertionError: ==:
684
77976
```
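
Side note: the numbers in that assertion actually point at the problem. Plain `accuracy` takes the argmax over `axis=-1`, so for a unet output of shape `[2, 2, 171, 228]` the argmax collapses the width axis and leaves 2 × 2 × 171 = 684 elements, while the flattened target has 2 × 171 × 228 = 77,976. For segmentation the argmax needs to run over the class axis (dim=1) instead. A minimal sketch of a pixel accuracy along those lines (`seg_accuracy` is my own helper name, not a fastai built-in):

```python
def seg_accuracy(inp, targ):
    "Pixel accuracy for segmentation: inp is [bs, n_classes, h, w], targ is [bs, h, w]."
    pred = inp.argmax(dim=1)              # argmax over the class axis -> [bs, h, w]
    return (pred == targ).float().mean()  # fraction of correctly labelled pixels
```

With that, pred and targ flatten to the same 77,976 elements and the comparison goes through.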

it’s " ``` " (also be sure to hit enter after the last tick before adding code)


And backing out a bit higher - here is the learner creation:

```
learn = unet_learner(dls, resnet34, loss_func=CrossEntropyLossFlat(axis=1), opt_func=opt_func,
                     path=path, metrics=accuracy,  # acc_camvid
                     config=unet_config(norm_type=None), wd_bn_bias=True)
```
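
If memory serves, fastai2 also ships a `foreground_acc` metric in `fastai2.metrics` that already does the argmax over `axis=1` and ignores background pixels - probably the closest drop-in for a binary mask with no void class. Something like this (a sketch, not tested against your notebook):

```python
from fastai2.vision.all import *

learn = unet_learner(dls, resnet34, loss_func=CrossEntropyLossFlat(axis=1),
                     metrics=foreground_acc,  # or the seg_accuracy sketch above
                     config=unet_config(norm_type=None))
```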

You say you’re having a hard time? Watch my losses…

[image: plot of training/validation losses]

And this is the best result I’ve had in three days of work, full of non-grayscale masks, CUDA device-side asserts triggered, and so forth. Segmentation is still an obscure and dangerous process to me.
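
On the original loss/metric question: with two codes, `CrossEntropyLossFlat(axis=1)` is fine, and Dice is the usual companion metric for binary segmentation (fastai2 has a `Dice` metric class as well, if I remember right). A minimal hand-rolled sketch, computed on the foreground class (`dice_fg` is my own name):

```python
def dice_fg(inp, targ, eps=1e-8):
    "Dice on the foreground class. inp: [bs, 2, h, w] logits; targ: [bs, h, w] in {0, 1}."
    pred = (inp.argmax(dim=1) == 1).float()  # predicted foreground mask
    targ = (targ == 1).float()               # ground-truth foreground mask
    inter = (pred * targ).sum()              # overlap between the two masks
    return (2 * inter + eps) / (pred.sum() + targ.sum() + eps)
```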