Fastai v2 vision

muellerzr · January 28, 2020, 4:08am

Ah missed that. Thank you! (Duh, makes sense given DataBlock.summary())

leaf · January 28, 2020, 4:06pm

Ha, that’s amazing. This is like reading from the future. Just ran into the same thing.

But now I am facing an even bigger issue. I got my model to start training and, after some fights with CUDA, managed to get it to this stage:

st = DataBlock(blocks=(ImageBlock, ImageBlock(cls=PILMask)),
              splitter=RandomSplitter(),
              get_items=get_image_files,
              item_tfms=RandomResizedCrop(256),
              get_y=lambda o: str(o).replace(
                  '_standard_','_coco_').replace('standard','label').replace('jpg','png')) 

dls = st.dataloaders(path, bs=4,
                     batch_tfms=[*aug_transforms(size=256,
                                                       max_warp=0), 
                                 Normalize.from_stats(*imagenet_stats)])
lrnr = unet_learner(dls, resnet50, config=unet_config(self_attention=True), 
                    n_out=6, loss_func=seg_accuracy)

Where:

lrnr.y.shape
torch.Size([4, 256, 256])
lrnr.x.shape
torch.Size([4, 3, 256, 256])

But then I get this issue:

lrnr.fit_one_cycle(10,3e-4)

epoch	train_loss	valid_loss	time
0	0.176048	00:06
 0.54% [1/185 01:39<5:06:20 0.1760]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-15-ac44c0b0efe6> in <module>
----> 1 lrnr.fit_one_cycle(10,3e-4)

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
     88     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
     89               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
---> 90     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
     91 
     92 # Cell

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    287                     try:
    288                         self.epoch=epoch;          self('begin_epoch')
--> 289                         self._do_epoch_train()
    290                         self._do_epoch_validate()
    291                     except CancelEpochException:   self('after_cancel_epoch')

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in _do_epoch_train(self)
    262         try:
    263             self.dl = self.dls.train;                  self('begin_train')
--> 264             self.all_batches()
    265         except CancelTrainException:                         self('after_cancel_train')
    266         finally:                                             self('after_train')

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in all_batches(self)
    240     def all_batches(self):
    241         self.n_iter = len(self.dl)
--> 242         for o in enumerate(self.dl): self.one_batch(*o)
    243 
    244     def one_batch(self, i, b):

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
    250             self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')
    251             if not self.training: return
--> 252             self.loss.backward();                            self('after_backward')
    253             self.opt.step();                                 self('after_step')
    254             self.opt.zero_grad()

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/torch_core.py in _f(self, *args, **kwargs)
    270         def _f(self, *args, **kwargs):
    271             cls = self.__class__
--> 272             res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
    273             return retain_type(res, self)
    274         return _f

~/daisy-gan/venv/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    148                 products. Defaults to ``False``.
    149         """
--> 150         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    151 
    152     def register_hook(self, hook):

~/daisy-gan/venv/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     97     Variable._execution_engine.run_backward(
     98         tensors, grad_tensors, retain_graph, create_graph,
---> 99         allow_unreachable=True)  # allow_unreachable flag
    100 
    101 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I suspect this has something to do with the custom loss_func, which I only included to get around the CUDA issue. The CAMVID test case runs without a problem, though, so I am still investigating what the difference is.
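
One likely explanation for the `does not require grad` error is that seg_accuracy is built on argmax and ==, which are piecewise constant: a small change in the model output does not change the result, so there is no gradient to backpropagate. The toy sketch below illustrates this in plain Python, without PyTorch. The helper names and numbers are made up for illustration and are not from the thread.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(logits, targets):
    """Fraction of pixels whose highest-scoring class equals the target."""
    hits = sum(argmax(px) == t for px, t in zip(logits, targets))
    return hits / len(targets)

# Three "pixels" with two class scores each (toy data, assumed for illustration).
logits = [[2.0, 0.5], [0.1, 1.3], [0.7, 0.2]]
targets = [0, 1, 1]

base = accuracy(logits, targets)

# Nudge every score slightly: the predicted classes stay the same,
# so the accuracy stays the same and its finite-difference slope is zero.
eps = 1e-3
nudged = [[v + eps * (i + 1) for i, v in enumerate(px)] for px in logits]
slope = (accuracy(nudged, targets) - base) / eps

print(base, slope)
```

A differentiable loss such as cross-entropy does respond to small changes in the scores, which is why accuracy is normally passed as a metric rather than as loss_func.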

leaf · January 28, 2020, 6:58pm

I am just so confused. I have made my example follow the CAMVID one so closely that the only difference left is which images they load. But I can now make the CAMVID example error out.

Can anyone replicate the example below? You only need to change the CAMVID path to fit your system.


import torchvision as tv
import torch; torch.__version__, torch.__file__

from src.dataset_builder import load_dataset_from_address
from utils.misc_utils import load_h5py
from PIL import Image
import numpy as np
import matplotlib.pylab as plt
# %pylab inline

from fastai2.basics import *
from fastai2.callback.all import *
from fastai2.vision.all import *


def seg_accuracy(input, target):
    target = target.squeeze(1)
    return (input.argmax(dim=1)==target).float().mean()


path = '/home/jakub/.fastai/data/camvid/images/'

dls = SegmentationDataLoaders.from_label_func(path, bs=1,
    fnames = get_image_files(path), 
    item_tfms=RandomResizedCrop(256),
#     label_func = lambda o: str(o).replace(
#                   '_standard_','_coco_').replace('standard','label').replace('jpg','png'),
    label_func = lambda o: str(o).replace('images','labels').replace('.png','_P.png'),
    codes = np.loadtxt('/home/jakub/.fastai/data/camvid/codes.txt', dtype=str),                         
    batch_tfms=[*aug_transforms(size=(360,480)), Normalize.from_stats(*imagenet_stats)])

# +
# codes = np.loadtxt('st_codes.txt', dtype=str)
codes = np.loadtxt('/home/jakub/.fastai/data/camvid/codes.txt', dtype=str)
dls.vocab = codes
name2id = {v:k for k,v in enumerate(codes)}

void_code = name2id['Void']

def acc_camvid(input, target):
    target = target.squeeze(1)
    mask = target != void_code
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()


# -

dls.show_batch(max_n=2, rows=1, vmin=1, vmax=30, figsize=(20, 7))

# +
opt_func = partial(Adam, lr=3e-3, wd=0.01)#, eps=1e-8)

learn = unet_learner(dls, resnet34, loss_func=CrossEntropyLossFlat(axis=1), opt_func=opt_func, path=path, 
                     metrics=acc_camvid, n_out=6, 
                     config = unet_config(norm_type=None, self_attention=True), wd_bn_bias=True)

# +
# lrnr = unet_learner(dls, resnet50, config=unet_config(self_attention=True), 
#                     n_out=6, loss_func=seg_accuracy)
# -

learn.fit_one_cycle(10,3e-4)

This yields the following:

epoch	train_loss	valid_loss	acc_camvid	time
0	0.000000	00:02
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
    251             if not self.training: return
--> 252             self.loss.backward();                            self('after_backward')
    253             self.opt.step();                                 self('after_step')

~/daisy-gan/venv/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    149         """
--> 150         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    151 

~/daisy-gan/venv/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     98         tensors, grad_tensors, retain_graph, create_graph,
---> 99         allow_unreachable=True)  # allow_unreachable flag
    100 

RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:26

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-12-474116ee9487> in <module>
----> 1 learn.fit_one_cycle(10,3e-4)

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
     88     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
     89               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
---> 90     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
     91 
     92 # Cell

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    287                     try:
    288                         self.epoch=epoch;          self('begin_epoch')
--> 289                         self._do_epoch_train()
    290                         self._do_epoch_validate()
    291                     except CancelEpochException:   self('after_cancel_epoch')

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in _do_epoch_train(self)
    262         try:
    263             self.dl = self.dls.train;                  self('begin_train')
--> 264             self.all_batches()
    265         except CancelTrainException:                         self('after_cancel_train')
    266         finally:                                             self('after_train')

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in all_batches(self)
    240     def all_batches(self):
    241         self.n_iter = len(self.dl)
--> 242         for o in enumerate(self.dl): self.one_batch(*o)
    243 
    244     def one_batch(self, i, b):

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
    254             self.opt.zero_grad()
    255         except CancelBatchException:                         self('after_cancel_batch')
--> 256         finally:                                             self('after_batch')
    257 
    258     def _do_begin_fit(self, n_epoch):

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in __call__(self, event_name)
    221     def ordered_cbs(self, cb_func:str): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
    222 
--> 223     def __call__(self, event_name): L(event_name).map(self._call_one)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)

~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363 
    364     def filter(self, f, negate=False, **kwargs):

~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40 
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244 

~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209 
    210 # Cell

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in _call_one(self, event_name)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)
--> 226         [cb(event_name) for cb in sort_by_run(self.cbs)]
    227 
    228     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in <listcomp>(.0)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)
--> 226         [cb(event_name) for cb in sort_by_run(self.cbs)]
    227 
    228     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in __call__(self, event_name)
     23         _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
     24                (self.run_valid and not getattr(self, 'training', False)))
---> 25         if self.run and _run: getattr(self, event_name, noop)()
     26 
     27     @property

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in after_batch(self)
    495         if len(self.yb) == 0: return
    496         mets = self._train_mets if self.training else self._valid_mets
--> 497         for met in mets: met.accumulate(self.learn)
    498         if not self.training: return
    499         self.lrs.append(self.opt.hypers[-1]['lr'])

~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in accumulate(self, learn)
    458     def accumulate(self, learn):
    459         self.count += 1
--> 460         self.val = torch.lerp(to_detach(learn.loss.mean(), gather=False), self.val, self.beta)
    461     @property
    462     def value(self): return self.val/(1-self.beta**self.count)

RuntimeError: CUDA error: device-side assert triggered

muellerzr · January 28, 2020, 6:59pm

That’s due to your mask labels not matching the full set of pixel classes actually present in the masks; it’s not an API issue. Are you sure your codes align with the dataset? If so, add one more category for ‘other’.
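
A check along these lines can catch the mismatch before training. It is a minimal sketch in plain Python, not part of fastai's API: `out_of_range_labels` is a hypothetical helper and the mask is made-up data. Any pixel value outside 0 to n_classes-1 is the kind of label that typically makes a cross-entropy loss fail with a device-side assert on the GPU.

```python
def out_of_range_labels(mask, n_classes):
    """Return the sorted pixel values in `mask` that are not valid class indices."""
    seen = {value for row in mask for value in row}
    return sorted(v for v in seen if not 0 <= v < n_classes)

# Toy 3x3 mask; 255 stands in for an unlabeled / "other" pixel (assumed data).
mask = [[0, 1, 1],
        [2, 5, 255],
        [0, 0, 3]]

print(out_of_range_labels(mask, n_classes=6))    # [255]
print(out_of_range_labels(mask, n_classes=256))  # []
```

Running such a check over every mask and comparing against the model's n_out (6 in the snippets above) would show whether the codes and the number of outputs agree with the data.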

leaf · January 28, 2020, 7:24pm

Wow amazing! Thanks I knew it must have been something simple. Really appreciate your help!

muellerzr · January 28, 2020, 8:37pm

@sgugger I actually have a question about that: how would someone go about adding a check that raises an error when this happens (during dbunch generation)? I know this is a very common issue with segmentation. Perhaps it could be done in your DataBlock.summary()? (I’m unsure if it checks for this right now.)

sgugger · January 28, 2020, 9:36pm

I can look at this when I have time (not for a bit though).

muellerzr · January 30, 2020, 7:24pm

I know you may not have gotten to it yet, but show_results is currently broken for an object detection model:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
    216     #   in this case we return a big list
--> 217     try:    return retain_type(torch.cat(xs, dim=dim), xs[0])
    218     except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])

TypeError: expected Tensor as element 0 in argument 0, but got int

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
22 frames
<ipython-input-32-c3b657dcc9ae> in <module>()
----> 1 learn.show_results()

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in show_results(self, ds_idx, dl, max_n, shuffle, **kwargs)
    332         if dl is None: dl = self.dls[ds_idx].new(shuffle=shuffle)
    333         b = dl.one_batch()
--> 334         _,_,preds = self.get_preds(dl=[b], with_decoded=True)
    335         self.dls.show_results(b, preds, max_n=max_n, **kwargs)
    336 

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, **kwargs)
    313             self(_before_epoch)
    314             self._do_epoch_validate(ds_idx, dl)
--> 315             self(_after_epoch)
    316             if act is None: act = getattr(self.loss_func, 'activation', noop)
    317             res = cb.all_tensors()

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
    221     def ordered_cbs(self, cb_func:str): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
    222 
--> 223     def __call__(self, event_name): L(event_name).map(self._call_one)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363 
    364     def filter(self, f, negate=False, **kwargs):

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40 
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244 

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209 
    210 # Cell

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _call_one(self, event_name)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)
--> 226         [cb(event_name) for cb in sort_by_run(self.cbs)]
    227 
    228     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in <listcomp>(.0)
    224     def _call_one(self, event_name):
    225         assert hasattr(event, event_name)
--> 226         [cb(event_name) for cb in sort_by_run(self.cbs)]
    227 
    228     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
     23         _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
     24                (self.run_valid and not getattr(self, 'training', False)))
---> 25         if self.run and _run: getattr(self, event_name, noop)()
     26 
     27     @property

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in after_fit(self)
     86         "Concatenate all recorded tensors"
     87         if self.with_input:     self.inputs  = detuplify(to_concat(self.inputs, dim=self.concat_dim))
---> 88         if not self.save_preds: self.preds   = detuplify(to_concat(self.preds, dim=self.concat_dim))
     89         if not self.save_targs: self.targets = detuplify(to_concat(self.targets, dim=self.concat_dim))
     90         if self.with_loss:      self.losses  = to_concat(self.losses)

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
    217     try:    return retain_type(torch.cat(xs, dim=dim), xs[0])
    218     except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
--> 219                           for i in range_of(o_)) for o_ in xs], L())
    220 
    221 # Cell

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
    217     try:    return retain_type(torch.cat(xs, dim=dim), xs[0])
    218     except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
--> 219                           for i in range_of(o_)) for o_ in xs], L())
    220 
    221 # Cell

/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in range_of(x)
    160 def range_of(x):
    161     "All indices of collection `x` (i.e. `list(range(len(x)))`)"
--> 162     return list(range(len(x)))
    163 
    164 # Cell

TypeError: object of type 'int' has no len()

sgugger · January 30, 2020, 7:34pm

It strongly depends on how your model returns its output too. I don’t have any examples of show_results with multi-target, so there might be something broken in fastai2.

muellerzr · January 30, 2020, 7:37pm

Got it. I’ll take a look at that and see. It’s the RetinaNet architecture used in previous lectures. IIRC I had a separate function that was used to show the results. I’ll get that working, then maybe we can get some inspiration on how to fit it in.

boris · January 31, 2020, 3:15am

I implemented the “non-pretrained” version and am now working on the pretrained version.
I just wanted to check whether, for n_in>3, the additional weights should be 0, since as I understand it they won’t ever learn anything.

jeremy · January 31, 2020, 4:16am

They will still have gradients, so they will learn
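
A short worked derivation of why this holds, assuming the first layer is an ordinary convolution. The symbols are illustrative notation, not from the thread: x_c is input channel c, w_{o,c,k} the weight from input channel c to output channel o at kernel offset k, and p a pixel position.

```latex
y_o(p) = b_o + \sum_{c=1}^{n_{\text{in}}} \sum_{k} w_{o,c,k}\, x_c(p+k)
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial w_{o,c,k}}
  = \sum_{p} \frac{\partial \mathcal{L}}{\partial y_o(p)}\, x_c(p+k)
```

The right-hand side does not contain the weight itself, so a weight initialized to 0 still receives a nonzero gradient whenever its input channel carries signal, and it can move away from 0 in the first optimizer step.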

boris · January 31, 2020, 6:11am

Oops… makes sense!
I just sent a PR.

ptrampert · January 31, 2020, 8:05am

Maybe this helps:

muellerzr · February 4, 2020, 12:17am

Quick question: should the points that come out of PointScaler be on a scale of -1,1, or on their own scale? Because the second is what is currently happening. On my particular dataset, I manually calculated the y range:

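# Apply PointScaler to every target, then track the overall min/max coordinate.
tfmd_pnts = [dls.after_item.point_scaler(x[1]) for x in dls.dataset]
# Start from +/- infinity so the result does not depend on an initial guess of 0.
min_pnt = float('inf')
max_pnt = float('-inf')
for t in tfmd_pnts:
  min_pnt = min(min_pnt, float(t.min()))
  max_pnt = max(max_pnt, float(t.max()))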

I do not get -1,1; I get -2.0491, 3.5536 for my dataset (and I verified that no point went outside the image’s range by accident). This is important because sometimes, when running a multi-point regression model without explicitly declaring a y_range, the points will all stack in the middle. Could this be due to the fact that Resize occurs after PointScaler? Should it in fact be the other way around, so that everything scales to the new image size?

lgvaz · February 4, 2020, 12:23am

I do think this is the problem. Currently PointScaler is applied before Resize, which means that _scale_pnts is calculated with the wrong (pre-resize) sz, causing this problem:

def _scale_pnts(y, sz, do_scale=True, y_first=False):
    if y_first: y = y.flip(1)
    res = y * 2/tensor(sz).float() - 1 if do_scale else y
    return TensorPoint(res, img_size=sz)

I tried to fix the issue by putting PointScaler after Resize, but that just creates another problem: Resize starts to fail with TensorPoint. The root of this issue is this line inside Resize.encodes: orig_sz = _get_sz(x)
_get_sz returns an empty tuple because no img_size _meta has been attributed to the TensorPoint yet.

I’ve been trying for some hours now to attribute img_size to TensorPoint before Resize.encodes gets called, but I have failed every time.

sgugger · February 4, 2020, 2:45pm

Yes PointScaler is called before the resizing, but that shouldn’t impact anything: points will be resized with respect to the actual size of the original image. Or are you passing coordinates assuming the image has already been resized?

muellerzr · February 4, 2020, 2:48pm

No, they’re just the original coordinates for the original sizes. The documentation says everything is on a scale of -1,1, but this isn’t what I’m seeing. So is that not true? Or does it vary depending on the starting and ending image sizes? Thanks @sgugger

sgugger · February 4, 2020, 2:52pm

Double-check which points are being sent, but if they are indeed inside the images, they should have coordinates between -1 and 1. If anything, having Resize happen before PointScaler would be the reason you see wrong coordinates.

muellerzr · February 4, 2020, 2:57pm

A quick example is the following problem:

Original image size: (1025, 721)
New image size: (224,224)
Point: (1024,720)

If we follow how it is being done, i.e.
newX = 1024 * 2/224 - 1

We get 8.14 as our point. I think it should actually take the original size, not the transformed one, for that to work properly (this gives about 0.99, which is what is expected). I don’t think that is what’s being done, else we wouldn’t see values around -2 on my DataLoader above.
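
The arithmetic can be checked with a few lines of plain Python. This is only a sketch of the `y * 2/sz - 1` formula quoted earlier in the thread, not fastai's actual code path; `scale_point` is a hypothetical helper name.

```python
def scale_point(coord, size):
    """Map a pixel coordinate in [0, size] to [-1, 1] via coord * 2/size - 1."""
    return coord * 2 / size - 1

orig_w, new_w, x = 1025, 224, 1024  # widths and x coordinate from the example

# Dividing by the resized width pushes the point far outside [-1, 1].
print(round(scale_point(x, new_w), 3))   # 8.143

# Dividing by the original width keeps it just inside the range.
print(round(scale_point(x, orig_w), 3))  # 0.998
```

Any coordinate that lies inside the image and is divided by that same image's size lands in [-1, 1], so values such as -2 or 3.5 suggest that the size used for scaling and the coordinates do not belong to the same image.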