Ah missed that. Thank you! (Duh, makes sense given DataBlock.summary())
Ha, that's amazing. This is like reading from the future. Just ran into the same thing.
But now I am facing an even bigger issue. I got my model to start training and, after some fights with CUDA, I managed to get it to this stage:
st = DataBlock(blocks=(ImageBlock, ImageBlock(cls=PILMask)),
               splitter=RandomSplitter(),
               get_items=get_image_files,
               item_tfms=RandomResizedCrop(256),
               get_y=lambda o: str(o).replace(
                   '_standard_','_coco_').replace('standard','label').replace('jpg','png'))
dls = st.dataloaders(path, bs=4,
                     batch_tfms=[*aug_transforms(size=256,
                                                 max_warp=0),
                                 Normalize.from_stats(*imagenet_stats)])
lrnr = unet_learner(dls, resnet50, config=unet_config(self_attention=True),
                    n_out=6, loss_func=seg_accuracy)
Where:
lrnr.y.shape
torch.Size([4, 256, 256])
lrnr.x.shape
torch.Size([4, 3, 256, 256])
But then I get this issue:
lrnr.fit_one_cycle(10,3e-4)
epoch train_loss valid_loss time
0 0.176048 00:06
0.54% [1/185 01:39<5:06:20 0.1760]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-15-ac44c0b0efe6> in <module>
----> 1 lrnr.fit_one_cycle(10,3e-4)
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
88 scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
89 'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
---> 90 self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
91
92 # Cell
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
287 try:
288 self.epoch=epoch; self('begin_epoch')
--> 289 self._do_epoch_train()
290 self._do_epoch_validate()
291 except CancelEpochException: self('after_cancel_epoch')
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in _do_epoch_train(self)
262 try:
263 self.dl = self.dls.train; self('begin_train')
--> 264 self.all_batches()
265 except CancelTrainException: self('after_cancel_train')
266 finally: self('after_train')
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in all_batches(self)
240 def all_batches(self):
241 self.n_iter = len(self.dl)
--> 242 for o in enumerate(self.dl): self.one_batch(*o)
243
244 def one_batch(self, i, b):
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
250 self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')
251 if not self.training: return
--> 252 self.loss.backward(); self('after_backward')
253 self.opt.step(); self('after_step')
254 self.opt.zero_grad()
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/torch_core.py in _f(self, *args, **kwargs)
270 def _f(self, *args, **kwargs):
271 cls = self.__class__
--> 272 res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
273 return retain_type(res, self)
274 return _f
~/daisy-gan/venv/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
148 products. Defaults to ``False``.
149 """
--> 150 torch.autograd.backward(self, gradient, retain_graph, create_graph)
151
152 def register_hook(self, hook):
~/daisy-gan/venv/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
97 Variable._execution_engine.run_backward(
98 tensors, grad_tensors, retain_graph, create_graph,
---> 99 allow_unreachable=True) # allow_unreachable flag
100
101
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I suspect that has something to do with the custom loss_func, but I included it to get around the CUDA issue. Though the CAMVID test case runs without an issue, so I am still investigating what the difference is.
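For what it's worth, the suspicion about the custom loss_func looks right. seg_accuracy goes through argmax and an == comparison, and neither of those is differentiable, so the value it returns has no grad_fn and backward() has nothing to propagate. Below is a minimal sketch in plain PyTorch (not fastai2); the shapes and random tensors are made up for illustration, and cross-entropy is only shown as one differentiable loss that accepts the same inputs.

```python
import torch
import torch.nn.functional as F

def seg_accuracy(input, target):
    # Same metric as in the post: argmax and == return tensors with no grad_fn.
    target = target.squeeze(1)
    return (input.argmax(dim=1) == target).float().mean()

# Stand-in for a model output: batch of 2, 6 classes, 4x4 pixels (illustrative shapes).
logits = torch.randn(2, 6, 4, 4, requires_grad=True)
target = torch.randint(0, 6, (2, 4, 4))

acc = seg_accuracy(logits, target)
print(acc.requires_grad)  # False

# Calling backward() on the metric raises the RuntimeError seen in the traceback.
try:
    acc.backward()
    backward_error = None
except RuntimeError as e:
    backward_error = str(e)
print(backward_error)

# A differentiable loss on the same tensors back-propagates without complaint.
loss = F.cross_entropy(logits, target)
loss.backward()
print(logits.grad.shape)  # torch.Size([2, 6, 4, 4])
```

Under that reading, seg_accuracy would belong in metrics rather than loss_func, with a differentiable loss doing the training.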
I am just so confused. I have made my example follow the CAMVID one so closely that the only thing they differ on now is that they load different images. But I can now make the CAMVID example error out.
Can anyone replicate the example below? You only need to change the CAMVID path to fit your system.
import torchvision as tv
import torch; torch.__version__, torch.__file__
from src.dataset_builder import load_dataset_from_address
from utils.misc_utils import load_h5py
from PIL import Image
import numpy as np
import matplotlib.pylab as plt
# %pylab inline
from fastai2.basics import *
from fastai2.callback.all import *
from fastai2.vision.all import *
def seg_accuracy(input, target):
    target = target.squeeze(1)
    return (input.argmax(dim=1)==target).float().mean()
path = '/home/jakub/.fastai/data/camvid/images/'
dls = SegmentationDataLoaders.from_label_func(path, bs=1,
    fnames = get_image_files(path),
    item_tfms=RandomResizedCrop(256),
    # label_func = lambda o: str(o).replace(
    #     '_standard_','_coco_').replace('standard','label').replace('jpg','png'),
    label_func = lambda o: str(o).replace('images','labels').replace('.png','_P.png'),
    codes = np.loadtxt('/home/jakub/.fastai/data/camvid/codes.txt', dtype=str),
    batch_tfms=[*aug_transforms(size=(360,480)), Normalize.from_stats(*imagenet_stats)])
# +
# codes = np.loadtxt('st_codes.txt', dtype=str)
codes = np.loadtxt('/home/jakub/.fastai/data/camvid/codes.txt', dtype=str)
dls.vocab = codes
name2id = {v:k for k,v in enumerate(codes)}
void_code = name2id['Void']
def acc_camvid(input, target):
    target = target.squeeze(1)
    mask = target != void_code
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()
# -
dls.show_batch(max_n=2, rows=1, vmin=1, vmax=30, figsize=(20, 7))
# +
opt_func = partial(Adam, lr=3e-3, wd=0.01)#, eps=1e-8)
learn = unet_learner(dls, resnet34, loss_func=CrossEntropyLossFlat(axis=1), opt_func=opt_func, path=path,
metrics=acc_camvid, n_out=6,
config = unet_config(norm_type=None, self_attention=True), wd_bn_bias=True)
# +
# lrnr = unet_learner(dls, resnet50, config=unet_config(self_attention=True),
# n_out=6, loss_func=seg_accuracy)
# -
learn.fit_one_cycle(10,3e-4)
This yields the following:
epoch train_loss valid_loss acc_camvid time
0 0.000000 00:02
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
251 if not self.training: return
--> 252 self.loss.backward(); self('after_backward')
253 self.opt.step(); self('after_step')
~/daisy-gan/venv/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
149 """
--> 150 torch.autograd.backward(self, gradient, retain_graph, create_graph)
151
~/daisy-gan/venv/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
98 tensors, grad_tensors, retain_graph, create_graph,
---> 99 allow_unreachable=True) # allow_unreachable flag
100
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:26
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
<ipython-input-12-474116ee9487> in <module>
----> 1 learn.fit_one_cycle(10,3e-4)
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
88 scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
89 'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
---> 90 self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
91
92 # Cell
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
287 try:
288 self.epoch=epoch; self('begin_epoch')
--> 289 self._do_epoch_train()
290 self._do_epoch_validate()
291 except CancelEpochException: self('after_cancel_epoch')
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in _do_epoch_train(self)
262 try:
263 self.dl = self.dls.train; self('begin_train')
--> 264 self.all_batches()
265 except CancelTrainException: self('after_cancel_train')
266 finally: self('after_train')
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in all_batches(self)
240 def all_batches(self):
241 self.n_iter = len(self.dl)
--> 242 for o in enumerate(self.dl): self.one_batch(*o)
243
244 def one_batch(self, i, b):
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in one_batch(self, i, b)
254 self.opt.zero_grad()
255 except CancelBatchException: self('after_cancel_batch')
--> 256 finally: self('after_batch')
257
258 def _do_begin_fit(self, n_epoch):
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in __call__(self, event_name)
221 def ordered_cbs(self, cb_func:str): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
222
--> 223 def __call__(self, event_name): L(event_name).map(self._call_one)
224 def _call_one(self, event_name):
225 assert hasattr(event, event_name)
~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
360 else f.format if isinstance(f,str)
361 else f.__getitem__)
--> 362 return self._new(map(g, self))
363
364 def filter(self, f, negate=False, **kwargs):
~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
313 @property
314 def _xtra(self): return None
--> 315 def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
316 def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
317 def copy(self): return self._new(self.items.copy())
~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
39 return x
40
---> 41 res = super().__call__(*((x,) + args), **kwargs)
42 res._newchk = 0
43 return res
~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
304 if items is None: items = []
305 if (use_list is not None) or not _is_array(items):
--> 306 items = list(items) if use_list else _listify(items)
307 if match is not None:
308 if is_coll(match): match = len(match)
~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in _listify(o)
240 if isinstance(o, list): return o
241 if isinstance(o, str) or _is_array(o): return [o]
--> 242 if is_iter(o): return list(o)
243 return [o]
244
~/daisy-gan/venv/lib/python3.6/site-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
206 if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
207 fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208 return self.fn(*fargs, **kwargs)
209
210 # Cell
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in _call_one(self, event_name)
224 def _call_one(self, event_name):
225 assert hasattr(event, event_name)
--> 226 [cb(event_name) for cb in sort_by_run(self.cbs)]
227
228 def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in <listcomp>(.0)
224 def _call_one(self, event_name):
225 assert hasattr(event, event_name)
--> 226 [cb(event_name) for cb in sort_by_run(self.cbs)]
227
228 def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in __call__(self, event_name)
23 _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
24 (self.run_valid and not getattr(self, 'training', False)))
---> 25 if self.run and _run: getattr(self, event_name, noop)()
26
27 @property
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in after_batch(self)
495 if len(self.yb) == 0: return
496 mets = self._train_mets if self.training else self._valid_mets
--> 497 for met in mets: met.accumulate(self.learn)
498 if not self.training: return
499 self.lrs.append(self.opt.hypers[-1]['lr'])
~/daisy-gan/venv/lib/python3.6/site-packages/fastai2/learner.py in accumulate(self, learn)
458 def accumulate(self, learn):
459 self.count += 1
--> 460 self.val = torch.lerp(to_detach(learn.loss.mean(), gather=False), self.val, self.beta)
461 @property
462 def value(self): return self.val/(1-self.beta**self.count)
RuntimeError: CUDA error: device-side assert triggered
That's due to your mask labels not being equal to all the possible pixel classes present, not an API issue. Are you sure your codes align with the dataset present? If so, add one more category for "other".
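A rough sketch of the kind of check being described: collect the label values that actually occur in the masks and compare them with the number of classes the model outputs. In the CAMVID snippet above, n_out=6 is passed while the full codes file is loaded, so any mask pixel with a value of 6 or more would be out of range for the loss, which on GPU typically surfaces as a device-side assert. find_invalid_labels is a hypothetical helper, not part of fastai2, and the toy list stands in for real mask pixels.

```python
def find_invalid_labels(mask_values, n_classes):
    """Return the sorted label values that fall outside [0, n_classes)."""
    return sorted({int(v) for v in mask_values if not 0 <= int(v) < n_classes})

# Toy values standing in for the pixels of one label image.
toy_mask = [0, 1, 5, 5, 17, 31]

print(find_invalid_labels(toy_mask, n_classes=6))   # [17, 31]
print(find_invalid_labels(toy_mask, n_classes=32))  # []
```

With real label images, something like np.unique(np.array(Image.open(fn))) per file could be passed in as mask_values; that part is an assumption about how the masks are stored.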
Wow amazing! Thanks I knew it must have been something simple. Really appreciate your help!
@sgugger I actually have a question about that: how would someone go about putting a test there to raise an issue if this is a thing (during dbunch generation)? I know this is a very common issue with segmentation. Perhaps it could be done in your DataBlock.summary()? (I'm unsure if it checks for this right now.)
I can look at this when I have time (not for a bit though).
I know you may not have gotten to it yet, but show_results is currently broken for an object detection model:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
216 # in this case we return a big list
--> 217 try: return retain_type(torch.cat(xs, dim=dim), xs[0])
218 except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
TypeError: expected Tensor as element 0 in argument 0, but got int
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
22 frames
<ipython-input-32-c3b657dcc9ae> in <module>()
----> 1 learn.show_results()
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in show_results(self, ds_idx, dl, max_n, shuffle, **kwargs)
332 if dl is None: dl = self.dls[ds_idx].new(shuffle=shuffle)
333 b = dl.one_batch()
--> 334 _,_,preds = self.get_preds(dl=[b], with_decoded=True)
335 self.dls.show_results(b, preds, max_n=max_n, **kwargs)
336
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, **kwargs)
313 self(_before_epoch)
314 self._do_epoch_validate(ds_idx, dl)
--> 315 self(_after_epoch)
316 if act is None: act = getattr(self.loss_func, 'activation', noop)
317 res = cb.all_tensors()
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
221 def ordered_cbs(self, cb_func:str): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, cb_func)]
222
--> 223 def __call__(self, event_name): L(event_name).map(self._call_one)
224 def _call_one(self, event_name):
225 assert hasattr(event, event_name)
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
360 else f.format if isinstance(f,str)
361 else f.__getitem__)
--> 362 return self._new(map(g, self))
363
364 def filter(self, f, negate=False, **kwargs):
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
313 @property
314 def _xtra(self): return None
--> 315 def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
316 def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
317 def copy(self): return self._new(self.items.copy())
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
39 return x
40
---> 41 res = super().__call__(*((x,) + args), **kwargs)
42 res._newchk = 0
43 return res
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
304 if items is None: items = []
305 if (use_list is not None) or not _is_array(items):
--> 306 items = list(items) if use_list else _listify(items)
307 if match is not None:
308 if is_coll(match): match = len(match)
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in _listify(o)
240 if isinstance(o, list): return o
241 if isinstance(o, str) or _is_array(o): return [o]
--> 242 if is_iter(o): return list(o)
243 return [o]
244
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
206 if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
207 fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208 return self.fn(*fargs, **kwargs)
209
210 # Cell
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _call_one(self, event_name)
224 def _call_one(self, event_name):
225 assert hasattr(event, event_name)
--> 226 [cb(event_name) for cb in sort_by_run(self.cbs)]
227
228 def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in <listcomp>(.0)
224 def _call_one(self, event_name):
225 assert hasattr(event, event_name)
--> 226 [cb(event_name) for cb in sort_by_run(self.cbs)]
227
228 def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in __call__(self, event_name)
23 _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
24 (self.run_valid and not getattr(self, 'training', False)))
---> 25 if self.run and _run: getattr(self, event_name, noop)()
26
27 @property
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in after_fit(self)
86 "Concatenate all recorded tensors"
87 if self.with_input: self.inputs = detuplify(to_concat(self.inputs, dim=self.concat_dim))
---> 88 if not self.save_preds: self.preds = detuplify(to_concat(self.preds, dim=self.concat_dim))
89 if not self.save_targs: self.targets = detuplify(to_concat(self.targets, dim=self.concat_dim))
90 if self.with_loss: self.losses = to_concat(self.losses)
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
211 def to_concat(xs, dim=0):
212 "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213 if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
214 if isinstance(xs[0],dict): return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs.keys()}
215 #We may receives xs that are not concatenatable (inputs of a text classifier for instance),
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in to_concat(xs, dim)
217 try: return retain_type(torch.cat(xs, dim=dim), xs[0])
218 except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
--> 219 for i in range_of(o_)) for o_ in xs], L())
220
221 # Cell
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in <listcomp>(.0)
217 try: return retain_type(torch.cat(xs, dim=dim), xs[0])
218 except: return sum([L(retain_type(o_.index_select(dim, tensor(i)).squeeze(dim), xs[0])
--> 219 for i in range_of(o_)) for o_ in xs], L())
220
221 # Cell
/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in range_of(x)
160 def range_of(x):
161 "All indices of collection `x` (i.e. `list(range(len(x)))`)"
--> 162 return list(range(len(x)))
163
164 # Cell
TypeError: object of type 'int' has no len()
It strongly depends on how your model returns its output too. I don't have any examples of show_results with multi-target, so there might be something broken in fastai2.
Got it. I'll take a look at that and see. It's the RetinaNet architecture used in previous lectures. IIRC I had a separate function that was used to show the results. I'll get that working, then maybe we can get some inspiration on how to fit it in.
I implemented the "non-pretrained" version and am now working on the pretrained version.
I just wanted to check whether, for n_in>3, the additional weights should be 0, as I understand they won't ever learn anything.
They will still have gradients, so they will learn
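A small sketch of why that holds, using a plain PyTorch Conv2d rather than the actual fastai2 pretrained-stem code (the layer sizes, the random input, and the squared-output loss are all made up for illustration). The gradient of a conv weight depends on the input and on the gradient flowing back from the output, not on the weight's current value, so zero-initialised weights for the extra channel still receive a non-zero gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# First layer taking 4 input channels; channel index 3 plays the "extra" channel.
conv = nn.Conv2d(4, 8, kernel_size=3, padding=1, bias=False)
with torch.no_grad():
    conv.weight[:, 3:] = 0.0  # zero-initialise the weights of the extra channel

x = torch.randn(2, 4, 16, 16)
loss = conv(x).pow(2).mean()  # arbitrary differentiable loss
loss.backward()

extra_grad = conv.weight.grad[:, 3:]
print((extra_grad.abs().sum() > 0).item())  # True: the zeroed weights still get gradients
```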
Oops... makes sense!
I just sent a PR.
Maybe this helps:
Quick question: should the points that come out of PointScaler be on a scale of -1,1? Or on their own scale? Because the second is what is currently happening. On my particular dataset, when I manually calculated what the y range was:
tfmd_pnts = [dls.after_item.point_scaler(x[1]) for x in dls.dataset]
min_pnt = 0
max_pnt = 0
for t in tfmd_pnts:
    if t.min() < min_pnt:
        min_pnt = t.min()
    if t.max() > max_pnt:
        max_pnt = t.max()
max_pnt = float(max_pnt)
min_pnt = float(min_pnt)
I do not get -1,1; I get -2.0491, 3.5536 for my dataset. (And I verified no point went off the image's range by accident.) This is important, as sometimes when running a multi-point regression model (without explicitly declaring a y_range), the points will all stack in the middle. Could this be due to the fact that Resize occurs after PointScaler? And should it in fact be the other way around, so everything scales to the new image size?
I do think this is the problem. Currently PointScaler is being applied before Resize, which means that _scale_pnts will be calculated with the wrong (pre-resize) sz, generating this problem:
def _scale_pnts(y, sz, do_scale=True, y_first=False):
    if y_first: y = y.flip(1)
    res = y * 2/tensor(sz).float() - 1 if do_scale else y
    return TensorPoint(res, img_size=sz)
I did try to fix the issue by putting PointScaler after Resize, but that just generates another problem: Resize starts to fail with TensorPoint. The root of this issue is this line inside Resize.encodes: orig_sz = _get_sz(x). _get_sz returns an empty tuple because no img_size _meta was attributed to TensorPoint yet.
I've been trying for some hours now to attribute img_size to TensorPoint before Resize.encodes gets called, but I have failed every time.
Yes, PointScaler is called before the resizing, but that shouldn't impact anything: points will be resized with respect to the actual size of the original image. Or are you passing coordinates assuming the image has already been resized?
No, they're just the original coordinates for the original sizes. The documentation says everything is on a scale of -1,1, but this isn't what I'm seeing. So is that not true? Or is it variable depending on the starting and ending image sizes? Thanks @sgugger
Double-check which points are being sent, but if they are indeed inside the images, they should have coordinates between -1 and 1. If anything, having Resize happen before PointScaler would be the reason you see wrong coordinates.
A quick example is the following problem:
Original image size: (1025, 721)
New image size: (224,224)
Point: (1024,720)
If we follow how it is being done, i.e.:
newX = 1024 * 2/224 - 1
We get 8.14 as our point. I think it should actually take the original size, not the transformed one, for that to work properly. (This leads to 0.99, which is what is expected.) I don't think that is what's being done, else we wouldn't see values around -2 on my DataLoader above.
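To make the arithmetic concrete, here is a small sketch of the same formula used in _scale_pnts (y * 2/sz - 1), written in plain Python rather than with fastai2 tensors. scale_point is a hypothetical helper name, not a fastai2 function; the sizes and the point are the ones from the example.

```python
def scale_point(point, size):
    """Map pixel coordinates to the -1..1 range using p * 2/size - 1 per axis."""
    return tuple(p * 2 / s - 1 for p, s in zip(point, size))

point = (1024, 720)
orig_size = (1025, 721)  # size of the image the point was annotated on
new_size = (224, 224)    # size after Resize

print(scale_point(point, orig_size))  # roughly (0.998, 0.997): inside -1..1
print(scale_point(point, new_size))   # roughly (8.14, 5.43): far outside -1..1
```

So coordinates end up outside -1,1 whenever the size used for scaling is smaller than the image the points were annotated on, which would be consistent with the -2.0491 and 3.5536 values reported earlier, though that is only a guess about the cause.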