Prediction of a scalar with a CNN

Oh, do you have to call it (RMSE() instead of RMSE)?

No, oddly enough, there are separate functions for RMSE and root_mean_squared_error. The latter can be used as the convnet loss function:

learn = create_cnn(data, arch=SimpleCNN, pretrained=False, loss_func=root_mean_squared_error)

But doing

learn = create_cnn(data, arch=SimpleCNN, pretrained=False, loss_func=RMSE)

will fail.

I’m pretty puzzled by this behavior, but perhaps somebody more experienced with the codebase can provide a good explanation of the two functions’ differences.


Be careful: RMSE is a class that is intended to compute RMSE as a metric (it accumulates all the predictions/targets before computing it, since we can’t compute RMSE batch by batch and then average), not a loss function. Using root_mean_squared_error should work, as you noticed.
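To make the distinction concrete, here is a minimal sketch (fastai v1; data is assumed to be a regression DataBunch with FloatList labels). The plain function can serve as the loss, while the class goes into metrics as an instance, as the v1 metrics docs note for class-based metrics:

from fastai.vision import *  # brings in create_cnn, root_mean_squared_error, RMSE

learn = create_cnn(data, models.resnet34, pretrained=False,
                   loss_func=root_mean_squared_error,  # plain function -> valid loss
                   metrics=[RMSE()])                   # Callback metric -> pass an instance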


How can I help with the documentation to make that more clear? Or is it already clear and the two of us missed it? I’m looking at https://docs.fast.ai/metrics.html and it seems that root_mean_squared_error and RMSE are both in the metrics section.

Any metric that is a class is in fact a Callback (as explained a little bit further ahead in the docs). A nice introduction in this section would probably be useful to explain the difference between the two.
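For context, a Callback-style metric has roughly this shape (a simplified sketch, not the library’s actual RMSE source; it assumes fastai v1’s Callback hooks and its add_metrics helper):

from fastai.basics import *  # Callback, add_metrics, torch (fastai v1)
import torch.nn.functional as F

class RMSESketch(Callback):
    "Accumulate predictions/targets over the epoch, then compute RMSE once."
    def on_epoch_begin(self, **kwargs):
        self.preds, self.targs = [], []
    def on_batch_end(self, last_output, last_target, train, **kwargs):
        if not train:  # metrics are only meaningful on the validation set
            self.preds.append(last_output.detach().cpu())
            self.targs.append(last_target.detach().cpu())
    def on_epoch_end(self, last_metrics, **kwargs):
        preds, targs = torch.cat(self.preds), torch.cat(self.targs)
        return add_metrics(last_metrics, torch.sqrt(F.mse_loss(preds.view(-1), targs.view(-1))))

An instance of such a class is passed as metrics=[RMSESketch()], whereas a plain function like root_mean_squared_error is simply computed batch by batch and averaged.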


I’m not sure if it’s due to changes in the code over the past few months, but I’m getting errors running the following:

data = ImageDataBunch.from_df("./", df=df2, valid_pct = 0.2, size=224,\
                              ds_tfms=get_transforms(), num_workers=0,)\
.normalize(imagenet_stats).label_from_df(cols=1, label_cls=FloatList)

class MSELossFlat(nn.MSELoss):
    "Same as `nn.MSELoss`, but flattens input and target."
    def forward(self, input:Tensor, target:Tensor) -> Rank0Tensor:
        return super().forward(input.view(-1), target.view(-1))

learn = create_cnn(data, models.resnet34)

The Error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-82-f600890a6412> in <module>
      1 
----> 2 learn = create_cnn(data, models.resnet34)
      3 learn.loss = MSELossFlat

d:\Anaconda3\lib\site-packages\fastai\vision\learner.py in create_cnn(data, arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, bn_final, **learn_kwargs)
     75     head = custom_head or create_head(nf, data.c, lin_ftrs, ps=ps, bn_final=bn_final)
     76     model = nn.Sequential(body, head)
---> 77     learn = Learner(data, model, **learn_kwargs)
     78     learn.split(ifnone(split_on,meta['split']))
     79     if pretrained: learn.freeze()

<string> in __init__(self, data, model, opt_func, loss_func, metrics, true_wd, bn_wd, wd, train_bn, path, model_dir, callback_fns, callbacks, layer_groups)

d:\Anaconda3\lib\site-packages\fastai\basic_train.py in __post_init__(self)
    151         self.path = Path(ifnone(self.path, self.data.path))
    152         (self.path/self.model_dir).mkdir(parents=True, exist_ok=True)
--> 153         self.model = self.model.to(self.data.device)
    154         self.loss_func = ifnone(self.loss_func, self.data.loss_func)
    155         self.metrics=listify(self.metrics)

d:\Anaconda3\lib\site-packages\fastai\data_block.py in __getattr__(self, k)
    586         res = getattr(y, k, None)
    587         if res is not None: return res
--> 588         raise AttributeError(k)
    589 
    590     def __setstate__(self,data:Any): self.__dict__.update(data)

AttributeError: device

I’ve tried adding the following line before create_cnn():

data.device = torch.device('cpu')

And ran the following:

data.device = torch.device('cpu')
learn = create_cnn(data, models.resnet34)
learn.loss = MSELossFlat
learn.fit_one_cycle(4)

But now it gives me a new error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-89-495233eaf2b4> in <module>
----> 1 learn.fit_one_cycle(4)

d:\Anaconda3\lib\site-packages\fastai\train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, tot_epochs, start_epoch)
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start, tot_epochs=tot_epochs, 
     21                                        start_epoch=start_epoch))
---> 22     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     23 
     24 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, wd:float=None):

d:\Anaconda3\lib\site-packages\fastai\basic_train.py in fit(self, epochs, lr, wd, callbacks)
    174         if not getattr(self, 'opt', False): self.create_opt(lr, wd)
    175         else: self.opt.lr,self.opt.wd = lr,wd
--> 176         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    177         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
    178             callbacks=self.callbacks+callbacks)

d:\Anaconda3\lib\site-packages\fastai\basic_train.py in <listcomp>(.0)
    174         if not getattr(self, 'opt', False): self.create_opt(lr, wd)
    175         else: self.opt.lr,self.opt.wd = lr,wd
--> 176         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    177         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
    178             callbacks=self.callbacks+callbacks)

d:\Anaconda3\lib\site-packages\fastai\basic_train.py in __init__(self, learn)
    382         super().__init__(learn)
    383         self.opt = self.learn.opt
--> 384         self.train_dl = self.learn.data.train_dl
    385         self.no_val,self.silent = False,False
    386 

d:\Anaconda3\lib\site-packages\fastai\data_block.py in __getattr__(self, k)
    586         res = getattr(y, k, None)
    587         if res is not None: return res
--> 588         raise AttributeError(k)
    589 
    590     def __setstate__(self,data:Any): self.__dict__.update(data)

AttributeError: train_dl

Any ideas? If necessary, I’ll provide a dataset for you to play with to hopefully reproduce this (but it’s getting late for me).

Also, here are my specs:

Windows 10
Python 3.6
PyTorch 1.0.0
fastai 1.0.45


You’re mixing a bit of the data block API (the final label_from_df) with a factory method. That can’t work; you have to pick which API you want to use :wink:
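For illustration, a factory-only version might look like this sketch (it assumes the label column of df2 has a float dtype, in which case fastai v1 infers FloatList on its own):

data = ImageDataBunch.from_df("./", df=df2, fn_col=0, label_col=1,
                              valid_pct=0.2, size=224,
                              ds_tfms=get_transforms(),
                              num_workers=0).normalize(imagenet_stats)

Here nothing from the data block API is chained on the result; normalize is fine because it’s a DataBunch method, not a data block step.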

Ok, am I still mixing them this time around?

data = (ImageDataBunch.from_df("./", df=df2, num_workers=0,)\
.label_from_df(cols=1, label_cls=FloatList)
.transform(tfms = get_transforms(), size=224))

data.normalize(imagenet_stats)

But I’m still getting an error:

d:\Anaconda3\lib\site-packages\fastai\basic_data.py:260: UserWarning: It's not possible to collate samples of your dataset together in a batch.
Shapes of the inputs/targets:
[[torch.Size([3, 58, 58]), torch.Size([3, 25, 25]), torch.Size([3, 92, 92]), torch.Size([3, 99, 99]), torch.Size([3, 102, 102]), torch.Size([3, 92, 92]), torch.Size([3, 103, 103]), torch.Size([3, 52, 52]), torch.Size([3, 142, 142]), torch.Size([3, 133, 133]), torch.Size([3, 91, 91]), torch.Size([3, 70, 70]), torch.Size([3, 73, 73]), torch.Size([3, 99, 99]), torch.Size([3, 102, 102]), torch.Size([3, 108, 108]), torch.Size([3, 44, 44]), torch.Size([3, 22, 22]), torch.Size([3, 76, 76]), torch.Size([3, 104, 104]), torch.Size([3, 84, 84]), torch.Size([3, 74, 74]), torch.Size([3, 141, 141]), torch.Size([3, 56, 56]), torch.Size([3, 105, 105]), torch.Size([3, 81, 81]), torch.Size([3, 60, 60]), torch.Size([3, 93, 93]), torch.Size([3, 62, 62]), torch.Size([3, 61, 61]), torch.Size([3, 97, 97]), torch.Size([3, 90, 90]), torch.Size([3, 79, 79]), torch.Size([3, 94, 94]), torch.Size([3, 81, 81]), torch.Size([3, 24, 24]), torch.Size([3, 104, 104]), torch.Size([3, 121, 121]), torch.Size([3, 39, 39]), torch.Size([3, 94, 94]), torch.Size([3, 117, 117]), torch.Size([3, 43, 43]), torch.Size([3, 25, 25]), torch.Size([3, 100, 100]), torch.Size([3, 91, 91]), torch.Size([3, 124, 124]), torch.Size([3, 88, 88]), torch.Size([3, 72, 72]), torch.Size([3, 27, 27]), torch.Size([3, 98, 98]), torch.Size([3, 80, 80]), torch.Size([3, 41, 41]), torch.Size([3, 61, 61]), torch.Size([3, 91, 91]), torch.Size([3, 83, 83]), torch.Size([3, 58, 58]), torch.Size([3, 91, 91]), torch.Size([3, 80, 80]), torch.Size([3, 76, 76]), torch.Size([3, 97, 97]), torch.Size([3, 102, 102]), torch.Size([3, 67, 67]), torch.Size([3, 92, 92]), torch.Size([3, 52, 52])], [(), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), ()]]
  warn(message)
You can deactivate this warning by passing `no_check=True`.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
d:\Anaconda3\lib\site-packages\fastai\data_block.py in _check_kwargs(ds, tfms, **kwargs)
    537         x = ds[0]
--> 538         try: x.apply_tfms(tfms, **kwargs)
    539         except Exception as e:

d:\Anaconda3\lib\site-packages\fastai\vision\image.py in apply_tfms(self, tfms, do_resolve, xtra, size, resize_method, mult, padding_mode, mode, remove_out)
    100         xtra = ifnone(xtra, {})
--> 101         tfms = sorted(listify(tfms), key=lambda o: o.tfm.order)
    102         default_rsz = ResizeMethod.SQUISH if (size is not None and is_listy(size)) else ResizeMethod.CROP

d:\Anaconda3\lib\site-packages\fastai\vision\image.py in <lambda>(o)
    100         xtra = ifnone(xtra, {})
--> 101         tfms = sorted(listify(tfms), key=lambda o: o.tfm.order)
    102         default_rsz = ResizeMethod.SQUISH if (size is not None and is_listy(size)) else ResizeMethod.CROP

AttributeError: 'list' object has no attribute 'tfm'

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-11-d0c054d109fc> in <module>
      1 data = (ImageDataBunch.from_df("./", df=df2, num_workers=0,)\
      2 .label_from_df(cols=1, label_cls=FloatList)
----> 3 .transform(tfms = get_transforms(), size=224))
      4 
      5 data.normalize(imagenet_stats)

d:\Anaconda3\lib\site-packages\fastai\data_block.py in transform(self, tfms, tfm_y, **kwargs)
    661     def transform(self, tfms:TfmList, tfm_y:bool=None, **kwargs):
    662         "Set the `tfms` and `tfm_y` value to be applied to the inputs and targets."
--> 663         _check_kwargs(self.x, tfms, **kwargs)
    664         if tfm_y is None: tfm_y = self.tfm_y
    665         if tfm_y: _check_kwargs(self.y, tfms, **kwargs)

d:\Anaconda3\lib\site-packages\fastai\data_block.py in _check_kwargs(ds, tfms, **kwargs)
    538         try: x.apply_tfms(tfms, **kwargs)
    539         except Exception as e:
--> 540             raise Exception(f"It's not possible to apply those transforms to your dataset:\n {e}")
    541 
    542 class LabelList(Dataset):

Exception: It's not possible to apply those transforms to your dataset:
 'list' object has no attribute 'tfm'

ImageDataBunch.from_df is a factory method to construct a DataBunch. You wanted to write ImageList.from_df for the data block API, I think.
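Putting that together, a data-block-only version could look like this sketch (fastai v1; split_by_rand_pct is assumed to exist in your version, older releases called it random_split_by_pct):

data = (ImageList.from_df(df2, path="./", cols=0)
        .split_by_rand_pct(0.2)
        .label_from_df(cols=1, label_cls=FloatList)
        .transform(get_transforms(), size=224)
        .databunch(bs=64, num_workers=0)
        .normalize(imagenet_stats))

Doing the size=224 resize in transform, before databunch, also avoids the “not possible to collate” warning above, since every image then reaches the batch at the same size.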


Using a custom head solves the problem. Try something like this (note that the loss must be instantiated, that error_rate is a classification metric so a regression metric fits better here, and that the Linear layer’s in_features must match the flattened body output, i.e. 512*7*7 for a resnet18 body at 224px):

learn = cnn_learner(data, models.resnet18, loss_func=nn.L1Loss(),
                    metrics=[root_mean_squared_error],
                    custom_head=nn.Sequential(Flatten(), nn.Linear(512*7*7, 1)))


I’m trying to implement the ‘BBox only’ part of lesson 8 from the 2018 course using fastai2. I’m not sure if it’s relevant, but I’m posting it here because I’m getting the exact same error.

My implementation:

def get_bbox_dls(df, sz=128, bs=128):
    getters = [lambda o: path/'train'/o,
               lambda o: img2bbox[o],
               lambda o: ['']]
    dblock = DataBlock(
        blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
        get_items=get_train_imgs,
        getters=getters, n_inp=1,
        splitter=RandomSplitter(seed=47),
        item_tfms=Resize(sz, method='squish'),
        batch_tfms=[*aug_transforms(), Normalize.from_stats(*imagenet_stats)])
    return dblock.dataloaders(df, bs=bs)

img2bbox has the mapping of img_path to the bbox of the largest object. I’m using BBoxLblBlock to reuse the underlying bb_pad implementation, but the model isn’t supposed to predict any label yet. I’ve modified L1Loss accordingly to work with both ys:

class CustomL1Loss(nn.L1Loss):
    def forward(self, input, bbox_targets, lbl_targets):
        return F.l1_loss(input, bbox_targets, reduction=self.reduction)

Some debugging I’ve done: the bbox_targets coming out of the dataloaders have shape (bs, 1, 4):

_,x,_ = dls.one_batch(); x.shape

Working with random tensors causes no issues with CustomL1Loss:

cust_l1 = CustomL1Loss()
inp = torch.randn(8,1,4)
op = torch.randn(8,1,4)
op2 = torch.randn(8,1)
cust_l1(inp,op,op2)

EDIT: Here is the whole stack trace of the error:

RuntimeError                              Traceback (most recent call last)

<ipython-input-52-bb1de44e7349> in <module>()
----> 1 learn.fit_one_cycle(1,lr_max=1e-4)

7 frames

/usr/local/lib/python3.6/dist-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    110     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    111               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 112     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    113 
    114 # Cell

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    188                     try:
    189                         self.epoch=epoch;          self('begin_epoch')
--> 190                         self._do_epoch_train()
    191                         self._do_epoch_validate()
    192                     except CancelEpochException:   self('after_cancel_epoch')

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _do_epoch_train(self)
    161         try:
    162             self.dl = self.dls.train;                        self('begin_train')
--> 163             self.all_batches()
    164         except CancelTrainException:                         self('after_cancel_train')
    165         finally:                                             self('after_train')

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in all_batches(self)
    139     def all_batches(self):
    140         self.n_iter = len(self.dl)
--> 141         for o in enumerate(self.dl): self.one_batch(*o)
    142 
    143     def one_batch(self, i, b):

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in one_batch(self, i, b)
    147             self.pred = self.model(*self.xb);                self('after_pred')
    148             if len(self.yb) == 0: return
--> 149             self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')
    150             if not self.training: return
    151             self.loss.backward();                            self('after_backward')

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in __init__(self, size_average, reduce, reduction)
     83 
     84     def __init__(self, size_average=None, reduce=None, reduction='mean'):
---> 85         super(L1Loss, self).__init__(size_average, reduce, reduction)
     86 
     87     def forward(self, input, target):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in __init__(self, size_average, reduce, reduction)
     10         super(_Loss, self).__init__()
     11         if size_average is not None or reduce is not None:
---> 12             self.reduction = _Reduction.legacy_get_string(size_average, reduce)
     13         else:
     14             self.reduction = reduction

/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py in legacy_get_string(size_average, reduce, emit_warning)
     34         reduce = True
     35 
---> 36     if size_average and reduce:
     37         ret = 'mean'
     38     elif reduce:

RuntimeError: bool value of Tensor with more than one value is ambiguous

As we very often say, please do not post just the last part of the error message but the whole stack trace.

I’ve posted the whole stack trace for your reference :slightly_smiling_face:

It looks like your loss is not properly initialized: instead of being instantiated with the () beforehand, it receives your prediction/target tensors in its __init__ when the learner calls it.


My bad :sweat_smile: I passed the loss to the learner as
learn = cnn_learner(dls, resnet34, loss_func=CustomL1Loss)
instead of
learn = cnn_learner(dls, resnet34, loss_func=CustomL1Loss())
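For anyone hitting the same trap, a tiny standalone sketch of why the class form blows up:

import torch
import torch.nn as nn

pred, targ = torch.randn(8, 1, 4), torch.randn(8, 1, 4)

loss_fn = nn.L1Loss()    # instance: __call__ dispatches to forward -> a loss tensor
print(loss_fn(pred, targ))

# Passing the class instead means fastai effectively calls nn.L1Loss(pred, targ),
# so the tensors land in __init__ as (size_average, reduce) and PyTorch raises
# "bool value of Tensor with more than one value is ambiguous".
# nn.L1Loss(pred, targ)  # uncomment to reproduce the error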

One more thing I observed: this kind of mistake can also cause a CUDA out-of-memory error, since, in my case, a variable that was supposed to hold a bool value was trying to store a 3D tensor :sweat_smile: with a new instance created on every iteration.

So I’d say “CUDA: out of memory” could be a hint that you’re doing something similar.
