Prediction of a scalar with a CNN


#21

Would you be willing to provide an example of predicting scalars from images using the standard ImageDataBunch.from_df() ?

I have a very similar problem. I have a data frame with two columns, the first “filename” and the second “age”, saved as a string object and int64 respectively.

When I run the following code:

data = ImageDataBunch.from_df(path=path, df=df, size=224, bs=64)
arch = models.resnet34
learn = create_cnn(data, arch, metrics=MSELossFlat)
lr = 5e-2
learn.fit_one_cycle(1, slice(lr))

I get the same error message as jamesp:
RuntimeError: bool value of Tensor with more than one value is ambiguous

EDIT: I think I figured it out. This blog post/code are very helpful:

The key is to use ‘label_cls=FloatList’ when constructing the labels. This tells fastai to expect a regression problem, e.g:

data = (ImageItemList.from_csv(path, ‘csv_file’, folder=’’, suffix=’’)
.random_split_by_pct(0.2)
.label_from_df(cols=1, label_cls=FloatList)
.transform(tfms, size=224)
.databunch())
data.normalize(imagenet_stats)


(Dilip Thiagarajan) #22

Could anyone provide some insight on how this might be extended to predict multiple scalar outputs? I’m having trouble figuring out a way to do this.

Another post with the same question that is unresolved.


#23

If you pass a list instead of 1 to cols, it should work.


(Lankinen) #24

Not working on my problem. I needed to change a lot of code and I think now it is not working for other problems. I will send code here if everything worked well.


#25

Should be fixed in master now.


(Lankinen) #27

I’m acctualy using this for text data so I’m not sure is this something related to that but I got this error when I tried to train the model.

RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'other'

targs: tensor([[0., 0., 1.],
[0., 1., 0.],
[1., 0., 0.],
[1., 0., 0.],
[1., 0., 0.],
[0., 1., 0.]], device=‘cuda:0’)
input: tensor([[2],
[2],
[0],
[0],
[0],
[0]], device=‘cuda:0’)

Targets can be anything from 0 to 1 and I need three different values.

data = (TextList.from_csv('.', text_data_with_three_targets.csv', vocab=data.vocab)
             .random_split_by_pct()
             .label_from_df(cols=[1,2,3],label_cls=FloatList)
             .databunch(bs=32,no_check=True))

(Dilip Thiagarajan) #29

When I try that, I run into the following error when I try to create my DataBunch:

TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: double, float, float16, int64, int32, and uint8.

This occurs when the summary (repr) of the DataBunch is about to be outputted, specifically in tensor() method, where there still seems to be an issue as referred to in this line:
# XXX: Pytorch bug in dataloader using num_workers>0; TODO: create repro and report

How can I get around this?


#30

It’s hard to help without the rest of your code and the full error message.


(Lankinen) #31

I will provide those tomorrow and I will also try to debug it by myself.


(Dilip Thiagarajan) #32

I’ve put my example here, with the error message shown in the 4th cell.


#33

You don’t have the latest code, so it’s normal it doesn’t work.


(Lankinen) #36
path = Path('/path/to/texts')
data = (TextList.from_folder(path,extensions='.txt')
            .random_split_by_pct(0.1)
            .label_for_lm()
            .databunch(bs=32))
learn = language_model_learner(data, pretrained_model=URLs.WT103_1, drop_mult=0.3)
learn.load(Path('/path/to/fine_tuned/language_model'))

data2 = (TextList.from_csv('.', 'example.csv', vocab=data.vocab)
             .random_split_by_pct()
             .label_from_df(cols=[1,2,3],label_cls=FloatList)
             .databunch(bs=32))

learn = text_classifier_learner(data2,drop_mult=0.5)
learn.load_encoder('/path/to/fine_tuned_enc')
learn.freeze()
learn.fit_one_cycle(1, 2e-2, moms=(0.8,0.7))

Error message:
RuntimeError Traceback (most recent call last)
in ()
----> 1 learn.fit_one_cycle(1, 2e-2, moms=(0.8,0.7))

~/Downloads/fastai-master (1)/fastai-master/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
     21                                         pct_start=pct_start, **kwargs))
---> 22     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     23 
     24 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):

~/Downloads/fastai-master (1)/fastai-master/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    171         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    172         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 173             callbacks=self.callbacks+callbacks)
    174 
    175     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/Downloads/fastai-master (1)/fastai-master/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     93     except Exception as e:
     94         exception = e
---> 95         raise e
     96     finally: cb_handler.on_train_end(exception)
     97 

~/Downloads/fastai-master (1)/fastai-master/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     88             if not data.empty_val:
     89                 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 90                                        cb_handler=cb_handler, pbar=pbar)
     91             else: val_loss=None
     92             if cb_handler.on_epoch_end(val_loss): break

~/Downloads/fastai-master (1)/fastai-master/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
     53             if not is_listy(yb): yb = [yb]
     54             nums.append(yb[0].shape[0])
---> 55             if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
     56             if n_batch and (len(nums)>=n_batch): break
     57         nums = np.array(nums, dtype=np.float32)

~/Downloads/fastai-master (1)/fastai-master/fastai/callback.py in on_batch_end(self, loss)
    248         "Handle end of processing one batch with `loss`."
    249         self.state_dict['last_loss'] = loss
--> 250         stop = np.any(self('batch_end', not self.state_dict['train']))
    251         if self.state_dict['train']:
    252             self.state_dict['iteration'] += 1

~/Downloads/fastai-master (1)/fastai-master/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
    196     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    197         "Call through to all of the `CallbakHandler` functions."
--> 198         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    199         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    200 

~/Downloads/fastai-master (1)/fastai-master/fastai/callback.py in <listcomp>(.0)
    196     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    197         "Call through to all of the `CallbakHandler` functions."
--> 198         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    199         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    200 

~/Downloads/fastai-master (1)/fastai-master/fastai/callback.py in on_batch_end(self, last_output, last_target, **kwargs)
    283         if not is_listy(last_target): last_target=[last_target]
    284         self.count += last_target[0].size(0)
--> 285         self.val += last_target[0].size(0) * self.func(last_output, *last_target).detach().cpu()
    286 
    287     def on_epoch_end(self, **kwargs):

~/Downloads/fastai-master (1)/fastai-master/fastai/metrics.py in accuracy(input, targs)
     29     print('targs:',targs)
     30     print('input:',input)
---> 31     return (input==targs).float().mean()
     32 
     33 def accuracy_thresh(y_pred:Tensor, y_true:Tensor, thresh:float=0.5, sigmoid:bool=True)->Rank0Tensor:

RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'other'

example.csv
image


#37

Ah, this is because of the metrics always set to accuracy. You can remove it by passing metrics=[] in your learner creation. I’ll fix it this morning.


(Lankinen) #38

Thank you @sgugger ! It is working now.

Do you think there might be idea to create own accuracy for this kind of multi output problem? It could calculate the accuracy for every pair and then calculate the mean of those.