Prediction of a scalar with a CNN

I’m interested in using a CNN to predict a scalar value with MSE loss. (Example: a cat’s age from a photo of the cat.)

I was hoping to hook into the pre-existing tools, and it seems like modifying the _from_csv or _from_folder methods might be the right way to go. However, the data loaders seem to be pretty strongly label-oriented. Is creating a scalar learning approach as simple as creating a custom data loader that leaves the scalar value as-is (without making it into a label), then using MSE loss? Or are there more pieces here that assume labels?

1 Like

It’s more of a custom dataset you need to create. Just something that implements the methods

  • __len__ (returns the number of items)
  • __getitem__ (takes an index and returns x, y)

In your case, just follow the example of ImageDataset and it should work without any problem.
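A minimal sketch of the idea (the class name and column handling here are placeholders, not fastai API):

import numpy as np
from torch.utils.data import Dataset
from fastai.vision import open_image  # fastai v1's image loader

class ScalarImageDataset(Dataset):
    "Return (image, float scalar) pairs for regression."
    def __init__(self, paths, scalars):
        self.paths = list(paths)
        self.scalars = np.array(scalars, dtype=np.float32)

    def __len__(self):
        return len(self.scalars)  # number of items

    def __getitem__(self, i):
        return open_image(self.paths[i]), self.scalars[i]  # x, y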

2 Likes

Thanks! I’ll check it out, and will share it here if I make something that works satisfactorily.

1 Like

Maybe try following the pascal notebook in DL2 on predicting a single bounding box. It switches ImageClassifierData to continuous labels by passing continuous=True:

md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, continuous=True, bs=4)

and creates a custom head that allows for predicting regression outputs (in this case, the 4 bounding-box coordinates):

head_reg4 = nn.Sequential(Flatten(), nn.Linear(25088,4))
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4) # custom head added
learn.opt_fn = optim.Adam
learn.crit = nn.L1Loss()

Perhaps in your case you need to change the final dense layer to nn.Linear(25088,1) and set learn.crit = nn.MSELoss() (PyTorch has no nn.L2Loss; nn.MSELoss is the L2 loss).
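Putting that together (a sketch against the same pre-v1 API, untested):

# single-output regression head instead of the four bbox coordinates
head_reg1 = nn.Sequential(Flatten(), nn.Linear(25088, 1))
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg1)
learn.opt_fn = optim.Adam
learn.crit = nn.MSELoss()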

2 Likes

Thank you, but I think that the pascal example is for the pre-v1 FastAI library. I’m trying to do everything under the new v1 API since I suspect that will have better longevity.

OMG is DL2 outdated? I thought I was still ahead of the curve 🙂

1 Like

When trying to predict a scalar from an image (e.g., the age of a dog), I’m running into a dimension mismatch: input and target shapes do not match: input [100 x 1], target [100]. Hoping there is something obvious I’m doing wrong.

To create the Dataset, I’m consuming a pandas dataframe, telling it which column contains the file_path and which contains the scalar (dependent_variable), making sure that it’s an np.float32. I assigned mse_loss as the loss_func. Because I am subclassing ImageDataset, I need to tell it how many classes there are; I’m providing an array with one entry (0).

class ImageScalarDataset(ImageDataset):
    def __init__(self, df:DataFrame, path_column:str='file_path', dependent_variable:str=None):
        super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))
        self.loss_func = torch.nn.functional.mse_loss
        self.classes = [0]

    def __len__(self)->int:
        return len(self.y)
    
    def __getitem__(self, i):
        # return x, y | where x is an image, and y is the scalar
        return open_image(self.x[i]), self.y[i]
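I created a dataset each for training, validation, and testing (a sketch; df_train/df_valid/df_test and the 'age' column are stand-ins for my actual dataframes):

# hypothetical dataframes, each with 'file_path' and 'age' columns
dat_train = ImageScalarDataset(df_train, dependent_variable='age')
dat_valid = ImageScalarDataset(df_valid, dependent_variable='age')
dat_test  = ImageScalarDataset(df_test,  dependent_variable='age')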

Then I instantiated the DataBunch like so:

data = ImageDataBunch.create(dat_train, dat_valid, dat_test,
                             ds_tfms=get_transforms(do_flip=False, max_warp=0), #max_rotate=0,
                             bs=100,
                             size=256)

I then tried to run the ConvLearner:

learn2 = ConvLearner(data, 
                     tvm.resnet50, 
                     metrics=[accuracy], 
                     loss_fn=F.mse_loss,
                     callback_fns=ShowGraph)
learn2.lr_find(start_lr=1e-5, end_lr=10)
learn2.recorder.plot()

However, as noted above, I’m having dimensional issues:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-64-19da9fe76431> in <module>()
      5 #                      loss_fn=CrossEntropyFlat(torch.FloatTensor([0.5, 0.5]).cuda()),
      6                      callback_fns=ShowGraph)
----> 7 learn2.lr_find(start_lr=1e-5, end_lr=10)
      8 learn2.recorder.plot()

/app/fastai/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, **kwargs)
     24     cb = LRFinder(learn, start_lr, end_lr, num_it)
     25     a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 26     learn.fit(a, start_lr, callbacks=[cb], **kwargs)
     27 
     28 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:

/app/fastai/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    136         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    137         fit(epochs, self.model, self.loss_fn, opt=self.opt, data=self.data, metrics=self.metrics,
--> 138             callbacks=self.callbacks+callbacks)
    139 
    140     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
     89     except Exception as e:
     90         exception = e
---> 91         raise e
     92     finally: cb_handler.on_train_end(exception)
     93 

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
     79             for xb,yb in progress_bar(data.train_dl, parent=pbar):
     80                 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 81                 loss = loss_batch(model, xb, yb, loss_fn, opt, cb_handler)[0]
     82                 if cb_handler.on_batch_end(loss): break
     83 

/app/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_fn, opt, cb_handler, metrics)
     21 
     22     if not loss_fn: return to_detach(out), yb[0].detach()
---> 23     loss = loss_fn(out, *yb)
     24     mets = [f(out,*yb).detach().cpu() for f in metrics] if metrics is not None else []
     25 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce, reduction)
   1812     if size_average is not None or reduce is not None:
   1813         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 1814     return _pointwise_loss(lambda a, b: (a - b) ** 2, torch._C._nn.mse_loss, input, target, reduction)
   1815 
   1816 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in _pointwise_loss(lambd, lambd_optimized, input, target, reduction)
   1774         return torch.mean(d) if reduction == 'elementwise_mean' else torch.sum(d)
   1775     else:
-> 1776         return lambd_optimized(input, target, _Reduction.get_enum(reduction))
   1777 
   1778 

RuntimeError: input and target shapes do not match: input [100 x 1], target [100] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:12

Of note, when I modify this to consume typical image-classification data instead of scalars, it works fine. Any ideas?

1 Like

Just guessing, but would flattening solve the problem?

def mse_loss1(input, target):
    input = input.view(-1)  # flatten [bs, 1] predictions to [bs] to match the target
    return F.mse_loss(input, target)

learn2 = ConvLearner(data, 
                     tvm.resnet50, 
                     metrics=[accuracy], 
                     loss_fn=mse_loss1,
                     callback_fns=ShowGraph)
1 Like

Wow, yes, that was exactly right @wyquek. It trains!

[image: training graph]

I should clarify that it runs, although it doesn’t seem to be extracting much information from the image!

While I don’t know if this is simply a function of my data set, I can say that I took a categorical value that the usual FastAI categorical model can predict with ~99% accuracy, turned it into a floating-point 0.0 or 1.0, and attempted prediction with this scalar approach.

The result was very little information gain (see the scatterplot of predicted vs. truth below). Certainly a cross-entropy loss is more “ideal” for such data, but an MSE loss should still extract some signal. R^2 is ~0.11 (again, on data that is nearly perfectly classified when using the regular models). I’ll have to do more troubleshooting.

[image: scatterplot of predicted vs. truth]
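(For reference, the R^2 values I quote are computed the standard way from the prediction and truth arrays; a sketch, with preds and targs as hypothetical NumPy arrays:)

import numpy as np

def r_squared(preds, targs):
    # coefficient of determination between predictions and ground truth
    ss_res = np.sum((targs - preds) ** 2)
    ss_tot = np.sum((targs - np.mean(targs)) ** 2)
    return 1 - ss_res / ss_tot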

One thing that is interesting is that it seems to aggressively bias toward the majority “class”, even though the set is nearly balanced (~52% vs 48%):

[image: distribution of predictions]

I take it back. The training graphs look really ugly when I let them run longer, but a 24-cycle run nevertheless yielded good results on test data:

Predictions are now bimodal, as they should be:
[image: histogram of predictions]

Error centers around 0 (on a 0-1 scale):
[image: error distribution]

The scatterplot doesn’t eyeball all that well:
[image: scatterplot of predicted vs. truth]

…but all of the density is in the right place:
[image: density plot of predicted vs. truth]

R^2 is 0.92 now.

FYI I added an MSELossFlat loss you can use. Please try it - I haven’t tested it yet.
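Conceptually it’s just a thin wrapper that flattens both tensors before the usual MSE; a sketch of the idea (the library version may differ in details):

import torch.nn as nn

class MSELossFlat(nn.MSELoss):
    "Same as nn.MSELoss, but flattens input and target first."
    def forward(self, input, target):
        return super().forward(input.view(-1), target.view(-1))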

1 Like

Edit: @sgugger pointed out that I needed to call MSELossFlat (I was missing the parens). You can skip to post #18 for the working Dataset.


I updated fastai (I update the code every couple of days, but not every single day) and now neither @wyquek’s code nor yours will run for me. With the same architecture as above, I now get RuntimeError: bool value of Tensor with more than one value is ambiguous:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-60-ec98ea40e834> in <module>()
      5 #                      loss_func=data.loss_func,
      6                      callback_fns=ShowGraph)
----> 7 learn2.lr_find(start_lr=1e-7, end_lr=100)
      8 learn2.recorder.plot()

/app/fastai/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, **kwargs)
     25     cb = LRFinder(learn, start_lr, end_lr, num_it)
     26     a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 27     learn.fit(a, start_lr, callbacks=[cb], **kwargs)
     28 
     29 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:

/app/fastai/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    135         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    136         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 137             callbacks=self.callbacks+callbacks)
    138 
    139     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87     except Exception as e:
     88         exception = e
---> 89         raise e
     90     finally: cb_handler.on_train_end(exception)
     91 

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     77             for xb,yb in progress_bar(data.train_dl, parent=pbar):
     78                 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 79                 loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)[0]
     80                 if cb_handler.on_batch_end(loss): break
     81 

/app/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     20 
     21     if not loss_func: return to_detach(out), yb[0].detach()
---> 22     loss = loss_func(out, *yb)
     23 
     24     if opt is not None:

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in __init__(self, size_average, reduce, reduction)
    419     """
    420     def __init__(self, size_average=None, reduce=None, reduction='elementwise_mean'):
--> 421         super(MSELoss, self).__init__(size_average, reduce, reduction)
    422 
    423     def forward(self, input, target):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in __init__(self, size_average, reduce, reduction)
     13         super(_Loss, self).__init__()
     14         if size_average is not None or reduce is not None:
---> 15             self.reduction = _Reduction.legacy_get_string(size_average, reduce)
     16         else:
     17             self.reduction = reduction

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in legacy_get_string(size_average, reduce, emit_warning)
     45             reduce = True
     46 
---> 47         if size_average and reduce:
     48             ret = 'elementwise_mean'
     49         elif reduce:

RuntimeError: bool value of Tensor with more than one value is ambiguous

I haven’t changed my Dataset (except to experiment with MSELossFlat) from the original, which is:

class ImageScalarDataset(ImageDataset):
    def __init__(self, df:DataFrame, path_column:str='file_path', dependent_variable:str=None):
        super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))
        self.loss_func = layers.MSELossFlat
        self.classes = [0]

    def __len__(self)->int:
        return len(self.y)
    
    def __getitem__(self, i):
        # return x, y | where x is an image, and y is the scalar
        return open_image(self.x[i]), self.y[i]

On the plus side, I get the same error whether I use @jeremy’s or @wyquek’s loss function, so I suspect there was some other internal change that my Dataset is not handling correctly. Perhaps something about how I am manually setting self.classes = [0]?

Can you do a %debug and print the value of yb after going up three times (to the line loss = loss_func(out, *yb))? It seems from the rest of the error message that you may have a target that is a list.

> /app/fastai/fastai/basic_train.py(22)loss_batch()
     20 
     21     if not loss_func: return to_detach(out), yb[0].detach()
---> 22     loss = loss_func(out, *yb)
     23 
     24     if opt is not None:

ipdb> yb
[tensor([0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 1., 0., 0., 0., 1.,
        1., 1., 0., 1., 0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 1., 0., 1., 1.,
        1., 1., 0., 1., 1., 1., 1., 0., 1., 0., 1., 1., 1., 1., 0., 1., 0., 0.,
        1., 1., 1., 1., 1., 1., 0., 1., 0., 1., 1., 0., 1., 1., 1., 0., 0., 0.,
        0., 1., 1., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 0., 0.,
        0., 1., 0., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 1., 1., 0., 1., 1.,
        1., 0., 1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 1., 0., 1., 1., 0., 0.,
        1., 0.], device='cuda:0')]
ipdb> type(yb)
<class 'list'>

Hmm, yes indeed. But my Dataset is feeding this to the superclass:

super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))

Where the righthand argument is:

np.array(merged[trait], dtype=np.float32).shape

Which looks like:

(18906,)

No, that’s normal (there’s only one element; I thought there would be more).
Ah, I think I understand your problem: did you put loss_func = MSELossFlat() (with parentheses)? Without them, loss_func is the class itself, so loss_func(out, *yb) calls its constructor with your tensors as the size_average and reduce arguments, which is exactly where your traceback lands.

2 Likes

Oh. No, I did not!

Welp, that fixed it. Thank you. Updated code, internal notes-to-self and all:

class ImageScalarDataset(ImageDataset):
    def __init__(self, df:DataFrame, path_column:str='file_path', dependent_variable:str=None):
        
        # The superclass does nice things for us like tensorizing the numpy
        # input
        super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))

        # Old FastAI uses loss_fn, new FastAI uses loss_func
        self.loss_func = layers.MSELossFlat()
        self.loss_fn = self.loss_func

        # We have only one "class" (i.e., the single output scalar)
        self.classes = [0]

    def __len__(self)->int:
        return len(self.y)
    
    def __getitem__(self, i):
        # return x, y | where x is an image, and y is the scalar
        return open_image(self.x[i]), self.y[i]
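With that change, the learner setup from earlier runs as-is (a sketch reusing the same names as above):

learn2 = ConvLearner(data,
                     tvm.resnet50,
                     loss_fn=layers.MSELossFlat(),  # an instance (note the parens), not the class
                     callback_fns=ShowGraph)
learn2.fit_one_cycle(24)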
1 Like

Hi - I have a similar use case to yours. I want to predict the age of a photo and have processed the data to have the age number as the filename. My approach was to build the image DataBunch and just use a different loss function (MSELoss rather than cross-entropy), and I hoped that would do it, but I ran into issues:

data = (ImageFileList.from_folder(path_img)
        .label_from_func(get_float_labels)
        .random_split_by_pct(valid_pct=0.2)
        .datasets()
        .transform(get_transforms(), size=224)
        .databunch(bs=bs).normalize(imagenet_stats)
       )

My custom loss class looks like this:

class MSELossFlat2(nn.MSELoss):
    "Same as `nn.MSELoss`, but flattens input and target."
    def forward(self, input:Tensor, target:Tensor) -> Rank0Tensor:
        return super().forward(input.view(-1).float(), target.view(-1).float() )

learn = create_cnn(data, models.resnet34)
learn.loss_func = MSELossFlat2()
learn.fit_one_cycle(4)

If I don’t modify my MSELoss, then I get an error about expecting a Float target and getting a Long one. If I do this, I get a mismatch error:

RuntimeError: input and target shapes do not match: input [7040], target [64] at /opt/conda/conda-bld/pytorch-nightly_1540036376816/work/aten/src/THCUNN/generic/MSECriterion.cu:12

Any tips would be appreciated. Thanks!

Did anyone work on this recently? I want to predict a scalar with an image too.

I have a dataframe with the filenames in one column and the scalar I want to predict on another column.

I can’t get the custom dataset to work. Where does the ImageDataset class come from?

Is there any other/easier way to get this to work in fastai v1? In fastai 0.7 you just had to change one argument to the ImageClassifierData factory method.

EDIT:
nvm, I was just too stupid to save my scalar as a float instead of an integer. Now the last layer has one output, as expected. It seems to work with the standard ImageDataBunch.from_df().
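For anyone else landing here, a minimal sketch of that approach (untested; the file and column names are hypothetical):

import pandas as pd
from fastai.vision import *

df = pd.read_csv('labels.csv')           # hypothetical: 'name' = image path, 'age' = scalar
df['age'] = df['age'].astype('float32')  # float labels -> regression, single model output

data = ImageDataBunch.from_df(path_img, df, fn_col='name', label_col='age',
                              ds_tfms=get_transforms(), size=224, bs=64)
learn = create_cnn(data, models.resnet34)
learn.loss_func = MSELossFlat()          # flattens [bs,1] preds to match [bs] targets
learn.fit_one_cycle(4)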