Prediction of a scalar with a CNN

I’m interested in using a CNN to predict a scalar value with MSE loss. (Example: a cat’s age from a photo of the cat.)

I was hoping to hook into the pre-existing tools, and it seems like modifying the _from_csv or _from_folder methods might be the right way to go. However, the data loaders seem to be pretty strongly label-oriented. Is creating a scalar learning approach as simple as creating a custom data loader that leaves the scalar value as-is (without making it into a label), then using MSE loss? Or are there more pieces here that assume labels?

1 Like

It’s more of a custom dataset you need to create. Just something that implements the methods

  • __len__ (returns the number of items)
  • __getitem__ (takes an index and returns x, y)

In your case, just follow the example of ImageDataset and it should work without any problem.
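A minimal sketch of the idea (the class name and column handling here are placeholders, not fastai API):

import numpy as np
from torch.utils.data import Dataset
from fastai.vision import open_image  # fastai v1's image loader

class ScalarImageDataset(Dataset):
    "Return (image, float scalar) pairs for regression."
    def __init__(self, paths, scalars):
        self.paths = list(paths)
        self.scalars = np.array(scalars, dtype=np.float32)

    def __len__(self):
        return len(self.scalars)  # number of items

    def __getitem__(self, i):
        return open_image(self.paths[i]), self.scalars[i]  # x, y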

2 Likes

Thanks! I’ll check it out, and will share it here if I make something that works satisfactorily.

1 Like

Maybe try following the pascal notebook in DL2 on predicting a single bounding box. It switches ImageClassifierData to continuous labels by passing continuous=True:

md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, continuous=True, bs=4)

and creates a custom head that allows for predicting regression outputs (in this case, the 4 bounding-box coordinates):

head_reg4 = nn.Sequential(Flatten(), nn.Linear(25088,4))
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4) # custom head added
learn.opt_fn = optim.Adam
learn.crit = nn.L1Loss()

Perhaps in your case you need to change the final dense layer to nn.Linear(25088,1) and set learn.crit = nn.MSELoss() (PyTorch has no nn.L2Loss; nn.MSELoss is the L2 loss).
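Putting that together (a sketch against the same pre-v1 API, untested):

# single-output regression head instead of the four bbox coordinates
head_reg1 = nn.Sequential(Flatten(), nn.Linear(25088, 1))
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg1)
learn.opt_fn = optim.Adam
learn.crit = nn.MSELoss()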

2 Likes

Thank you, but I think that the pascal example is for the pre-v1 FastAI library. I’m trying to do everything under the new v1 API since I suspect that will have better longevity.

OMG is DL2 outdated? I thought I was still ahead of the curve 🙂

1 Like

When trying to predict a scalar from an image (e.g., the age of a dog), I’m running into a dimension mismatch: input and target shapes do not match: input [100 x 1], target [100]. Hoping there is something obvious I’m doing wrong.

To create the Dataset, I’m consuming a pandas dataframe, telling it which column contains the file_path and which contains the scalar (dependent_variable), making sure that it’s an np.float32. I assigned mse_loss as the loss_func. Because I am subclassing ImageDataset, I need to tell it how many classes there are; I’m providing an array with one entry (0).

class ImageScalarDataset(ImageDataset):
    def __init__(self, df:DataFrame, path_column:str='file_path', dependent_variable:str=None):
        super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))
        self.loss_func = torch.nn.functional.mse_loss
        self.classes = [0]

    def __len__(self)->int:
        return len(self.y)
    
    def __getitem__(self, i):
        # return x, y | where x is an image, and y is the scalar
        return open_image(self.x[i]), self.y[i]
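I created a dataset each for training, validation, and testing (a sketch; df_train/df_valid/df_test and the 'age' column are stand-ins for my actual dataframes):

# hypothetical dataframes, each with 'file_path' and 'age' columns
dat_train = ImageScalarDataset(df_train, dependent_variable='age')
dat_valid = ImageScalarDataset(df_valid, dependent_variable='age')
dat_test  = ImageScalarDataset(df_test,  dependent_variable='age')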

Then I instantiated the DataBunch like so:

data = ImageDataBunch.create(dat_train, dat_valid, dat_test,
                             ds_tfms=get_transforms(do_flip=False, max_warp=0), #max_rotate=0,
                             bs=100,
                             size=256)

I then tried to run the ConvLearner:

learn2 = ConvLearner(data, 
                     tvm.resnet50, 
                     metrics=[accuracy], 
                     loss_fn=F.mse_loss,
                     callback_fns=ShowGraph)
learn2.lr_find(start_lr=1e-5, end_lr=10)
learn2.recorder.plot()

However, as noted above, I’m having dimensional issues:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-64-19da9fe76431> in <module>()
      5 #                      loss_fn=CrossEntropyFlat(torch.FloatTensor([0.5, 0.5]).cuda()),
      6                      callback_fns=ShowGraph)
----> 7 learn2.lr_find(start_lr=1e-5, end_lr=10)
      8 learn2.recorder.plot()

/app/fastai/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, **kwargs)
     24     cb = LRFinder(learn, start_lr, end_lr, num_it)
     25     a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 26     learn.fit(a, start_lr, callbacks=[cb], **kwargs)
     27 
     28 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:

/app/fastai/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    136         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    137         fit(epochs, self.model, self.loss_fn, opt=self.opt, data=self.data, metrics=self.metrics,
--> 138             callbacks=self.callbacks+callbacks)
    139 
    140     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
     89     except Exception as e:
     90         exception = e
---> 91         raise e
     92     finally: cb_handler.on_train_end(exception)
     93 

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
     79             for xb,yb in progress_bar(data.train_dl, parent=pbar):
     80                 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 81                 loss = loss_batch(model, xb, yb, loss_fn, opt, cb_handler)[0]
     82                 if cb_handler.on_batch_end(loss): break
     83 

/app/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_fn, opt, cb_handler, metrics)
     21 
     22     if not loss_fn: return to_detach(out), yb[0].detach()
---> 23     loss = loss_fn(out, *yb)
     24     mets = [f(out,*yb).detach().cpu() for f in metrics] if metrics is not None else []
     25 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce, reduction)
   1812     if size_average is not None or reduce is not None:
   1813         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 1814     return _pointwise_loss(lambda a, b: (a - b) ** 2, torch._C._nn.mse_loss, input, target, reduction)
   1815 
   1816 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in _pointwise_loss(lambd, lambd_optimized, input, target, reduction)
   1774         return torch.mean(d) if reduction == 'elementwise_mean' else torch.sum(d)
   1775     else:
-> 1776         return lambd_optimized(input, target, _Reduction.get_enum(reduction))
   1777 
   1778 

RuntimeError: input and target shapes do not match: input [100 x 1], target [100] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:12

Of note, when I modify this to consume typical image-classification data instead of scalars, it works fine. Any ideas?

1 Like

Just guessing, but would flattening solve the problem?

def mse_loss1(input, target):
    input = input.view(-1)  # flatten [bs, 1] predictions to [bs] to match the target
    return F.mse_loss(input, target)

learn2 = ConvLearner(data, 
                     tvm.resnet50, 
                     metrics=[accuracy], 
                     loss_fn=mse_loss1,
                     callback_fns=ShowGraph)
1 Like

Wow, yes, that was exactly right @wyquek. It trains!

[image: training graph]

I should clarify that it runs, although it doesn’t seem to be extracting much information from the image!

While I don’t know if this is simply a function of my data set, I can say that I took a categorical value that the usual FastAI categorical model can predict with ~99% accuracy, turned it into a floating-point 0.0 or 1.0, and attempted prediction with this scalar approach.

The result was very little information gain (see the scatterplot of predicted vs. truth below). Certainly a cross-entropy loss is more “ideal” for such data, but an MSE loss should still extract some signal. R^2 is ~0.11 (again, on data that is nearly perfectly classified when using the regular models). I’ll have to do more troubleshooting.

[image: scatterplot of predicted vs. truth]
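(For reference, the R^2 values I quote are computed the standard way from the prediction and truth arrays; a sketch, with preds and targs as hypothetical NumPy arrays:)

import numpy as np

def r_squared(preds, targs):
    # coefficient of determination between predictions and ground truth
    ss_res = np.sum((targs - preds) ** 2)
    ss_tot = np.sum((targs - np.mean(targs)) ** 2)
    return 1 - ss_res / ss_tot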

One thing that is interesting is that it seems to aggressively bias toward the majority “class”, even though the set is nearly balanced (~52% vs 48%):

[image: distribution of predictions]

I take it back. The training graphs look really ugly when I let them run longer, but a 24-cycle run nevertheless yielded good results on test data:

Predictions are now bimodal, as they should be:
[image: histogram of predictions]

Error centers around 0 (on a 0-1 scale):
[image: error distribution]

The scatterplot doesn’t eyeball all that well:
[image: scatterplot of predicted vs. truth]

…but all of the density is in the right place:
[image: density plot of predicted vs. truth]

R^2 is 0.92 now.

FYI I added an MSELossFlat loss you can use. Please try it - I haven’t tested it yet.
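Conceptually it’s just a thin wrapper that flattens both tensors before the usual MSE; a sketch of the idea (the library version may differ in details):

import torch.nn as nn

class MSELossFlat(nn.MSELoss):
    "Same as nn.MSELoss, but flattens input and target first."
    def forward(self, input, target):
        return super().forward(input.view(-1), target.view(-1))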

1 Like

Edit: @sgugger pointed out that I needed to call MSELossFlat (I was missing the parens). You can skip to post #18 for the working Dataset.


I updated fastai (I update the code every couple of days, but not every single day) and now neither @wyquek’s code nor yours will run for me. With the same architecture as above, I now get RuntimeError: bool value of Tensor with more than one value is ambiguous:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-60-ec98ea40e834> in <module>()
      5 #                      loss_func=data.loss_func,
      6                      callback_fns=ShowGraph)
----> 7 learn2.lr_find(start_lr=1e-7, end_lr=100)
      8 learn2.recorder.plot()

/app/fastai/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, **kwargs)
     25     cb = LRFinder(learn, start_lr, end_lr, num_it)
     26     a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 27     learn.fit(a, start_lr, callbacks=[cb], **kwargs)
     28 
     29 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:

/app/fastai/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    135         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    136         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 137             callbacks=self.callbacks+callbacks)
    138 
    139     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87     except Exception as e:
     88         exception = e
---> 89         raise e
     90     finally: cb_handler.on_train_end(exception)
     91 

/app/fastai/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     77             for xb,yb in progress_bar(data.train_dl, parent=pbar):
     78                 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 79                 loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)[0]
     80                 if cb_handler.on_batch_end(loss): break
     81 

/app/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     20 
     21     if not loss_func: return to_detach(out), yb[0].detach()
---> 22     loss = loss_func(out, *yb)
     23 
     24     if opt is not None:

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in __init__(self, size_average, reduce, reduction)
    419     """
    420     def __init__(self, size_average=None, reduce=None, reduction='elementwise_mean'):
--> 421         super(MSELoss, self).__init__(size_average, reduce, reduction)
    422 
    423     def forward(self, input, target):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in __init__(self, size_average, reduce, reduction)
     13         super(_Loss, self).__init__()
     14         if size_average is not None or reduce is not None:
---> 15             self.reduction = _Reduction.legacy_get_string(size_average, reduce)
     16         else:
     17             self.reduction = reduction

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in legacy_get_string(size_average, reduce, emit_warning)
     45             reduce = True
     46 
---> 47         if size_average and reduce:
     48             ret = 'elementwise_mean'
     49         elif reduce:

RuntimeError: bool value of Tensor with more than one value is ambiguous

I haven’t changed my Dataset (except to experiment with MSELossFlat) from the original, which is:

class ImageScalarDataset(ImageDataset):
    def __init__(self, df:DataFrame, path_column:str='file_path', dependent_variable:str=None):
        super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))
        self.loss_func = layers.MSELossFlat
        self.classes = [0]

    def __len__(self)->int:
        return len(self.y)
    
    def __getitem__(self, i):
        # return x, y | where x is an image, and y is the scalar
        return open_image(self.x[i]), self.y[i]

On the plus side, I get the same error whether I use @jeremy’s or @wyquek’s loss function, so I suspect there was some other internal change that my Dataset is not handling correctly. Perhaps something about how I am manually setting self.classes = [0]?

Can you do a %debug and print the value of yb after going up three times (to the line loss = loss_func(out, *yb))? It seems from the rest of the error message that you may have a target that is a list.

> /app/fastai/fastai/basic_train.py(22)loss_batch()
     20 
     21     if not loss_func: return to_detach(out), yb[0].detach()
---> 22     loss = loss_func(out, *yb)
     23 
     24     if opt is not None:

ipdb> yb
[tensor([0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 1., 0., 0., 0., 1.,
        1., 1., 0., 1., 0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 1., 0., 1., 1.,
        1., 1., 0., 1., 1., 1., 1., 0., 1., 0., 1., 1., 1., 1., 0., 1., 0., 0.,
        1., 1., 1., 1., 1., 1., 0., 1., 0., 1., 1., 0., 1., 1., 1., 0., 0., 0.,
        0., 1., 1., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 0., 0.,
        0., 1., 0., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 1., 1., 0., 1., 1.,
        1., 0., 1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 1., 0., 1., 1., 0., 0.,
        1., 0.], device='cuda:0')]
ipdb> type(yb)
<class 'list'>

Hmm, yes indeed. But my Dataset is feeding this to the superclass:

super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))

Where the righthand argument is:

np.array(merged[trait], dtype=np.float32).shape

Which looks like:

(18906,)

No, that’s normal (there’s only one element; I thought there would be more).
Ah, I think I understand your problem: did you put loss_func = MSELossFlat() (with parentheses)? Without them, loss_func is the class itself, so loss_func(out, *yb) calls its constructor with your tensors as the size_average and reduce arguments, which is exactly where your traceback lands.

2 Likes

Oh. No, I did not!

Welp, that fixed it. Thank you. Updated code, internal notes-to-self and all:

class ImageScalarDataset(ImageDataset):
    def __init__(self, df:DataFrame, path_column:str='file_path', dependent_variable:str=None):
        
        # The superclass does nice things for us like tensorizing the numpy
        # input
        super().__init__(df[path_column], np.array(df[dependent_variable], dtype=np.float32))

        # Old FastAI uses loss_fn, new FastAI uses loss_func
        self.loss_func = layers.MSELossFlat()
        self.loss_fn = self.loss_func

        # We have only one "class" (i.e., the single output scalar)
        self.classes = [0]

    def __len__(self)->int:
        return len(self.y)
    
    def __getitem__(self, i):
        # return x, y | where x is an image, and y is the scalar
        return open_image(self.x[i]), self.y[i]
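With that change, the learner setup from earlier runs as-is (a sketch reusing the same names as above):

learn2 = ConvLearner(data,
                     tvm.resnet50,
                     loss_fn=layers.MSELossFlat(),  # an instance (note the parens), not the class
                     callback_fns=ShowGraph)
learn2.fit_one_cycle(24)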
1 Like

Hi - I have a similar use case to yours. I want to predict the age of a photo and have processed the data to have the age number as the filename. My approach was to build the image DataBunch and just use a different loss function (MSELoss rather than cross-entropy), and I hoped that would do it, but I ran into issues:

data = (ImageFileList.from_folder(path_img)
        .label_from_func(get_float_labels)
        .random_split_by_pct(valid_pct=0.2)
        .datasets()
        .transform(get_transforms(), size=224)
        .databunch(bs=bs).normalize(imagenet_stats)
       )

My custom loss class looks like this:

class MSELossFlat2(nn.MSELoss):
    "Same as `nn.MSELoss`, but flattens input and target."
    def forward(self, input:Tensor, target:Tensor) -> Rank0Tensor:
        return super().forward(input.view(-1).float(), target.view(-1).float() )

learn = create_cnn(data, models.resnet34)
learn.loss_func = MSELossFlat2()
learn.fit_one_cycle(4)

If I don’t modify my MSELoss, then I get an error about expecting a Float target and getting a Long one. If I do this, I get a mismatch error:

RuntimeError: input and target shapes do not match: input [7040], target [64] at /opt/conda/conda-bld/pytorch-nightly_1540036376816/work/aten/src/THCUNN/generic/MSECriterion.cu:12

Any tips would be appreciated. Thanks!

Did anyone work on this recently? I want to predict a scalar with an image too.

I have a dataframe with the filenames in one column and the scalar I want to predict on another column.

I can’t get the custom dataset to work. Where does the ImageDataset class come from?

Is there any other/easier way to get this to work in fastai v1? In fastai 0.7 you just had to change one argument to the ImageClassifierData factory method.

EDIT:
nvm, I was just too stupid to save my scalar as a float instead of an integer. Now the last layer has one output, as expected. It seems to work with the standard ImageDataBunch.from_df().
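For anyone else landing here, a minimal sketch of that approach (untested; the file and column names are hypothetical):

import pandas as pd
from fastai.vision import *

df = pd.read_csv('labels.csv')           # hypothetical: 'name' = image path, 'age' = scalar
df['age'] = df['age'].astype('float32')  # float labels -> regression, single model output

data = ImageDataBunch.from_df(path_img, df, fn_col='name', label_col='age',
                              ds_tfms=get_transforms(), size=224, bs=64)
learn = create_cnn(data, models.resnet34)
learn.loss_func = MSELossFlat()          # flattens [bs,1] preds to match [bs] targets
learn.fit_one_cycle(4)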