Tabular data, regression problem, MSE loss and mean_squared_error mismatch

Hello,

I have a regression problem with tabular data, so the loss function I am using is MSELossFlat and the metric is mean_squared_error. These should obviously give the same result on the validation set, but that is not the case. The output is:

| epoch | train_loss | valid_loss | mean_squared_error |
|-------|------------|------------|--------------------|
| 1     | 0.047028   | 0.008886   | 2.268168           |
| 2     | 0.041831   | 0.034416   | 2.090547           |
| 3     | 0.030867   | 0.007301   | 2.322598           |
| 4     | 0.038007   | 0.012626   | 2.236360           |
| 5     | 0.033619   | 0.010785   | 2.593846           |

So valid_loss and the mean_squared_error metric on the validation set are very different. What could be the reason for that? Could it be a bug?

This experiment is easily reproducible by running tabular.ipynb from the examples folder and modifying two cells. I modify the third cell as:

from sklearn import preprocessing
#df['salary'].unique()
# replace the categorical 'salary' target with a standardized continuous value
df['salary'] = preprocessing.StandardScaler().fit_transform((df['age'] + df['fnlwgt']).values.reshape(-1, 1))

In the cell with the learner definition I pass the loss and the metric explicitly, and also change the number of epochs:

from fastai.layers import MSELossFlat
from fastai.metrics import mean_squared_error

learn = tabular_learner(data, layers=[200,100], loss_func=MSELossFlat(), metrics=mean_squared_error)
learn.fit(5, 1e-2)

Do you have any thoughts on this behaviour?

Thanks,
Alex.

OK, it seems there is a bug in how the metric is computed. The metric is computed by the mean_squared_error function in metrics.py. The shape of the predictions is (bs, 1), while the shape of the targets is (bs,). Because of broadcasting, the difference between them has shape (bs, bs), so the MSE computation becomes incorrect.
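
To illustrate the effect, here is a small standalone snippet (plain PyTorch, independent of the fastai code itself):

```python
import torch

bs = 4
preds = torch.randn(bs, 1)   # model output, shape (bs, 1)
targs = torch.randn(bs)      # targets from the dataloader, shape (bs,)

# The subtraction broadcasts to (bs, bs): every prediction is compared
# with every target, so the mean runs over bs*bs pairs instead of bs.
print((preds - targs).shape)                              # torch.Size([4, 4])
broadcast_mse = ((preds - targs) ** 2).mean()             # what the metric effectively computes
correct_mse = ((preds.squeeze(-1) - targs) ** 2).mean()   # intended element-wise MSE
print(broadcast_mse.item(), correct_mse.item())
```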

To solve the problem it is enough to call .squeeze() on the prediction tensor; however, other metric functions may suffer from the same bug, so perhaps a more general solution should be developed.
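
A minimal sketch of what I mean (my own patched version, not necessarily how the library will fix it):

```python
import torch.nn.functional as F
from torch import Tensor

def mean_squared_error_fixed(pred: Tensor, targ: Tensor) -> Tensor:
    "MSE metric that squeezes the (bs, 1) predictions down to (bs,) before comparing."
    return F.mse_loss(pred.squeeze(-1), targ)
```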


I ran into the same issue with mismatched prediction and target shapes while trying to compute a mean absolute error metric. The broadcasting caused it to compare each target to every prediction and then take the average, which gave some really high error values (essentially equivalent to guessing the average target for every example). I wrote a new function that uses .view() to reshape the tensors so they are consistent, and that seemed to work; see the sketch below. I’ll have to give squeeze a try.
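
For reference, this is roughly the shape of the function I used (my own helper, not part of the library):

```python
from torch import Tensor

def mae_flat(pred: Tensor, targ: Tensor) -> Tensor:
    "Mean absolute error computed after reshaping both tensors to 1D with .view(-1)."
    return (pred.view(-1) - targ.view(-1)).abs().mean()
```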


Thanks for flagging, I’ll fix this for all metrics that need it sometime soon.


Thanks, @sgugger. I have also created an issue on GitHub.

Is the bug fixed? If not, could you please tell me which part of the source code I need to revise to use .squeeze()? I am having the same issue. Thanks.