I have a regression problem with a tabular data. Hence, the loss function which I am using is MSELossFlat and the metric is mean_squared_error. These should obviously give the same result on the validation set. However, it is not the case. The output is:
Ok, it seems that there is a bug when the metric is computed. The metric is computed in mean_squared_error function in the metrics.py file. The shape of the predictions is (bs,1) and the shape of the targets is (bs,). Because of the broadcasting the computation of MSE becomes incorrect.
To solve the problem it is enough to use .squeeze() for the prediction tensor, however other metric functions may also suffer from the same bug. Maybe more general solution should be developed.
I ran into the same issue with the mismatched prediction and target shapes while trying to run a mean absolute error. The broadcasting caused it to compare each target to every prediction then take the average which gave some really high error rates (essentially equivalent to guessing the average target for all values). I wrote a new function that used .view() to reshaped the tensors so they were consistent and that seemed to work. I’ll have to give squeeze a try.
Is the bug fixed? If not, could you please tell me which part of the source code I need to revise to use .squeeze() ? I am having the same issue. Thanks.