Recognising overfitting in tabular models with a continuous dependent variable

Hi all,

As part of a university project, I have been attempting to quantify overfitting in models trained with different parameter values. One issue I have run into since starting to work with tabular models is how to determine whether a model is overfitting and at which epoch this begins. Previously, for vision problems, I compared the accuracy between epochs. However, my dependent variable is now a continuous value, and I’m aware that the accuracy metric does not work in this case. The only answers I have found so far suggest comparing the accuracy on the training set and the test set to detect overfitting.
I have been considering comparing the mean absolute error (MAE) between epochs, or comparing the MAE on the training set with the MAE on the test set, but I’m unsure whether this is the correct approach.
Any help with the matter would be greatly appreciated!

First, as you mentioned, monitor the curves of the training and validation/test loss to spot overfitting. In most cases this is enough to reach a conclusion.

Second, you can look at the R2 value. Since R-squared intuitively tells you the proportion of variance explained, the R2 value on the validation set will drop if your model starts to overfit, which can give you some idea of when overfitting sets in.
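To make the MAE and R2 comparison concrete, here is a minimal sketch in plain Python. The function names `mae` and `r2` and all of the numbers are made up for illustration; they do not come from any particular library or from the project discussed in this thread. The point is only that a large gap between the training and validation values of either metric hints at overfitting.

```python
def mae(y_true, y_pred):
    """Mean absolute error: the average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions, purely for illustration.
train_true = [1.0, 2.0, 3.0, 4.0]
train_pred = [1.1, 1.9, 3.0, 4.1]   # close fit on the training data
valid_true = [1.5, 2.5, 3.5, 4.5]
valid_pred = [2.5, 1.5, 4.5, 3.5]   # much poorer fit on held-out data

train_mae, valid_mae = mae(train_true, train_pred), mae(valid_true, valid_pred)
train_r2, valid_r2 = r2(train_true, train_pred), r2(valid_true, valid_pred)

print(f"MAE  train={train_mae:.3f}  valid={valid_mae:.3f}")
print(f"R2   train={train_r2:.3f}  valid={valid_r2:.3f}")
```

In practice you would compute these after every epoch, so you can see when the validation values stop tracking the training values.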

Wouldn’t comparing the training and validation loss only indicate whether the model was underfitting or not?

I would have thought that if the model was beginning to overfit, the R2 value would be very high, because the regression line matches the trends of the variables too closely for the data it is trained on.

No. I would recommend Leslie Smith’s report for visualizations of this. The idea is that if the validation loss decreases and then starts to increase, that is a sign of overfitting. The details are in the report.
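The "decreases and then increases" pattern can be checked programmatically. The sketch below is one possible way to do it, not something taken from the report: the helper name `best_epoch_before_overfitting`, the `patience` parameter, and the loss values are all assumptions made for illustration. It returns the (zero-indexed) epoch with the lowest validation loss once the loss has failed to improve for `patience` epochs in a row.

```python
def best_epoch_before_overfitting(valid_losses, patience=2):
    """Return the epoch with the lowest validation loss, provided the loss then
    fails to improve for `patience` consecutive epochs; otherwise return None."""
    best_epoch = 0
    best_loss = valid_losses[0]
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses[1:], start=1):
        if loss < best_loss:
            # New minimum: remember it and reset the counter.
            best_epoch, best_loss = epoch, loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch
    return None  # the validation loss never turned upwards for long enough


# Hypothetical per-epoch losses: training loss keeps falling,
# validation loss bottoms out and then rises again.
train_losses = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18]
valid_losses = [0.95, 0.70, 0.58, 0.55, 0.57, 0.61, 0.66]

turning_point = best_epoch_before_overfitting(valid_losses)
print("Validation loss was lowest at epoch:", turning_point)
```

A larger `patience` makes the check less sensitive to noisy epochs, at the cost of flagging the turning point later.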


Ah, I found the paper and gave it a read; I see your point now. Thank you so much for your help. That’s given me a good number of ideas and will be a fantastic resource for my dissertation write-up!
