Deal with overfitting of tabular model?

bpisaacoff · September 9, 2019, 1:06am

I’m working on predicting sports matches using the tabular learner, and I’m struggling with overfitting. Because I’m using historical matches, I can’t get more data: I’m already using every match that has occurred. I’m not sure how to implement augmentation for this kind of tabular data, so if you have suggestions, I’m all ears. I’m pretty sure that I’m using a general architecture. In this notebook you can see the grid search I ran on regularization parameters, yet I never hit a combination that wasn’t overfitting. I’ve also tried reducing the complexity a lot from what I initially used, which was based on the Rossmann architecture. I really thought I needed a more complex architecture because my dataset is much bigger: about 350 parameters and 30k examples.

I would greatly appreciate any suggestions for how to deal with this overfitting. Thanks a lot!

harikrishnanrajeev · October 23, 2020, 2:40pm

@bpisaacoff, hope you are doing great.

Were you able to overcome your overfitting issues with tabular data? I am dealing with a similar issue, and any insights or best practices would be greatly appreciated.
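
On the augmentation question from the first post, one possible approach for match data is "mirroring": swapping the two sides of each match and flipping the outcome, which doubles the training rows. This is only a rough sketch under assumptions that are not from the thread: each row is a dict, side-specific columns carry the hypothetical prefixes `a_` and `b_`, the target is a binary `label` (1 if side A won), and the ordering of the two sides is arbitrary (no home advantage is encoded in which side is A).

```python
def mirror_match(row, label_key="label"):
    """Return a copy of `row` with sides A and B swapped and the label flipped.

    Assumes (hypothetically) that side-specific columns are prefixed with
    "a_" / "b_" and that row[label_key] is 1 if side A won, else 0.
    """
    mirrored = {}
    for key, value in row.items():
        if key.startswith("a_"):
            mirrored["b_" + key[2:]] = value   # A's feature becomes B's
        elif key.startswith("b_"):
            mirrored["a_" + key[2:]] = value   # B's feature becomes A's
        else:
            mirrored[key] = value              # shared columns are unchanged
    mirrored[label_key] = 1 - row[label_key]   # flip the binary outcome
    return mirrored


def augment_training_rows(rows, label_key="label"):
    """Return the original rows followed by their mirrored copies."""
    return list(rows) + [mirror_match(r, label_key) for r in rows]


example = {"a_rating": 1500, "b_rating": 1400, "venue": "neutral", "label": 1}
augmented = augment_training_rows([example])
```

If something like this is used, it should probably be applied to the training rows only, after the train/validation split, so that a match and its mirror never end up on opposite sides of the split.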