I’m working on predicting sports matches using the tabular learner, and I’m struggling with overfitting. Because I’m using historical matches, I can’t get more data: I’m already using every match that has occurred. I’m not sure how to implement augmentation for this kind of tabular data, so if you have suggestions I’m all ears. I’m pretty sure that I’m using a general architecture. In this notebook you can see the grid search I ran on the regularization parameters, yet I never hit a combination that wasn’t overfitting. I’ve reduced the complexity a lot from what I was initially using, which was based on the Rossmann architecture. I originally thought I needed a more complex architecture because my dataset is much bigger: about 350 parameters and 30k examples.
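To make the regularization search concrete, here is a minimal sketch of how such a grid can be organized. It is not the notebook code: the parameter names `wd` (weight decay) and `ps` (dropout probability), their values, and the `train_and_eval` callback are all hypothetical placeholders, and the fake callback returns invented numbers purely so the example runs. In practice the callback would train the tabular model and return its real training and validation losses.

```python
from itertools import product


def grid_search(train_and_eval, grid):
    """Call train_and_eval once for every combination of values in grid.

    train_and_eval is a hypothetical callback that accepts the grid's keys
    as keyword arguments and returns (train_loss, valid_loss).
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        train_loss, valid_loss = train_and_eval(**params)
        results.append({
            "params": params,
            "train_loss": train_loss,
            "valid_loss": valid_loss,
            # A large positive gap is the usual sign of overfitting.
            "gap": valid_loss - train_loss,
        })
    # Lowest validation loss first.
    return sorted(results, key=lambda r: r["valid_loss"])


def fake_train_and_eval(wd, ps):
    # Placeholder with invented numbers; replace with real training code.
    train_loss = 0.30 + wd + 0.2 * ps
    valid_loss = 0.60 - 0.5 * wd - 0.1 * ps
    return train_loss, valid_loss


# Hypothetical grid values, chosen only for illustration.
grid = {"wd": [0.01, 0.1, 0.2], "ps": [0.1, 0.5]}
results = grid_search(fake_train_and_eval, grid)

for r in results:
    print(r["params"], f"valid={r['valid_loss']:.3f}", f"gap={r['gap']:+.3f}")
```

Recording the train/validation gap next to the validation loss makes it easier to see whether a given combination actually reduces overfitting or just makes both losses worse.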
I would greatly appreciate any suggestions for how to deal with this overfitting. Thanks a lot!