How do I improve my Tabular Models on Fastai

ikey001 · August 13, 2019, 10:20am

Hi guys,

I need pointers to resources or parts of the video lectures where I can learn how to improve my tabular model.

For context, I spent much of the weekend trying to build my very first model for this Kaggle competition, using only what I’ve learnt from Lessons 1-4 and some parts of a feature engineering book. I’ve also gone through different kernels and picked up some ideas about how to clean up data.

For this particular competition, so far I’ve managed to:

Remove columns that are empty, mostly empty, or contain just one unique value
Re-sample the data so that I have a balanced sample of both classes
Use tabular_learner to train the model to up to 84.xx% accuracy
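
For reference, the first two steps look roughly like the pandas sketch below. This is only an illustration: the `target` column name, the 0.9 missing-value threshold, the toy data, and both helper names are placeholders rather than the actual competition schema, and balancing by downsampling the larger class is an assumption (upsampling the smaller class is the other common option).

```python
import pandas as pd

def drop_uninformative_columns(df, max_missing=0.9):
    """Drop columns that are mostly empty or hold at most one unique value."""
    missing_frac = df.isna().mean()  # fraction of missing values per column
    to_drop = [
        col for col in df.columns
        # nunique() ignores NaN, so an all-NaN column counts as 0 unique values
        if missing_frac[col] > max_missing or df[col].nunique() <= 1
    ]
    return df.drop(columns=to_drop)

def downsample_to_balance(df, target, seed=42):
    """Randomly downsample every class to the size of the smallest class."""
    n_min = df[target].value_counts().min()
    parts = [grp.sample(n=n_min, random_state=seed) for _, grp in df.groupby(target)]
    # Shuffle so the classes are not stored in contiguous blocks
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)

# Toy data standing in for the real training set (hypothetical columns)
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6],
    "empty": [None] * 6,
    "const": ["x"] * 6,
    "target": [0, 0, 0, 0, 1, 1],
})
clean = drop_uninformative_columns(df)
balanced = downsample_to_balance(clean, "target")
```

Only the training split should be balanced this way; the validation set is better left with its original class distribution so the accuracy number stays meaningful.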

So I’ve hit the limit of what I know. I’ve seen a few kernels about feature engineering and generating new features from the dataset, but I imagined that the databunch would handle these by itself.

Can someone please point me to what else I can do to improve the model?

(I imagine I will be able to get some more out of it if I learnt how to visualize and interpret the data better, and I’m already reading books on that, I just want to know if there’s anything else I can be doing with the fastai library)

muellerzr · August 13, 2019, 11:15am

Go through and do as much feature engineering as you can; that will absolutely help. Using feature importance to get rid of harmful variables can help too, as can dropout that you decrease slowly, or weight decay. To my knowledge, that’s mostly what you can do.

ikey001 · August 13, 2019, 11:29am

Thanks @muellerzr.

I’ll try dropout and weight decay. I’ve tried removing features and I’ve sort of peaked. I saw a kernel about feature importance using RFECV; is there an equivalent of that in fastai?

muellerzr · August 13, 2019, 11:33am

There’s a permutation importance thread on the forum (I can’t find the link right now since I’m only on mobile, but with a little digging you should be able to find it).
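
The idea behind permutation importance is simple enough to sketch by hand: shuffle one column of the validation set at a time and measure how much the metric drops. The NumPy sketch below is a generic illustration, not the code from that forum thread. `permutation_importance`, `accuracy`, and `toy_predict` are hypothetical names, and `predict` stands in for whatever function wraps your trained model.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return float(np.mean(y_true == y_pred))

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each column of X is shuffled on its own."""
    rng = np.random.default_rng(seed)
    baseline = accuracy(y, predict(X))
    importances = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between this column and the label
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            drops.append(baseline - accuracy(y, predict(X_perm)))
        importances[col] = np.mean(drops)
    return importances

# Toy check: the label depends only on column 0; column 1 is pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = (X[:, 0] > 0).astype(int)

def toy_predict(X):
    """Stand-in for a trained model that only looks at column 0."""
    return (X[:, 0] > 0).astype(int)

imp = permutation_importance(toy_predict, X, y)
```

Columns whose importance is near zero or negative are candidates for removal. With a fastai tabular model, the same loop would shuffle a column in the validation DataFrame and re-run prediction; how exactly to wrap that depends on the fastai version.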