So I have been searching for a way to combine and train on tabular + text data with all the good stuff from fastai (databunch API, 1cycle training, callbacks …) to do some Kaggle competitions, and came across this awesome blog post by @wgpubs in which he created a tabular + text databunch using the databunch API. I changed a few lines of code to get it to work on fastai 1.0.51+ and then built a custom model that combines an RNN/LSTM (for the text) with an MLP (fully connected layers, for the tabular features) for end-to-end training.
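To give a rough idea of what "combining" can look like, here is a minimal sketch in plain PyTorch. It is not the code from the repo, just one common way to do it: run the text through an LSTM, run the tabular features through a small MLP, concatenate the two feature vectors, and feed them to a shared head. All names (`TabularTextModel`, `n_cont`, `tab_hidden`, the layer sizes) are made up for illustration, and it assumes continuous tabular features only (no categorical embeddings) and token id 0 as padding.

```python
import torch
import torch.nn as nn

class TabularTextModel(nn.Module):
    """Hypothetical sketch: LSTM text branch + MLP tabular branch, joined by concatenation."""

    def __init__(self, vocab_size, n_cont, emb_dim=32, hidden=64, tab_hidden=32, out_dim=1):
        super().__init__()
        # Text branch: token embedding followed by a single-layer LSTM.
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        # Tabular branch: one fully connected layer over the continuous features.
        self.tab_mlp = nn.Sequential(nn.Linear(n_cont, tab_hidden), nn.ReLU())
        # Shared head over the concatenated text + tabular features.
        self.head = nn.Sequential(
            nn.Linear(hidden + tab_hidden, 64),
            nn.ReLU(),
            nn.Linear(64, out_dim),
        )

    def forward(self, x_cont, x_text):
        emb = self.embedding(x_text)        # (batch, seq_len, emb_dim)
        _, (h_n, _) = self.lstm(emb)        # h_n: (num_layers, batch, hidden)
        text_feat = h_n[-1]                 # last layer's final hidden state: (batch, hidden)
        tab_feat = self.tab_mlp(x_cont)     # (batch, tab_hidden)
        combined = torch.cat([text_feat, tab_feat], dim=1)
        return self.head(combined)          # (batch, out_dim)

# Dummy batch: 4 rows, 5 continuous features, 12 token ids per row.
torch.manual_seed(0)
model = TabularTextModel(vocab_size=100, n_cont=5)
x_cont = torch.randn(4, 5)
x_text = torch.randint(1, 100, (4, 12))
preds = model(x_cont, x_text)
print(preds.shape)
```

Because both branches feed one head, a single loss backpropagates through the LSTM and the MLP together, which is what makes the training end-to-end.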
I tested it on the Kaggle Mercari competition: it trained successfully and the loss did go down, but very slowly, and the overall result is kind of underwhelming (middle of the leaderboard). I am planning to debug this next week.
Anyway, I think it works, and since lots of folks have been asking for a mixed databunch + model, I hope this can be a starting point. Here are the code and notebook: https://github.com/anhquan0412/fastai-tabular-text-demo
I’d love it if someone could test it on other datasets to see whether combining the models the way I did is effective. Thanks!