Share your work here ✅

After lesson 4, I was really keen to try building my own NLP / tabular / collaborative filtering model.

As luck would have it, I got an email from Kaggle today announcing a new competition with tabular data. The competition requires you to use tabular data to predict whether Santander bank customers will buy products in the future. This sounded quite similar to the tabular example from the lecture, so it seemed like a good problem to try out.

Technical details:

  • I built a tabular model that reached ~91.6% accuracy on my validation set (trained for 4 epochs with a max learning rate of 5e-3)
  • One challenge I ran into was that Kaggle scores this competition with the AUC-ROC metric (AUC = Area Under the Curve, ROC = Receiver Operating Characteristic). I tried to add this metric myself, and did a bit of Googling to find usable code, but wasn’t able to get it to work
  • I submitted my model's predictions to Kaggle and scored 0.862, which put me at position 678/1067 on the leaderboard. I might come back to this model in the future after I learn some more optimization techniques (e.g. I don’t know exactly what the ‘layers’ input does when you create a TabularLearner, so I just put in the same [200,100] value from the lecture)
  • Code for the model is available on GitHub
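
For anyone stuck on the same AUC-ROC problem, here is a minimal sketch of how the metric could be computed by hand on validation predictions. It is not the code from my notebook and it is not wired into fastai's metrics; the function name `roc_auc` and the example arrays are made up for illustration. It uses the rank-based form of AUC (the fraction of positive/negative pairs where the positive example gets the higher score, with ties counted as half) and assumes you already have the true 0/1 labels and the predicted probability of the positive class as arrays.

```python
import numpy as np

def roc_auc(y_true, y_score):
    """Area under the ROC curve via average ranks (hypothetical helper).

    y_true:  0/1 labels
    y_score: predicted scores or probabilities for the positive class
    """
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)

    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    # Sort the scores and assign 1-based ranks; tied scores share their mean rank.
    order = np.argsort(y_score, kind="mergesort")
    sorted_scores = y_score[order]
    _, first_idx, counts = np.unique(
        sorted_scores, return_index=True, return_counts=True
    )
    avg_rank = first_idx + (counts + 1) / 2.0
    ranks = np.empty(y_score.size, dtype=float)
    ranks[order] = np.repeat(avg_rank, counts)

    # Mann-Whitney U statistic for the positives, normalised by the pair count.
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Made-up example values, just to show the call.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

One thing to watch: AUC depends on the ranking across the whole validation set, so averaging a per-batch AUC will generally not match the value computed on all predictions at once. That is why this sketch takes the full arrays rather than one batch at a time.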