DonorsChoose Kaggle Competition

kevindewalt · March 6, 2018, 11:59am

Anyone working on this competition? Looks like an ideal problem for applying the NLP text-classification transfer learning techniques @jeremy mentioned in Part 1. Also very similar to some of our client projects.

I’m planning on jumping into it in a few weeks once I get through some deadlines.

digitalspecialists · March 6, 2018, 12:23pm

I’ve built a language model for the text. Next step is to try and classify against it.
Edit: a first pass at classifying with fastai and torchtext managed 0.72 on a single run. Now to find time to run a full CV set and further tune the language model.
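
For anyone wondering what a "full CV set" could look like here, below is a minimal sketch of a stratified k-fold split in plain Python. It is only an illustration, not the setup used for the run above: the function names, the fold count, and the seed are all assumptions, and in practice a library splitter would do the same job.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each example index to one of n_folds folds, class by class.

    Hypothetical helper: keeps the class ratio roughly equal across folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class out round-robin.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def train_valid_splits(labels, n_folds=5, seed=0):
    """Yield (train_indices, valid_indices) once per fold."""
    folds = stratified_folds(labels, n_folds=n_folds, seed=seed)
    for k in range(n_folds):
        valid = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, valid
```

Each (train, valid) pair would then get its own classifier run, with the validation scores averaged to give a steadier estimate than a single run.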

memetzgz · March 7, 2018, 12:59am

I plan to join this in the next week or so – looks like a good one!

daveluo · March 15, 2018, 9:42pm

Hi all,

I’m also curious how effective the fastai language model + transfer learning approach is for text classification, especially compared to the more traditional approaches presented in the Kaggle kernels.

I trained a language model (on one third of the application essays corpus, to save myself some time on this test run) and put it through the default MultiBatchRNN-PoolingLinearClassifier architecture and settings from the IMDB lesson.

Got a 0.766 public leaderboard score on this first test (in line with the local validation score of 0.768). Pretty impressed so far, especially given I’ve done zero feature engineering (other than spaCy tokenization) and only used the text of the 4 essays + project_resource_summary as inputs.
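
A rough sketch of how the five text inputs might be merged into one string per application before tokenization. This is an illustration under assumptions, not necessarily the exact preprocessing used for the score above: the essay column names (project_essay_1 to project_essay_4) and the `xfld` field-marker token are assumed, and `combine_text` is a hypothetical helper.

```python
# Assumed column names for the four essays; project_resource_summary
# is the only field name taken from the post itself.
TEXT_FIELDS = [f"project_essay_{i}" for i in range(1, 5)] + [
    "project_resource_summary"
]

# Assumed marker token that tells the model where each field starts.
FIELD_MARKER = "xfld"


def combine_text(row):
    """Join the text fields of one application into a single string.

    `row` is any mapping from column name to text. Missing or empty
    fields are skipped, but the field numbers of the others are kept.
    """
    parts = []
    for number, name in enumerate(TEXT_FIELDS, start=1):
        value = row.get(name)
        text = "" if value is None else str(value).strip()
        if text:
            parts.append(f"{FIELD_MARKER} {number} {text}")
    return " ".join(parts)


example = {
    "project_essay_1": "We need books.",
    "project_essay_2": " My students love reading. ",
    "project_essay_3": None,
    "project_resource_summary": "30 paperback novels",
}
combined = combine_text(example)
```

Keeping a numbered marker in front of each field lets the classifier distinguish an essay from the resource summary even though everything arrives as one token stream.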