Anyone working on this competition? Looks like an ideal problem for applying the NLP text-classification transfer learning techniques @jeremy mentioned in Part 1. Also very similar to some of our client projects.
I’m planning on jumping into it in a few weeks once I get through some deadlines.
I’ve built a language model for the text. Next step is to try training a classifier on top of it.
Edit: my first pass at classification using fastai and torchtext managed 0.72 on a single run. Now to find time to run a full set of CV folds and further tune the language model.
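For anyone wanting to do the same, here is a minimal sketch of how the CV folds could be laid out, assuming binary labels and plain Python lists. The helpers `stratified_folds` and `cv_splits` are hypothetical names for illustration, not part of fastai or torchtext; a library splitter would do the same job.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds=5, seed=42):
    """Assign example indices to folds so each fold has a similar class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a similar share of the class.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds

def cv_splits(labels, n_folds=5, seed=42):
    """Yield (train_indices, valid_indices) pairs, one per fold."""
    folds = stratified_folds(labels, n_folds, seed)
    for k in range(n_folds):
        valid = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, valid

# Toy labels (made up for illustration): 15 positives and 5 negatives.
toy_labels = [1] * 15 + [0] * 5
for train_idx, valid_idx in cv_splits(toy_labels, n_folds=5):
    pass  # train the classifier on train_idx, score it on valid_idx
```

Averaging the validation score over the folds should give a steadier number than a single run.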
I plan to join this in the next week or so – looks like a good one!
I’m also curious how effective the fastai language model + transfer learning approach is for text classification, especially compared to the more traditional approaches presented in the Kaggle kernels.
I trained a language model (on a third of the application essays corpus to save myself some time on this test run) and put it through the default MultiBatchRNN-PoolingLinearClassifier architecture and settings from the IMDB lesson.
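As I understand it, the pooling head in that architecture feeds the linear layers a concatenation of the last hidden state with the max-pooled and mean-pooled hidden states over all timesteps. A plain-Python sketch of that idea, not the fastai code itself (`concat_pool` is a hypothetical helper and the numbers are made up):

```python
def concat_pool(hidden_states):
    """Concatenate the last hidden state with max and mean pools over time.

    hidden_states: list of per-timestep vectors (lists of floats),
    all of the same length.
    """
    last = hidden_states[-1]
    # Transpose so each entry holds one hidden unit's values across timesteps.
    columns = list(zip(*hidden_states))
    max_pool = [max(col) for col in columns]
    mean_pool = [sum(col) / len(col) for col in columns]
    return list(last) + max_pool + mean_pool

# Three timesteps, two hidden units (toy values).
features = concat_pool([[1.0, -2.0], [3.0, 0.0], [2.0, 4.0]])
```

The result is three times the hidden size, so signal from early in a long essay can still reach the classifier instead of depending only on the final state.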
Got a 0.766 public leaderboard score on this first test (in line with my local validation score of 0.768). Pretty impressed so far, especially given I’ve done zero feature engineering (other than spaCy tokenization) and only used the text of the 4 essays + project_resource_summary as inputs.
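A minimal sketch of how the five text fields could be merged into one input string per application, under some assumptions: the essay column names `project_essay_1` to `project_essay_4` and the `xfld` field marker are my guesses for illustration (only `project_resource_summary` is named above), and `combine_text` is a hypothetical helper.

```python
# Assumed column names for the four essays; adjust to the actual data.
ESSAY_FIELDS = [f"project_essay_{i}" for i in range(1, 5)]
TEXT_FIELDS = ESSAY_FIELDS + ["project_resource_summary"]

def combine_text(row, fields=TEXT_FIELDS, marker="xfld"):
    """Join the non-empty text fields of one row into a single string.

    Each field is prefixed with a marker and its position so the model
    can tell where one field ends and the next begins.
    """
    parts = []
    for number, field in enumerate(fields, start=1):
        value = row.get(field)
        if value is None or not str(value).strip():
            continue  # skip missing or blank fields
        parts.append(f"{marker} {number} {str(value).strip()}")
    return " ".join(parts)

# Made-up example row.
example_row = {
    "project_essay_1": "My students love reading.",
    "project_essay_2": " ",
    "project_essay_3": None,
    "project_resource_summary": "We need 30 books.",
}
combined = combine_text(example_row)
```

The combined string can then go through tokenization like any single document.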