Hi there. I'm trying out my skills in the https://www.kaggle.com/c/data-science-bowl-2019 competition.
I'm kind of stuck. One thing I noticed is that there are no class 1 or 2 predictions for the test set. I've been adding features, but that hasn't improved the score much.
I see that in the train data classes 1 and 2 appear about half as often as the others, but I doubt the imbalance alone should cause something that severe. My intuition is that it's something with the loss function, so I want to try label smoothing or mixup next.
https://www.kaggle.com/manyregression/fastai-2019-data-science-bowl?scriptVersionId=24808940
I think that’s because it’s a regression problem, not a classification one.
So I improved my score by switching to regression.
Now I want the kappa metric, and my idea is to compute it by passing the predictions through a rounder:
https://www.kaggle.com/manyregression/fastai-2019-data-science-bowl?scriptVersionId=25008584#KappaScoreRegression
But I don’t understand the implementation completely. While I work through it, could anyone please point me to the key part — where is a predicted number converted to a class?
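In the usual threshold-based rounders for this competition, the conversion happens where the continuous predictions are cut at a set of thresholds — typically an `np.digitize` call (or equivalent comparisons). A minimal sketch with fixed, assumed thresholds and a hand-rolled quadratic weighted kappa (the notebook's actual rounder optimizes the thresholds instead of hard-coding them):

```python
import numpy as np

def to_class(preds, thresholds=(0.5, 1.5, 2.5)):
    # np.digitize counts how many thresholds each prediction exceeds,
    # mapping a continuous prediction onto an ordinal class 0..3.
    # This is the line where "a number becomes a class".
    return np.digitize(preds, thresholds)

def quadratic_weighted_kappa(y_true, y_pred, n_classes=4):
    # Observed agreement as a confusion matrix
    O = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        O[t, p] += 1
    # Quadratic disagreement weights: penalty grows with (i - j)^2
    idx = np.arange(n_classes)
    W = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2
    # Expected matrix under independence of the two marginals
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1 - (W * O).sum() / (W * E).sum()

preds = np.array([0.4, 1.2, 2.6, 0.1, 2.4])
classes = to_class(preds)  # continuous scores -> ordinal classes
truth = np.array([0, 1, 3, 0, 3])
qwk = quadratic_weighted_kappa(truth, classes)
```

sklearn's `cohen_kappa_score(y_true, y_pred, weights="quadratic")` gives the same metric if you'd rather not roll your own.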
The competition ended recently, and I finished in the top 32%, jumping up about 1k places.
My solution is very simple: https://www.kaggle.com/manyregression/private-fastai-2019-data-science-bowl Most solutions use ensembles of models, but I'm not interested in that. I aim for practicality and simplicity, and I think a top-32% finish shows that approach works.
I spent most of the time figuring out what works and what doesn't: features, parameters, additional data. What helped the most was a good validation set.
At the end, I removed each feature whose removal kept the score the same or improved it.
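That pruning pass can be sketched as a greedy backward elimination. `score_fn` below is a hypothetical stand-in for whatever cross-validation score you use, not code from my notebook:

```python
def prune_features(features, score_fn):
    """Greedy backward elimination: drop a feature whenever the
    score without it is the same or better."""
    kept = list(features)
    for f in list(kept):
        trial = [x for x in kept if x != f]
        # Keep the drop only if it doesn't hurt the validation score
        if score_fn(trial) >= score_fn(kept):
            kept = trial
    return kept

# Toy usage: only 'a' and 'c' carry signal; extra features cost a tiny penalty
def toy_score(feats):
    return len(set(feats) & {"a", "c"}) - 0.01 * len(feats)

selected = prune_features(["a", "b", "c", "d"], toy_score)
```

One pass like this is cheap but order-dependent; with a slow model you'd cache `score_fn` results rather than rescoring `kept` each iteration.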
Interestingly, if I recall correctly, the winner used a transformer on text generated from the features.