Summary of the 12/07/2019 Kaggle meetup
We had an interesting and wide-ranging discussion. There were 12 attendees. Here is my recollection; please correct me or add anything important that I missed. Apologies in advance if I mis-attributed anyone’s comments or contributions!
Yuri spoke about a kernel he has been working with, by Anna Novikova: https://github.com/anovik/Kaggle-Aptos-2019-Blindness-Detection
Mehul spoke about the EfficientNetB4 FastAI - Blindness Detection kernel: https://www.kaggle.com/hmendonca/efficientnetb4-fastai-blindness-detection
We talked a bit about using SMOTE (Synthetic Minority Over-sampling Technique, implemented in the imbalanced-learn package) for imbalanced data sets; see the sketch just below.
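For anyone who wants to try it, here is a minimal sketch of SMOTE on a synthetic tabular data set. The data set, class weights, and random seeds are illustrative assumptions, not anything we ran at the meetup; note also that SMOTE operates on feature vectors, so for raw images you would typically oversample or augment instead.

```python
# Minimal SMOTE sketch using imbalanced-learn on a synthetic data set.
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Build an artificial 2-class data set with a 9:1 class imbalance
# (the sizes and imbalance ratio here are illustrative).
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    weights=[0.9, 0.1],
    random_state=42,
)
print("Before:", Counter(y))  # roughly {0: 900, 1: 100}

# SMOTE synthesizes new minority-class samples by interpolating
# between a minority sample and its nearest minority-class neighbors.
X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X, y)
print("After:", Counter(y_resampled))  # classes are now balanced
```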
Srinivas mentioned a very similar Kaggle competition from the past that had 30,000 images, compared to the 3,000 in this competition, and suggested that we look at that competition for ideas applicable to this one.
Someone asked why we normalize to the ImageNet statistics rather than to the training data itself. I said that this is because we are doing transfer learning with weights pretrained on ImageNet, but that it would be interesting to try both ways to verify.
Mehul had the idea to compile a list of experiments that we could do.
Experiment #1: Normalize with ImageNet statistics vs. normalize with training-set statistics (see the sketch below).
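Here is a rough sketch of what Experiment #1 could look like with torchvision. The ImageNet mean/std values are the standard ones published for ImageNet-pretrained models; `train_images` is a hypothetical tensor standing in for our training data, not something from the competition kernels.

```python
# Sketch of Experiment #1: ImageNet normalization vs. training-set normalization.
import torch
from torchvision import transforms

# Option A: normalize with ImageNet statistics (matches the pretrained weights).
imagenet_norm = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
)

# Option B: normalize with statistics computed from our own training set.
def dataset_stats(train_images: torch.Tensor):
    # Per-channel mean/std over all images and pixels;
    # expects a (N, 3, H, W) tensor scaled to [0, 1].
    mean = train_images.mean(dim=(0, 2, 3))
    std = train_images.std(dim=(0, 2, 3))
    return mean.tolist(), std.tolist()

# mean, std = dataset_stats(train_images)  # train_images is hypothetical
# trainset_norm = transforms.Normalize(mean=mean, std=std)
```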
Srinivas spoke about Focal Loss, an idea developed at Facebook AI Research in 2017: https://arxiv.org/abs/1708.02002. Focal Loss is a modified cross-entropy loss that ‘focuses’ the model on hard examples by dynamically down-weighting the loss for easy examples; here, hard/easy refer to examples the model classifies with low/high confidence. Focal Loss is helpful for class-imbalanced problems.
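As a concrete illustration (not code from the meetup), here is a minimal PyTorch sketch of focal loss for multi-class classification. The alpha class-weighting term from the paper is omitted for brevity, and gamma=2 is the paper’s default.

```python
# Minimal focal loss sketch: FL(p_t) = -(1 - p_t)**gamma * log(p_t)
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0):
    """logits: (N, C) raw scores; targets: (N,) integer class labels."""
    # Per-example cross-entropy is -log(p_t), the negative log-probability
    # of the true class.
    ce = F.cross_entropy(logits, targets, reduction="none")
    # Recover p_t, the model's probability for the true class.
    p_t = torch.exp(-ce)
    # Down-weight easy examples (p_t near 1) by the factor (1 - p_t)**gamma.
    return ((1.0 - p_t) ** gamma * ce).mean()

# A confident correct prediction contributes almost nothing to the loss,
# while an uncertain one dominates it.
logits = torch.tensor([[4.0, 0.0, 0.0], [0.1, 0.2, 0.0]])
targets = torch.tensor([0, 2])
print(focal_loss(logits, targets))
```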
Srinivas mentioned a kernel called “be careful what you train on” https://www.kaggle.com/taindow/be-careful-what-you-train-on which shows how you can fool yourself by believing a model that gets good results for the wrong reasons.
We then talked about SHAP values, a principled method (originating in game theory) for quantifying how much to “blame” each feature for a classification decision. It is similar in spirit to feature “importance”, but much better: it works per-prediction, and each prediction’s feature contributions sum to the difference between that prediction and the baseline.
Here’s the original paper: https://arxiv.org/abs/1705.07874
Here’s a tutorial on Kaggle: https://www.kaggle.com/dansbecker/shap-values
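In the spirit of that tutorial, here is a minimal sketch using the shap package with a tree-based model. The data set and model are illustrative stand-ins, not the competition data.

```python
# Minimal SHAP sketch: explain a random forest's predictions.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:100])

# Each row assigns every feature a signed contribution ("blame") toward
# that prediction; the summary plot aggregates these across examples.
shap.summary_plot(shap_values, X[:100], feature_names=data.feature_names)
```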