@sermakarevich Great, thanks
I’m afraid to ask such a basic question, but what did you average (weights, probs, or…), and how did you average them in practice?
Yep, that’s a really scary question. In practice it looks like this:
- I have 5 predictions for the test set because I do 5-fold CV.
- I average the CV predictions on the test set for each model, so at the end I have a single test set prediction per training config (model / image size).
- Through CV I get train set predictions as well. This lets me check how I should average predictions from different models (you might end up with just a mean, a median, weights, or blending with an additional model) and better understand each model’s accuracy, since the whole train set is better than a single validation set.
- I average the test predictions from the different models (see the sketch after this list).
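For concreteness, here is a minimal NumPy sketch of that two-step averaging; the array shapes and model names are stand-ins, not code from this thread:

```python
# A minimal sketch of the averaging described above; the arrays are random
# stand-ins for real model outputs (shapes and names are assumptions).
import numpy as np

n_test, n_classes = 1000, 120

# Step 1: average the 5 per-fold test predictions of one model.
fold_probs = [np.random.rand(n_test, n_classes) for _ in range(5)]
model_a_test = np.mean(fold_probs, axis=0)   # one prediction per config

# Step 2: blend the single predictions of different models.
# Out-of-fold train predictions tell you which blend works best.
model_b_test = np.random.rand(n_test, n_classes)
blend_mean     = np.mean([model_a_test, model_b_test], axis=0)
blend_median   = np.median([model_a_test, model_b_test], axis=0)
blend_weighted = 0.6 * model_a_test + 0.4 * model_b_test  # weights fit on train preds
```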
Amazing! Thanks @sermakarevich!
I’m in 11th place, and I’ll try your approach.
How do you do a 5-fold CV with the fastai lib?
fastai students gonna rock it
It turned out to be pretty easy with sklearn StratifiedKFold indexes and the ImageClassifierData.from_csv method. You just need to define the val_idxs parameter.
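Something like this, assuming the old fastai (0.7) API; the paths, csv name, and folder names below are hypothetical:

```python
# A minimal sketch of feeding StratifiedKFold indexes into fastai 0.7.
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from fastai.conv_learner import *  # ImageClassifierData, tfms_from_model, resnet34

PATH = 'data/'
labels_df = pd.read_csv(f'{PATH}labels.csv')   # hypothetical csv name

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
# Each split yields indexes for an 80/20 train/validation division.
train_idxs, val_idxs = next(skf.split(labels_df.index, labels_df['breed']))

data = ImageClassifierData.from_csv(
    PATH, 'train', f'{PATH}labels.csv',
    tfms=tfms_from_model(resnet34, 224),
    val_idxs=val_idxs,     # <- the parameter mentioned above
    test_name='test')
```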
Thanks @sermakarevich! Just a quick question.
You mean you did CV for the test set rather than for training/validation?
That’s pretty cool. Thanks!
BTW, are there any standard methods for ensembling multiple models or architectures? I mean re “weights or probabilities” and “mean or median”.
- Split the train set into 5 parts with sklearn StratifiedKFold.
- 4 parts are used as the train-1 set and 1 is used as the valid-1 set.
- This is done by the StratifiedKFold.split method, which returns indexes for the train-1 set (80% of the original train) and indexes for the valid-1 set (20% of the original train).
- Tune a model.
- Do TTA predictions for test and valid-1 (20% of the train set).
- Iterate through this 5 times (see the sketch below).
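A sketch of the whole loop under the same fastai 0.7 assumptions as above. Whether learn.TTA() returns one averaged prediction or one prediction per augmentation varied between versions, so the tta_probs helper below (a hypothetical name) handles both cases:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from fastai.conv_learner import *

PATH = 'data/'
labels_df = pd.read_csv(f'{PATH}labels.csv')
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

def tta_probs(log_preds):
    probs = np.exp(log_preds)
    # some fastai 0.7 versions return one set of log-probs per augmentation
    return probs.mean(axis=0) if probs.ndim == 3 else probs

oof_preds, test_preds = {}, []
for fold, (train_idxs, val_idxs) in enumerate(
        skf.split(labels_df.index, labels_df['breed'])):
    data = ImageClassifierData.from_csv(
        PATH, 'train', f'{PATH}labels.csv',
        tfms=tfms_from_model(resnet34, 224),
        val_idxs=val_idxs, test_name='test')
    learn = ConvLearner.pretrained(resnet34, data)
    learn.fit(1e-2, 3)                       # "tune a model"

    log_val, _ = learn.TTA()                 # TTA on valid-1 (20% of train)
    log_test, _ = learn.TTA(is_test=True)    # TTA on the test set
    oof_preds[fold] = (val_idxs, tta_probs(log_val))
    test_preds.append(tta_probs(log_test))

test_mean = np.mean(test_preds, axis=0)      # averaged test prediction
```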
@jamesrequa knows this better than me. I used two different ways:
- just avg(sum(all predictions))
- features extracted from the convolutional layers of different models are stacked together, and only then do I feed them into an FC layer (see the sketch below).
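Not the poster’s actual code, but a hypothetical PyTorch sketch of that second approach: concatenate precomputed convolutional features from several models and train a small fully connected head on the stacked features (dimensions and class count are assumptions):

```python
import torch
import torch.nn as nn

class StackedHead(nn.Module):
    def __init__(self, feat_dims=(512, 2048), n_classes=120):
        super().__init__()
        # one FC layer on top of the concatenated features
        self.fc = nn.Linear(sum(feat_dims), n_classes)

    def forward(self, feats):           # feats: list of (batch, dim) tensors
        x = torch.cat(feats, dim=1)     # stack features from all models
        return self.fc(x)

# Usage with precomputed features (random stand-ins here):
head = StackedHead()
resnet_feats, inception_feats = torch.randn(8, 512), torch.randn(8, 2048)
logits = head([resnet_feats, inception_feats])
```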
I’m asking because in the dog breed competition, when I tried to ensemble three models by simply averaging their probabilities, each model’s log loss was around 0.21 but the result jumped way up, to around 13. That’s why I’m asking. Thanks!
Check rows and columns ordering. 13 is definitely an error.
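A log loss of ~13 from models that each score ~0.21 usually means exactly that: the submission files being averaged have their rows (ids) or columns (classes) in different orders. A hedged pandas sketch of the alignment fix (file names hypothetical):

```python
import pandas as pd

subs = [pd.read_csv(f) for f in ('model_a.csv', 'model_b.csv', 'model_c.csv')]
# Align rows by id and columns by class name before averaging.
subs = [s.set_index('id').sort_index().sort_index(axis=1) for s in subs]
blend = sum(subs) / len(subs)   # element-wise mean, now safely aligned
blend.to_csv('blend.csv')       # id is written back as the index column
```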
Thanks a lot, now it’s super clear.
And sure, I must have gone wrong somewhere in the averaging process. I’ll try again.
Looks like half of the top 20 are fastai students so far.
Why not get predictions for test by training on the whole dataset instead of doing CV?
No reason not to. You only need to know how to optimise a model without a validation set. With CV one gets:
- a better understanding of accuracy
- predictions for the train set
- a mini ensemble for the test set.
This mini ensemble gives a 0.02 log loss improvement on test vs train (which is 10%).
I’m assuming you mean a new model for each iteration, correct?
… and thanks for the detailed and nice writeup on using K-Fold CV!
How do I submit my results to Kaggle?
I ran some tests and built a decent classifier for my first submission, but it’s not clear to me how to get those predictions into a CSV file for submission.
Look at the last few lines of this Kernel for an example of that:
https://www.kaggle.com/orangutan/keras-vgg19-starter
One step they don’t do though is:
sub.to_csv(path+"filename.csv", index=False)
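Putting it together, a minimal pandas sketch of building a submission file; the ids, class names, and probabilities below are dummy data so the snippet runs, not the Kernel’s code:

```python
import numpy as np
import pandas as pd

# `probs` would come from your model (e.g. averaged TTA predictions);
# here it is random data just so the snippet executes.
test_ids = ['img_001', 'img_002', 'img_003']       # hypothetical test ids
probs = np.random.rand(3, 2)
probs /= probs.sum(axis=1, keepdims=True)          # rows sum to 1

sub = pd.DataFrame(probs, columns=['cat', 'dog'])  # one column per class
sub.insert(0, 'id', test_ids)
sub.to_csv('submission.csv', index=False)          # the step mentioned above
```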
Ahh Pandas! Thanks!
Tune is the same as train in this context.
Thanks @sermakarevich! I got 8th place just by taking the mean of some good models. =D