Kaggle Comp: Plant Seedlings Classification

Thanks, @jamesrequa. Black-grass turned out to be the worst category, with an F1 of only 0.87. The sad news is that local CV does not correlate with the leaderboard.
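In case anyone wants to reproduce the per-class breakdown, here is a minimal sketch, assuming validation labels and predicted probabilities are already available as numpy arrays (the variable names are illustrative):

```python
import numpy as np
from sklearn.metrics import classification_report

def per_class_report(y_val, probs, class_names):
    """Print precision/recall/F1 for every category.

    y_val: (n,) integer validation labels
    probs: (n, n_classes) predicted probabilities
    class_names: list of category names, e.g. ['Black-grass', ...]
    """
    preds = np.argmax(probs, axis=1)
    print(classification_report(y_val, preds, target_names=class_names))
```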

Indeed - but can you see why?

(I'll give you a tip - do the same prelim analysis that I showed in class for the dog breeds comp, and it should be obvious).

Jeremy, frankly I do not see an obvious answer so far. I have a couple of hypotheses, but none of them, even if true, helps me do anything about it:

  • the train set is relatively small
  • I do 5-fold CV, so I can't make my validation score any more accurate
  • image sizes vary significantly, but the train and test distributions still look similar
  • we have one category with significantly lower predictive accuracy, but its imbalance can only make the validation and leaderboard scores differ, not be uncorrelated
  • the F1 score is a step function: it only shows improvement when the prediction for the correct label becomes the maximum among the classes

Obviously I am missing something.

One of those 5 bullet points is incorrect - so you should check your assumptions :slight_smile:

Thanks, Jeremy.

You should show that in a kaggle kernel or at least a forum post @sermakarevich if you have a moment, since that's an important insight that we should make available to other competitors :slight_smile:

Done.

Another competition, another rank. Thanks, guys, for your valuable inputs!

Was the 3rd bullet point wrong? I couldn't follow which one it was.

Nice visualisations…

Yep, it turned out I was checking train image widths against train image heights, rather than comparing train against test :yum:
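For reference, a minimal sketch of the corrected check, assuming PIL is available and a standard train/test folder layout (the paths are illustrative):

```python
from pathlib import Path
from PIL import Image
import matplotlib.pyplot as plt

def image_sizes(folder):
    """Return (width, height) for every .png under folder."""
    return [Image.open(p).size for p in Path(folder).rglob('*.png')]

train_sizes = image_sizes('data/train')  # illustrative paths
test_sizes = image_sizes('data/test')

# The point is to compare train vs test, not width vs height
plt.hist([w for w, _ in train_sizes], bins=50, alpha=0.5, label='train widths')
plt.hist([w for w, _ in test_sizes], bins=50, alpha=0.5, label='test widths')
plt.legend()
plt.xlabel('width (px)')
plt.show()
```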

Can you post a link to the kaggle kernel here?

Nope, there is no kaggle kernel, just a forum post with two pictures that are already here, so there is nothing to share.

Is anyone ensembling models in this competition? I've tried averaging the probabilities from two models with very similar F1 scores on the leaderboard, but it gave a worse result than either model alone.

I'm averaging the probabilities of each k-fold. It improved my score a bit.
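A minimal sketch of that kind of fold averaging, assuming each fold's model yields an (n_test, n_classes) probability array (the names are illustrative):

```python
import numpy as np

def average_folds(fold_probs):
    """Average the per-fold predicted probabilities, then take argmax.

    fold_probs: list of (n_test, n_classes) arrays, one per fold's model
    """
    mean_probs = np.mean(fold_probs, axis=0)      # (n_test, n_classes)
    return mean_probs, mean_probs.argmax(axis=1)  # probs and hard labels
```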

Have you compared that to simply retraining on the full dataset? If so, did you see much of a difference?

Yes, I got 0.98488 (2nd place) on the leaderboard using the full dataset. To tell you the truth, it's not clear to me why the averaging result was better (0.98992, 1st place), but it worked! :wink:

Maybe if I had just picked the best fold I could have got a better score, I don't know.

There's a nice explanation here in case you haven't seen it: https://mlwave.com/kaggle-ensembling-guide/

edit: Oh, you averaged the folds, I thought you were bagging :slight_smile:

The test set contains 794 samples; however, when I run learn.TTA(is_test=True) or learn.predict(is_test=True), I keep getting 793 predictions instead of 794 (see attached screenshot). As a result, I am unable to create a submission file due to the mismatch in counts. The correct number of rows is 794 according to the Kaggle site. Has anyone else experienced the same issue?

Yeah, I am facing this issue too.

Can you check that all files returned by os.listdir have the .png extension?
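Something like this quick check can surface stray files, assuming the test images sit in a single directory (the path is illustrative):

```python
import os

test_dir = 'data/test'  # illustrative path
files = os.listdir(test_dir)
stray = [f for f in files if not f.lower().endswith('.png')]
print(f'{len(files)} files, {len(stray)} without a .png extension: {stray}')
```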