Kaggle Humpback Whale Identification - results, approaches

My approach
I did pretty much everything the same way as Jeremy did in the lesson 2 notebook, except for a few things:

  • Changed aug_tfms to side_on
  • Created some metric functions
  • Added some functions for submission creation
  • Added new_whale everywhere except where it already was
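
The competition is scored with mean average precision at 5 (map@5). As a sketch of what one of those metric functions might look like (assuming predictions arrive as ranked lists of whale ids per image, not raw tensors):

```python
def map_at_5(preds, targets):
    """Mean Average Precision at 5: each sample scores 1/(rank+1) if the
    true label appears in the top-5 predictions, otherwise 0."""
    score = 0.0
    for pred, target in zip(preds, targets):
        top5 = pred[:5]
        if target in top5:
            score += 1.0 / (top5.index(target) + 1)
    return score / len(preds)
```

A correct label in first place contributes 1.0, in fifth place 0.2, and a miss contributes nothing.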

My results
This got me to map@5 = 0.24369 on the test set, which puts me in the top 90% (obviously pretty poor).

So my question is: How did you approach the problem and what results did you get?

link to competition
link to my solution

I’ve managed to find the main issue with my approach - I mindlessly split the training data into train and validation sets, not realizing that some classes ended up with no examples in the training set, so my model couldn’t learn them (found out thanks to @KarlH). I also didn’t really analyze the dataset, which could have helped me understand it better. And I didn’t even bother trying bounding boxes.
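
A split that avoids that problem has to be done per class, since many whales have only one image. A minimal sketch (the function name and `valid_frac` parameter are my own, not from any library):

```python
import random
from collections import defaultdict

def split_keeping_classes(items, labels, valid_frac=0.2, seed=42):
    """Split into train/valid so every class keeps at least one training
    example; classes with a single image go entirely to train."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)
    train, valid = [], []
    for label, group in by_class.items():
        rng.shuffle(group)
        if len(group) == 1:
            train.extend(group)          # singleton class: train only
        else:
            n_valid = max(1, int(len(group) * valid_frac))
            n_valid = min(n_valid, len(group) - 1)  # keep >=1 in train
            valid.extend(group[:n_valid])
            train.extend(group[n_valid:])
    return train, valid
```

A naive random split would instead scatter those singleton classes, leaving validation labels the model has never seen.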

It was an interesting competition in that it was such a different dataset. Single images are so hard to predict against. In fact, you could score 32% just by labelling every whale in the test data as ‘new_whale’. And there were duplicate images between the test and train sets; if you just labelled those, you’d reach about 0.40. Not sure why only 0.24 for you.
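
Those exact duplicates between test and train can be found without any model at all, e.g. by hashing file contents (a sketch; near-duplicates would need perceptual hashing instead):

```python
import hashlib
from pathlib import Path

def find_duplicates(train_dir, test_dir):
    """Map each test image filename to a train image with identical
    file contents (exact byte-for-byte duplicates only)."""
    def digest(path):
        return hashlib.md5(path.read_bytes()).hexdigest()
    train_hashes = {}
    for p in Path(train_dir).iterdir():
        train_hashes.setdefault(digest(p), p.name)
    return {p.name: train_hashes[h]
            for p in Path(test_dir).iterdir()
            if (h := digest(p)) in train_hashes}
```

Any test image found this way can simply be given its duplicate’s train label in the submission.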

I used the 500 example bounding boxes from the discussion to train a fluke finder (lesson 8) and cropped to those coordinates. I squeezed each image into a square, since square crops of a rectangle didn’t seem useful. I converted the images to b&w. I ran a densenet classifier, and this was enough for about 0.47 on the LB. Looking at the results, it found nearly all the test/train duplicates, as you’d expect - so in other words, of the roughly 60% of test images that weren’t new whales, it correctly identified only about 7 percentage points’ worth. I augmented the dataset to have 6 images of each whale, using image transforms, which raised the score to 0.50.
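
The crop-squeeze-b&w preprocessing described above could be sketched with Pillow like this (the function name and default size are my own; the bbox is assumed to be pixel coordinates from the fluke finder):

```python
from PIL import Image

def crop_to_fluke(path, bbox, size=224):
    """Crop to the fluke bounding box, squeeze to a square, and convert
    to b&w while keeping 3 channels for a pretrained network.
    bbox = (left, top, right, bottom) in pixels."""
    img = Image.open(path)
    img = img.crop(bbox)
    img = img.resize((size, size))         # squeeze, ignoring aspect ratio
    img = img.convert("L").convert("RGB")  # grayscale, but still 3-channel
    return img
```

Converting back to RGB after grayscaling keeps the input shape compatible with ImageNet-pretrained weights.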

I looked at the predictions, and it seemed to be guessing shape more than texture, or even shape as a result of camera angle or pose. I didn’t spend much time on it, but I would guess an approach would be to do image segmentation to remove the background, then somehow direct the trainer to regions like the trailing edges, notch, and the centre of each side to ‘fingerprint’, but this is getting too much into feature engineering for my taste.

I tried Siamese networks in PyTorch but couldn’t even reach a similar result. The winner’s result looks interesting, I will have to walk through it.


Also not sure why only 0.24; I might share my notebook if anyone is interested. I tried converting all images to b&w (in RGB mode, not grayscale), but strangely enough the model performed worse (about 20% worse according to the validation set).

Yeah, the bounding boxes seemed like the way to go according to the Kaggle kernels in this competition, but I haven’t yet started part 2 of the fast.ai deep learning course. In fact, this was the first dataset where I applied my knowledge from part 1.

I am considering checking out the winner’s approach too (he published a kernel when the competition ended), but I can’t shake the feeling that I did something wrong, because my result is just bad.

Thanks for the reply, I will try to analyze my model and see what might have gone wrong.


This is my solution. I didn’t do anything special in terms of training the model. In later iterations I used the bounding box data posted by other users, but this only gave a minor performance boost (like 0.41 to 0.43).

What do you mean when you say you added new_whale everywhere?


I always submitted 5 answers for each whale entry in the validation/test set, and if new_whale wasn’t among these answers, I would manually replace the 5th answer with new_whale. This got me from 0.22 to 0.24 on the test set.
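
That post-processing step could be sketched like this (the helper names are my own; the "Image,Id" header with space-separated ids follows the competition’s submission format):

```python
def add_new_whale(top5):
    """If 'new_whale' is not among the five predictions, replace the
    fifth (least confident) prediction with it."""
    top5 = list(top5)
    if "new_whale" not in top5:
        top5[-1] = "new_whale"
    return top5

def make_submission(predictions, out_path="submission.csv"):
    """predictions: dict mapping image filename -> ranked list of 5 ids."""
    with open(out_path, "w") as f:
        f.write("Image,Id\n")
        for image, top5 in predictions.items():
            f.write(f"{image},{' '.join(add_new_whale(top5))}\n")
```

Since the metric only rewards the first correct hit, swapping out the least confident slot is the cheapest place to hedge with new_whale.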

Great! Thanks for sharing, I’ll have a look at it. I’ll share mine too:
