Edit:
I’ve managed to find the main issue with my approach: I mindlessly split the training data into train and validation sets, not realizing that my model couldn’t learn some of the classes this way (found out thanks to @KarlH). I also didn’t really analyze the dataset, which could have helped me understand it better. And I didn’t even try bounding boxes.
It was an interesting competition in that the dataset was so different. Single images are very hard to predict against. In fact, you could score 0.32 just by labelling every whale in the test data as ‘new_whale’. And there were duplicate images across the test and train sets; if you just labelled those, you’d reach about 0.40. Not sure why you only got 0.24.
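That all-new_whale baseline is trivial to generate. A minimal sketch, assuming the competition’s submission format (an `Image` column and an `Id` column with up to five space-separated whale IDs); the function name and file paths are illustrative:

```python
import csv

def write_baseline(test_images, out_path="baseline_submission.csv"):
    """Write a submission predicting 'new_whale' for every test image."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Image", "Id"])
        for img in test_images:
            writer.writerow([img, "new_whale"])

write_baseline(["0a1b2c3d.jpg", "4e5f6a7b.jpg"])
```

Because the metric is MAP@5 and roughly a third of the test set really is new_whale, this one-line prediction already scores around 0.32.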
I used the 500 example bounding boxes from the discussion to train a fluke finder (lesson 8) and cropped to those coordinates. I squeezed each crop into a square, since square crops of a rectangle didn’t seem useful. I converted the images to b&w. I ran a DenseNet classifier, and this was enough for about 0.47 on the LB. Looking at the results, it found nearly all the test/train duplicates, as you’d expect; so in other words, of the ~60% of test whales that weren’t new_whale, it only correctly identified about 7 percentage points’ worth. I augmented the dataset so each whale had 6 images, using image transforms, which raised the score to 0.50.
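The preprocessing described above (crop to the fluke box, squeeze to a square, convert to b&w) can be sketched with PIL; the function name, box format, and default size are illustrative assumptions, not the poster’s actual code:

```python
from PIL import Image

def preprocess(img, box, size=224):
    """img: PIL Image; box: (left, top, right, bottom) fluke bounding box."""
    img = img.crop(box)              # keep only the fluke
    img = img.resize((size, size))   # squeeze the rectangular crop into a square
    return img.convert("L")          # black and white
```

Squeezing (rather than center-cropping) distorts the aspect ratio but keeps the whole fluke, which is where the identifying marks are.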
I looked at the predictions, and it seemed to be guessing from shape more than texture, or even from shape as an artefact of camera angle or pose. I didn’t spend much time on it, but I’d guess one approach would be to use image segmentation to remove the background, then somehow direct the trainer towards regions like the trailing edges, the notch, and the centre of each side to ‘fingerprint’ the fluke, but that’s getting too far into feature engineering for my taste.
I tried Siamese networks in PyTorch but couldn’t even reach a similar result. The winner’s solution looks interesting; I’ll have to walk through it.
Also not sure why I only got 0.24; I might share my notebook if anyone is interested. I tried converting all images to b&w (in RGB, not grayscale), but strangely enough the model performed worse (about 20% worse according to the validation set).
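For anyone wondering what “b&w in RGB” means here: the image is converted to greyscale and then back to three identical channels, so a pretrained RGB network still accepts it. A minimal sketch (the function name is illustrative):

```python
from PIL import Image

def to_bw_rgb(img):
    """Greyscale the image, then replicate it across three channels."""
    return img.convert("L").convert("RGB")  # R == G == B everywhere
```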
Yeah, bounding boxes seemed like the way to go according to the Kaggle kernels in this competition, but I haven’t yet started part 2 of the fast.ai deep learning course. In fact, this was the first dataset where I applied my knowledge from part 1.
I am considering checking out the winner’s approach too (he published a kernel when the competition ended), but I can’t shake the feeling that I did something wrong, because my result is just bad.
Thanks for the reply, I’ll try to analyze my model and see what might have gone wrong.
This is my solution. I didn’t do anything special in terms of training the model. In later iterations I used the bounding box data posted by other users, but this only gave a minor performance boost (like 0.41 to 0.43).
What do you mean when you say you added new_whale everywhere?
I always submitted 5 answers for each whale entry in the validation/test set, and if new_whale wasn’t among those answers, I manually replaced the 5th answer with new_whale. This got me from 0.22 to 0.24 on the test set.
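That post-processing step is a one-liner per row. A sketch under the assumption that each prediction is a list of five whale IDs ordered most-confident first (the function name is illustrative):

```python
def force_new_whale(preds):
    """If 'new_whale' is missing from the top 5, replace the 5th slot with it."""
    if "new_whale" not in preds:
        preds = preds[:4] + ["new_whale"]
    return preds
```

Because MAP@5 rewards any correct label in the top 5 (with position-weighted credit), adding new_whale in the least-confident slot can only help the many test images that really are new whales.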
Great! Thanks for sharing, I’ll have a look at it. I’ll share mine too: