However, according to Kaggle, my second best is 1.07, and I’m pretty sure that I didn’t use Adadelta back then. Unfortunately, I have only lately understood Jeremy’s words about reproducibility, and I don’t recall what methods I used back then (12 days ago).
Based on that, I tested out 20, 40, and 100 clusters; 100 clusters still looked like a reasonable clustering when I inspected the images, so I decided to use it.
And using this information I created my validation set.
Not sure if this helps a lot or not in the competition, but I was really unsure about my validation set also.
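The cluster-based split above can be sketched roughly like this — a hypothetical helper, assuming you already have an `(n_images, d)` feature array (e.g. from a pretrained CNN) and the matching filenames; the function name and parameters are my own, not the poster’s code:

```python
# Hypothetical sketch of a cluster-based validation split: put whole clusters
# (i.e. whole boats) into validation so near-duplicate images never appear in
# both train and validation.
import numpy as np
from sklearn.cluster import KMeans

def cluster_val_split(features, filenames, n_clusters=100, val_frac=0.2, seed=0):
    km = KMeans(n_clusters=n_clusters, random_state=seed).fit(features)
    rng = np.random.RandomState(seed)
    clusters = rng.permutation(n_clusters)
    # Reserve a fraction of the clusters entirely for validation.
    val_clusters = set(clusters[:int(n_clusters * val_frac)])
    val_mask = np.isin(km.labels_, list(val_clusters))
    train = [f for f, v in zip(filenames, val_mask) if not v]
    val = [f for f, v in zip(filenames, val_mask) if v]
    return train, val
```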
When running the script from lesson 7 (specifically the one with the bounding boxes), the results seem good. However, when I leave one ship out (as I did on State Farm), the net doesn’t find anything… here is an example:
I am also trying the same approach but couldn’t get satisfying results. How many epochs did you run, and what weights did you use in the layers after the convolutional layers?
I was only able to reach a 1.31 log-loss score; please suggest what else I should try.
I created my validation set by putting aside 20% of the images from each category. I understand that the training and test data might be different, so the losses might not match, but there is no relation at all between my val-loss and the Kaggle leaderboard loss: even when my val-loss decreases, the leaderboard loss increases. It’s very challenging to train a model without any sort of reference.
So, what approach did you guys take to create a reliable validation set, and how different is it from the test set?
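For reference, the 20%-per-category split described above can be sketched like this — a minimal hypothetical helper, assuming the images are grouped in a dict mapping category name to a list of filenames:

```python
# Minimal sketch of a per-category (stratified) 80/20 split.
import random

def per_category_split(images_by_class, val_frac=0.2, seed=42):
    rng = random.Random(seed)
    train, val = [], []
    for cls, files in images_by_class.items():
        files = list(files)        # copy before shuffling
        rng.shuffle(files)
        n_val = int(len(files) * val_frac)
        val += [(cls, f) for f in files[:n_val]]
        train += [(cls, f) for f in files[n_val:]]
    return train, val
```

Note that this guards against class imbalance but not against the boat-leakage problem discussed elsewhere in this thread, which may explain the gap between val-loss and leaderboard loss.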
I used stratified K-fold cross-validation. You can then use the predictions on your validation folds as features for a meta-model (although there is some information leakage), so you actually use all of the training data.
You can then combine the predictions of your k models on the test set (e.g. by taking the mean).
Doing this with a few models got me into top 10%, although I’ve probably fallen out since last time I checked. And that was without stacking, which would have been my next step.
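The K-fold scheme above can be sketched as follows — `build_model` is a hypothetical factory returning a fresh scikit-learn-style classifier, and the shapes/names are assumptions, not the poster’s actual code:

```python
# Sketch of stratified K-fold: out-of-fold predictions become meta-features,
# and the k models' test predictions are averaged.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def kfold_predict(build_model, X, y, X_test, n_classes, k=5):
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    oof = np.zeros((len(X), n_classes))        # out-of-fold predictions
    test_pred = np.zeros((len(X_test), n_classes))
    for tr_idx, va_idx in skf.split(X, y):
        model = build_model()                  # fresh model per fold
        model.fit(X[tr_idx], y[tr_idx])
        oof[va_idx] = model.predict_proba(X[va_idx])
        test_pred += model.predict_proba(X_test) / k   # mean over k models
    return oof, test_pred
```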
A big thank you to Jeremy for the excellent guidelines and code!
I managed to get a ~0.87 score with VGG pretrained features and fully convolutional networks, with lots of ensembling and data augmentation.
I also used bounding boxes (Lesson 7) and head/tail data from here:
Has anyone used Resnet with good results on this dataset?
I am getting around 1.7 val loss with Resnet.
@pradeepg – I got good results from ResNet using Keras pretrained weights; I did not try the head and tail data yet… is that in the annotations? I did not see it…
I just used ResNet yesterday and ended up getting a val_loss of 1.13. That was with simple Dense layers; today I’ll be experimenting with GlobalAveragePooling2D, a fully convolutional model, and utilizing bounding boxes with ResNet. I’d be happy to share my results!
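For anyone unfamiliar with the GlobalAveragePooling2D idea mentioned above, a minimal head on top of precomputed ResNet convolutional features looks roughly like this (modern tf.keras syntax; the feature shape and the 8-class output are assumptions for the Fisheries dataset):

```python
# Hypothetical sketch: a GlobalAveragePooling2D head on precomputed
# ResNet convolutional features (shapes are assumptions).
import numpy as np
from tensorflow.keras.layers import Input, GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model

def gap_head(feature_shape=(7, 7, 2048), n_classes=8):
    inp = Input(shape=feature_shape)      # precomputed conv features
    x = GlobalAveragePooling2D()(inp)     # one averaged value per filter map
    out = Dense(n_classes, activation='softmax')(x)
    model = Model(inp, out)
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```

Replacing the big Dense layers with global average pooling drastically cuts the parameter count, which tends to help on a small, noisy dataset like this one.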
I trained a fully convolutional network on the fisheries data and it worked well. Then I used data augmentation to create 5× the training data and trained a fully convolutional network on that, but it doesn’t seem to converge at all, even with very low learning rates. What might be the reason?
Manoj, I had the same issue. I got much better results with no augmentation, and I tried a lot of different augmentation parameters. The augmentation worked well on the cats and dogs images, but not on the fisheries ones - I think perhaps because the images are too noisy with boat features. I think if you made a 2-stage model, the first where you identified the fish then the second in which the fish was cut out and then augmented you might have better results, I don’t know…
@Christina, what surprised me was that the model wouldn’t converge at all, no matter what the learning rate was. Even if the images are noisy, doesn’t a model converge at some point or other?
Manoj,
I definitely get convergence, but accuracy isn’t that great… What are you using for your augmentation parameters? Here is an example of one of mine:
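In Keras, such augmentation parameters are typically set through `ImageDataGenerator`; the values below are purely illustrative, not necessarily the ones used by anyone in this thread:

```python
# Hypothetical mild augmentation settings (tf.keras); values are illustrative.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(
    rotation_range=10,        # small rotations only
    width_shift_range=0.1,    # shift up to 10% horizontally
    height_shift_range=0.1,   # and vertically
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,     # boats/fish can face either way
)
```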
I actually ended up getting pretty decent results: val_loss = 0.13 and an LB log-loss score of 0.9695, which currently places me right above Jeremy’s last submitted score.
My approach is as follows:
I used the ResNet50 architecture and pre-computed the convolutional features for quick training.
Fed a fully convolutional model the pre-computed convolutional features described above.
Training that model alone got me to a val_loss of 0.15.
Next I added pseudo-labels and trained the fully convolutional model to obtain a val_loss of 0.13.
My final step was to perform data augmentation, which moved my val_loss back to 0.15 but gave a high val accuracy (0.96) and a 0.9695 log-loss on the leaderboard.
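The first two steps above can be sketched roughly as follows (tf.keras; the helper names, feature shapes, and 8-class output are my assumptions, and the pseudo-labeling/augmentation steps are omitted):

```python
# Sketch: precompute ResNet50 conv features once, then train a small
# fully convolutional head on them (hypothetical helper names).
import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import (Input, Conv2D,
                                     GlobalAveragePooling2D, Activation)
from tensorflow.keras.models import Model

def precompute_features(images):
    # images: (n, 224, 224, 3), already preprocessed for ResNet50
    base = ResNet50(include_top=False, weights='imagenet')
    return base.predict(images, verbose=0)   # -> (n, 7, 7, 2048)

def fcn_head(feature_shape=(7, 7, 2048), n_classes=8):
    inp = Input(shape=feature_shape)
    x = Conv2D(n_classes, (3, 3), padding='same')(inp)  # one map per class
    x = GlobalAveragePooling2D()(x)
    out = Activation('softmax')(x)
    model = Model(inp, out)
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```

Precomputing the features means each training epoch only runs the tiny head, which is what makes the quick iteration described above possible.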