Kaggle: Google Landmark Recognition and Retrieval Challenges

Kaggle recently released two related challenges: the landmark-retrieval-challenge and the landmark-recognition-challenge. They look demanding both in theory and in computing resources: about 1.2M images and 15K classes.

First of all, do you think we should join ? What is your opinion?

Any related references, how to approach to the problem and any discussions are also welcome…


I was about to ask the same question…too bad yours is 26 days old and nobody gave any insight…

These two competitions are difficult because this type of problem is rare on Kaggle: there are 15K landmark classes, where a classical competition has about 5K, and a large number of classes have only a few pictures (between 1 and 5 images), while a handful of classes have a substantial number of images.
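
The long tail is easy to see from the training metadata. Here is a minimal sketch, assuming the competition's train.csv with one row per image and a `landmark_id` column:

```python
import pandas as pd

# Per-class image counts from the training metadata (assumed train.csv layout).
df = pd.read_csv("train.csv")
counts = df["landmark_id"].value_counts()

print(f"classes: {counts.size}, images: {len(df)}")
print(f"classes with <= 5 images: {(counts <= 5).sum()}")
print(f"largest class has {counts.max()} images")
```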

The write-ups from the Kaggle Cdiscount competition should be useful for the classification one.

Also, the organizers gave a clue: they suggested that methods for the two tasks, classification and similarity, may help each other…

In fact, the MS COCO 2017 stuff challenge is quite similar, and some parts could be reused in combination with Matterport's Mask R-CNN. But even training on COCO 2017 stuff may take two days or so, so it's really difficult to test new ideas. For the 15K-class part, I saw some similarity-search methods for very large numbers of classes (I can't remember the references for now). As another idea, RNNs and scene descriptions could be used for classification. My view on this challenge: the time is too short, and it's almost impossible for a single person to do something useful; one can learn a lot, but there isn't enough time.
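
As a rough illustration of the similarity-search idea, here is a minimal k-NN lookup over L2-normalized image embeddings. This is a sketch under assumptions: the arrays below are random placeholders standing in for CNN features, not competition data.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Placeholder embeddings; in practice these would come from a CNN.
gallery = np.random.rand(10000, 512).astype("float32")  # index images
queries = np.random.rand(5, 512).astype("float32")      # query images

# L2-normalize so Euclidean distance ranks the same as cosine similarity.
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
queries /= np.linalg.norm(queries, axis=1, keepdims=True)

nn = NearestNeighbors(n_neighbors=5).fit(gallery)
dist, idx = nn.kneighbors(queries)  # idx[i] = nearest gallery rows for query i
```

At the real dataset scale, an approximate index (e.g. faiss) would replace this brute-force search.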

I’m reading this paper: Deep Image Retrieval: Learning global representations for image search. Many other papers refer to it, and it’s SOTA on landmark datasets.
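
The paper itself learns an R-MAC-pooled descriptor with a ranking loss; as a much simpler stand-in, here is a sketch of a global descriptor that just average-pools a pretrained ResNet and L2-normalizes the output. The pooling choice is my simplification, not the paper's method.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

backbone = models.resnet50(pretrained=True)
backbone.fc = torch.nn.Identity()  # keep the pooled 2048-d feature
backbone.eval()

@torch.no_grad()
def describe(batch):                  # batch: (N, 3, H, W), ImageNet-normalized
    feat = backbone(batch)            # (N, 2048) globally pooled features
    return F.normalize(feat, dim=1)   # unit norm, so dot product = cosine sim

emb = describe(torch.rand(2, 3, 224, 224))  # placeholder input
print(emb.shape)  # torch.Size([2, 2048])
```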

Edit: Didn’t see this was for last year’s competition. Would be interested if anyone is looking into this year’s.

I am working on these competitions, and there are lots of exciting concepts to explore here. The hardest part is getting the data downloaded, but they have smaller datasets now.

Currently, I am trying to do several things.

  • Transfer learning as I ramp up the number of classes (top 100 - 1000 - 10k… 200k!)
  • A GAP loss callback (the metric itself is sketched below this list)
  • A stratified validation set for my different subsets
  • Exploring k-NN and RetinaNet
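
For reference, the recognition competition is scored with GAP (Global Average Precision, also known as micro-AP), which is what the callback above tracks. A minimal sketch of the metric; the function name and signature are mine, not competition code:

```python
import numpy as np

def gap(pred_labels, confidences, true_labels):
    """Global Average Precision for one guess per image.

    Simplification: assumes every image has a true landmark,
    so the denominator is just the number of images."""
    order = np.argsort(-np.asarray(confidences))  # most confident first
    correct = (np.asarray(pred_labels)[order]
               == np.asarray(true_labels)[order]).astype(float)
    precision_at_i = np.cumsum(correct) / (np.arange(len(correct)) + 1)
    return (precision_at_i * correct).sum() / len(true_labels)

print(gap([3, 1, 2], [0.9, 0.5, 0.8], [3, 1, 1]))  # ~0.556
```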

The 4th-place winner last year used fast.ai, so I figure it’s an excellent place to start!