The other blog post I did on the hummingbird classifier https://redditech.blog/2018/11/04/fast-ai-deep-learning-for-coders-week-2-experiment-trinidad-and-tobago-hummingbird-classifier/
Finally! This was the callbacks post that got me excited… I can’t really remember why, now that I look at it.
A dataset with bounding boxes just got released (a couple of days ago): https://storage.googleapis.com/openimages/web/index.html
How does one become a ‘professional annotator’? Asking for a friend.
Send them to Amazon Mechanical Turk.
I mentioned Gabor filters as a conceptual stepping-stone to understand how conv nets work.
Gabor filters are a signal processing / image processing tool (dating back long before DNNs) used to detect oriented visual features in an image. Basically you have this collection of filters of different wavelengths & orientations, and you apply each filter in turn to your image, measuring the response to each filter independently. Each filter detects structures in the image of different size & orientation. So the collection of responses gives you a fairly thorough analysis of the spatial structure in the image.
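To make the filter-bank idea concrete, here is a minimal sketch in NumPy. The kernel size, the two wavelengths, the four orientations, and the choice of sigma as half the wavelength are all illustrative assumptions rather than standard values, and the helper names (`gabor_kernel`, `filter_bank`, `response`) are made up for this example. It builds a small bank and measures each filter's response on a toy patch of vertical stripes:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real (cosine) part of a Gabor filter: Gaussian envelope times an oriented sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the sinusoid varies along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so a flat (featureless) patch gives zero response.
    return kernel - kernel.mean()

def filter_bank(size=15, wavelengths=(4, 8), n_orientations=4):
    """Dictionary of kernels keyed by (wavelength, orientation index)."""
    bank = {}
    for wl in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            # sigma = wl / 2 is an arbitrary choice made for this sketch.
            bank[(wl, k)] = gabor_kernel(size, wl, theta, sigma=0.5 * wl)
    return bank

def response(patch, kernel):
    """Magnitude of the filter response at the centre of a kernel-sized patch."""
    return abs(float(np.sum(patch * kernel)))

# Toy patch: vertical stripes with a period of 8 pixels (intensity varies along x only).
size = 15
half = size // 2
_, xs = np.mgrid[-half:half + 1, -half:half + 1]
stripes = np.cos(2 * np.pi * xs / 8)

bank = filter_bank(size=size)
responses = {key: response(stripes, kernel) for key, kernel in bank.items()}
best = max(responses, key=responses.get)
print("strongest response from (wavelength, orientation index):", best)
```

On a real image you would convolve each kernel over the whole image instead of taking a single dot product, but the principle is the same: the filter whose wavelength and orientation match the local structure responds most strongly.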
There is a great paper by Zeiler & Fergus (that has been mentioned in lectures, IIRC?) where they show that the lowest layers of a conv net are basically Gabor-like filters (in colour) that the DNN learned during training. These lowest layers are the feature detector layers in the conv net.
I tried HOG descriptors. They work much better than ORB at identifying similar images. However, I think it’s a losing battle trying to identify the right filters. My next step is to train a CNN on a diverse sample to learn a set of dynamic filters and then run each image through the network to generate embeddings. But first I have to label a few images; I will use the existing HOG descriptors to cluster them and ease the load.
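For anyone wondering what comparing images by HOG-style descriptors looks like, here is a rough sketch. It is a simplified stand-in, not the full HOG algorithm (there is no block normalisation, for instance), and it is not the code from the post above. The function names, cell size, and bin count are assumptions chosen for illustration:

```python
import numpy as np

def simple_hog(image, cell=8, bins=9):
    """Simplified HOG-like descriptor: per-cell histograms of gradient orientation."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180).
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    h, w = image.shape
    features = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            # Each pixel votes for its orientation bin, weighted by gradient strength.
            hist, _ = np.histogram(
                orientation[r:r + cell, c:c + cell],
                bins=bins,
                range=(0.0, 180.0),
                weights=magnitude[r:r + cell, c:c + cell],
            )
            features.append(hist)
    vec = np.concatenate(features)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def cosine_similarity(a, b):
    """Cosine similarity of two descriptors (already L2-normalised by simple_hog)."""
    return float(np.dot(a, b))

# Toy images: two horizontally striped images (one shifted) and one vertically striped.
ys, xs = np.mgrid[0:32, 0:32]
horizontal_a = np.sin(2 * np.pi * ys / 8)
horizontal_b = np.sin(2 * np.pi * (ys + 2) / 8)
vertical = np.sin(2 * np.pi * xs / 8)

desc_a, desc_b, desc_v = (simple_hog(img) for img in (horizontal_a, horizontal_b, vertical))
sim_same = cosine_similarity(desc_a, desc_b)
sim_diff = cosine_similarity(desc_a, desc_v)
print("similar pair:", sim_same, "dissimilar pair:", sim_diff)
```

The same descriptor vectors could be fed to any clustering routine to group images before labelling. For real use, a tested implementation such as `skimage.feature.hog` is a safer choice than this sketch.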
A Tencent project released a multi-label image database (~18 million images):
Very interesting. I enjoyed reading how the absolute numbers of Draw and Away Win games are not much different, yet the predicted percentages seem to be significantly different, which leads to your (conspiracy?) theory. It makes sense to presume that the bookmakers hide the information signal in this way.
Yes, that is my conspiracy theory. I cannot bring myself to believe that, despite ~1/3 of games ending in draws, people ignore this and continue to bet on home or away outcomes instead. It is soccer, not AFL.