Part 2 lesson 11 wiki


(Arvind Nagaraj) #135

This is fast approximate nearest neighbors… The person who built nearest-neighbor search at Spotify (and invented a method called Annoy) benchmarked all of these libraries - nmslib came out as by far the fastest way to find nearest neighbors in a high-dimensional vector space.
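
In case it helps anyone, here's a minimal sketch of building and querying an nmslib index - the HNSW parameters and the random toy vectors are just illustrative, not what the lesson used:

```python
import numpy as np
import nmslib

# Toy stand-ins for the embedding vectors from the lesson:
# 10,000 vectors in 300-d space.
vecs = np.random.randn(10000, 300).astype(np.float32)

# Build an HNSW index over cosine similarity.
index = nmslib.init(method='hnsw', space='cosinesimil')
index.addDataPointBatch(vecs)
index.createIndex({'post': 2})

# Approximate 10 nearest neighbours of a query vector.
ids, dists = index.knnQuery(vecs[0], k=10)
```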

The other way to do this is k-means clustering (or, even better, vector quantization) - but for this example, fast approximate KNN is plenty fast.
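
And a minimal sketch of the clustering alternative, using scikit-learn's KMeans (the cluster count and toy data are arbitrary, just for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

vecs = np.random.randn(10000, 300).astype(np.float32)  # toy vectors

# Partition the vectors into 100 clusters; at query time you'd only
# search within the cluster(s) whose centroid is nearest the query.
km = KMeans(n_clusters=100, n_init=10).fit(vecs)
labels = km.labels_               # cluster assignment per vector
centroids = km.cluster_centers_   # one 300-d centroid per cluster
```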


(unknown) #136

Is this the source paper? : http://papers.nips.cc/paper/5204-devise-a-deep-visual-semantic-embedding-model.pdf


(Divyansh Jha) #137

This DeViSE thing was magical!


(Phani Srikanth) #138

Personally, I thoroughly enjoyed the last 20 minutes. The ease and elegance with which the concept was explained, and the simplicity of implementing it with what we already know from the fastai library, were quite phenomenal.


(Even Oldridge) #139

On the approximate nearest neighbors front, a friend has a great blog post outlining the tradeoffs of precision vs performance that’s definitely worth checking out.

http://www.benfrederickson.com/approximate-nearest-neighbours-for-recommender-systems/

One thing that’s worth noting is that if you’re building a production system, Faiss on GPU is an order of magnitude faster than nmslib. nmslib runs at 200,000 QPS (queries per second) in batch mode on the CPU (Core i7-7820X), and the GPU version of Faiss runs at 1,500,000 QPS on a 1080 Ti.

Faiss used to be a pain to set up, but I believe they recently added pip support. I’m not sure if that includes the GPU build, but it’s worth considering if you need to do this at scale.
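
For reference, here's a minimal Faiss sketch with toy data (the GPU lines assume a GPU build of faiss is installed; drop them to stay on CPU):

```python
import numpy as np
import faiss

d = 300
xb = np.random.randn(100000, d).astype(np.float32)  # database vectors
xq = np.random.randn(5, d).astype(np.float32)       # query vectors

index = faiss.IndexFlatL2(d)   # exact (brute-force) L2 index
index.add(xb)

# Requires a GPU build of faiss:
res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, index)

D, I = gpu_index.search(xq, 10)  # distances and neighbour ids
```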


(Jeremy Howard) #140

The validation set doesn’t need backprop, so it uses half the memory, so we can use a bigger batch size to run it faster.
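
In PyTorch terms, a minimal sketch of what this looks like (the model and sizes are just stand-ins):

```python
import torch
import torch.nn as nn

model = nn.Linear(100, 10)     # stand-in for any trained model
val_x = torch.randn(512, 100)  # validation batch, e.g. 2x the
                               # training batch size

model.eval()
with torch.no_grad():          # no graph is kept for backprop,
    preds = model(val_x)       # so roughly half the memory is used
```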


(Jeremy Howard) #141

Nope - seems reasonable :slight_smile:


(Jeremy Howard) #142

Without a 2nd linear layer it’s just a linear model, not a neural net!
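
A minimal PyTorch illustration of the difference (sizes are arbitrary):

```python
import torch.nn as nn

# Without a nonlinearity and second layer, this is just a linear map:
linear_model = nn.Linear(300, 10)

# The nonlinearity between the two linear layers is what makes
# it a neural net:
neural_net = nn.Sequential(
    nn.Linear(300, 100),
    nn.ReLU(),
    nn.Linear(100, 10),
)
```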


(Jeremy Howard) #143

Yes I expect so - I don’t know if it would help; it’s basically the same idea as CycleGAN. I’m not really fully up to date on the translation literature, so I don’t know if this has been done before…


(Vineet) #144

Does anyone know the original attention paper that Jeremy discussed? I think I missed that… he mentioned two papers that were useful to read?


(Jeremy Howard) #145

There is an “attentive language model” paper that claims good results. I haven’t tried it.


(Jeremy Howard) #146

https://arxiv.org/abs/1409.0473


(Jeremy Howard) #147

An hour or so


(Jeremy Howard) #148

Yup. Rather foolishly I think they’re both required params, which isn’t ideal!


(Jeremy Howard) #149

Can I ask a favor - the top wiki post hasn’t been edited yet, but there’s lots of good links and resources in the replies here. If anyone has a moment would you be so kind as to add them to the top post?

I need to go to bed! :slight_smile:


(Jeremy Howard) #150

Oops turns out I forgot to make the top post into an actual wiki! Fixed now…


(Arvind Nagaraj) #151

Thanks for a great class, Jeremy!
Fish-in-nets: I’ve been waiting forever to hear you speak about DeViSE again!


(Mike Kunz ) #152

Why not cluster the data set? That’s the point.


(chunduri) #153

I think the limit set for the loop is the longest possible sentence length, so it will never truncate before EOS.
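
Something like this minimal sketch - the token ids and step function are made up purely for illustration, not taken from the lesson code:

```python
MAX_LEN = 50   # assumed cap: the longest sentence length possible
EOS = 2        # hypothetical end-of-sentence token id

def decode(step_fn, state, tok):
    """Decode until EOS, with MAX_LEN as a safety bound."""
    out = []
    for _ in range(MAX_LEN):        # bound = longest possible sentence
        tok, state = step_fn(tok, state)
        if tok == EOS:              # normal exit: EOS arrives first
            break
        out.append(tok)
    return out

# Dummy step function that counts up to EOS, just to run the loop.
print(decode(lambda t, s: (t + 1, s), None, 0))
```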


#154

Jeremy mentioned in the video that we can now download it from Kaggle. Does anyone have the link? I can find the object detection challenge but not the usual classification challenge.