Part 2 Lesson 11 wiki

This is fast approximate nearest neighbors. The person who built the nearest-neighbor system for Spotify, and who also invented a method called Annoy, benchmarked all these methods; nmslib came out as an incredibly fast way to find nearest neighbors in high-dimensional vector spaces.

The other way to do this is k-means clustering (or, even better, vector quantization), but for this example fast approximate nearest neighbors work well and run very quickly.
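
For anyone who wants to try it, here is a minimal sketch of querying nmslib with its HNSW index. The vector dimensionality and data are made up for illustration:

```python
import numpy as np
import nmslib

# Toy "feature vectors" standing in for image or word embeddings.
vecs = np.random.rand(10000, 300).astype(np.float32)

# Build an HNSW index using cosine similarity.
index = nmslib.init(method='hnsw', space='cosinesimil')
index.addDataPointBatch(vecs)
index.createIndex({'post': 2}, print_progress=True)

# Find the 10 approximate nearest neighbors of one query vector.
ids, dists = index.knnQuery(vecs[0], k=10)
print(ids, dists)
```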

Is this the source paper? http://papers.nips.cc/paper/5204-devise-a-deep-visual-semantic-embedding-model.pdf

4 Likes

This DeViSE thing was magical!

5 Likes

Personally, I enjoyed the last 20 minutes more than anything. The ease and elegance with which the concept was explained, and the simplicity of the implementation using what we already know from the fastai library, were quite phenomenal.

6 Likes

On the approximate nearest neighbors front, a friend has a great blog post outlining the tradeoffs of precision vs performance that’s definitely worth checking out.

One thing worth noting: if you’re building a production system, Faiss on GPU is an order of magnitude faster than nmslib. nmslib runs at 200,000 QPS (queries per second) in batch mode on a CPU (Core i7-7820X), while the GPU version of Faiss runs at 1,500,000 QPS on a 1080 Ti.

Faiss used to be a pain to set up, but I believe they recently added pip support. I’m not sure if that includes the GPU build, but it’s worth considering if you need to do this at scale.
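
As a rough sketch of what the GPU path looks like (toy data; the exact index type and sizes are just for illustration):

```python
import numpy as np
import faiss

d = 300
xb = np.random.rand(100000, d).astype(np.float32)  # database vectors (toy)
xq = np.random.rand(5, d).astype(np.float32)       # query vectors (toy)

# Build an exact L2 index on the CPU, then move it to GPU 0.
index = faiss.IndexFlatL2(d)
res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, index)

gpu_index.add(xb)
dists, ids = gpu_index.search(xq, 10)  # 10 nearest neighbors per query
print(ids)
```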

1 Like

The validation set doesn’t need backprop, so it uses half the memory, which means we can use a bigger batch size to run it faster.
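
In PyTorch terms, a minimal sketch of the idea (the model and data here are stand-ins, not the lesson’s code):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

bs = 64
model = nn.Linear(10, 2)  # stand-in model
valid_ds = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))
valid_dl = DataLoader(valid_ds, batch_size=2 * bs)  # double the training bs

# no_grad() skips storing activations for backprop, roughly halving
# memory use, which is what lets the validation batch size be larger.
model.eval()
with torch.no_grad():
    for xb, yb in valid_dl:
        preds = model(xb)
```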

2 Likes

Nope - seems reasonable :slight_smile:

1 Like

Without a 2nd linear layer it’s just a linear model, not a neural net!
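
A minimal sketch of the distinction, with made-up layer sizes (an illustration, not the lesson’s exact model): the nonlinearity between the two linear layers is what stops them collapsing back into a single linear map.

```python
import torch.nn as nn

# A single linear layer is just a linear model.
linear_model = nn.Linear(300, 300)

# Two linear layers with a nonlinearity in between form a
# (single-hidden-layer) neural net; without the nonlinearity,
# the composition of two linear layers is still linear.
neural_net = nn.Sequential(
    nn.Linear(300, 512),
    nn.ReLU(),
    nn.Linear(512, 300),
)
```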

2 Likes

Yes, I expect so. I don’t know if it would help; it’s basically the same idea as CycleGAN. I’m not really fully up to date on the translation literature, so I don’t know if this has been done before…

1 Like

Does anyone know the original attention paper that Jeremy discussed? I think I missed that… he mentioned two papers that were useful to read.

There is an “attentive language model” paper that claims good results. I haven’t tried it.

5 Likes

https://arxiv.org/abs/1409.0473 (Bahdanau et al., “Neural Machine Translation by Jointly Learning to Align and Translate”)

2 Likes

An hour or so

1 Like

Yup. Rather foolishly I think they’re both required params, which isn’t ideal!

Can I ask a favor? The top wiki post hasn’t been edited yet, but there are lots of good links and resources in the replies here. If anyone has a moment, would you be so kind as to add them to the top post?

I need to go to bed! :slight_smile:

2 Likes

Oops, turns out I forgot to make the top post into an actual wiki! Fixed now…

1 Like

Thanks for a great class, Jeremy!
Fish-in-nets: I’ve been waiting forever to hear you speak about DeViSE again!

2 Likes

Why not cluster the data set? That’s the point.

I think the limit set for the loop is the longest possible sentence length, so it will never truncate before the EOS token.
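
For context, a toy sketch of that kind of generation loop (all names and sizes hypothetical, not the lesson’s code): it is capped at the longest sentence length and breaks as soon as EOS appears, so output is never cut off before EOS.

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 100, 16
eos_idx, max_len = 2, 50
embed = nn.Embedding(vocab_size, hidden_size)
decoder = nn.GRU(hidden_size, hidden_size)
out_layer = nn.Linear(hidden_size, vocab_size)

inp = torch.zeros(1, 1, dtype=torch.long)   # start-of-sentence token
hidden = torch.zeros(1, 1, hidden_size)
result = []
for _ in range(max_len):                    # cap: longest sentence length
    out, hidden = decoder(embed(inp), hidden)
    tok = out_layer(out).argmax(dim=-1)
    if tok.item() == eos_idx:
        break                               # stop exactly at EOS
    result.append(tok.item())
    inp = tok
```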

Jeremy mentioned in the video that we can now download the data from Kaggle. Does anyone have the link? I can find the object detection challenge but not the usual classification challenge.