This is fast approximate nearest neighbors… the guy who built nearest-neighbor search for Spotify (Erik Bernhardsson, who also wrote a library called Annoy) benchmarked all these methods, and nmslib came out as one of the fastest ways to find nearest neighbors in a high-dimensional vector space.

The other way to do this is k-means clustering (or even better, vector quantization), but for this example, fast approximate KNN is quick enough and works well.
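
To make the k-means alternative a bit more concrete, here is a rough sketch of the idea: cluster the vectors once, then at query time only look inside the cluster(s) whose centroid is closest to the query. This is not how nmslib works internally, and every name here (`kmeans`, `build_index`, `search`, `n_probe`) is made up for illustration. It uses only the Python standard library and toy random data, so treat it as a sketch under those assumptions rather than a real implementation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(point, candidates):
    """Index of the candidate vector closest to `point`."""
    return min(range(len(candidates)), key=lambda i: sq_dist(point, candidates[i]))


def kmeans(vectors, k, iterations=10, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        assignments = [nearest_index(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Re-assign once more so assignments match the final centroids.
    assignments = [nearest_index(v, centroids) for v in vectors]
    return centroids, assignments


def build_index(vectors, k, seed=0):
    """Hypothetical index: centroids plus one bucket of vector ids per cluster."""
    centroids, assignments = kmeans(vectors, k, seed=seed)
    buckets = [[] for _ in range(k)]
    for i, a in enumerate(assignments):
        buckets[a].append(i)
    return centroids, buckets


def search(query, vectors, index, top_k=3, n_probe=1):
    """Approximate search: scan only the n_probe clusters nearest to the query."""
    centroids, buckets = index
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in buckets[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:top_k]


# Toy data (an assumption for the demo): two blobs of 8-dimensional vectors.
rng = random.Random(42)
vectors = [[rng.gauss(cx, 0.5) for _ in range(8)]
           for cx in (0.0, 10.0) for _ in range(50)]
index = build_index(vectors, k=2)
print(search(vectors[0], vectors, index, top_k=3))
```

Raising `n_probe` scans more clusters, which costs time but misses fewer true neighbors; setting it to the number of clusters falls back to an exact brute-force search.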