First of all, many thanks, Jeremy and Rachel, for this great course!
About the problem we’re trying to solve: we have a big library of furniture images (high-quality renders, perspective view), and the goal is to find items similar to a given piece of furniture. But the definition of ‘similarity’ in our case is a bit different: we’d rather have similarity by style of texture or colour, and not so much by shape. For instance, finding an armchair whose fabric texture is similar to that of a given L-shaped sofa.
So far we have managed to get nice clusters by shape. For instance, a set of 570 bar stools gets grouped nicely by shape. The approach we used is to take the output of the last convolutional layer of VGG16 and apply PCA, then t-SNE. But if another type of furniture (such as tables) is added to the set, we get separate clusters for tables and for chairs, so the grouping is still driven by shape rather than by texture or colour.
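For reference, a minimal sketch of that PCA-then-t-SNE step, assuming scikit-learn. The feature array is a random stand-in for the VGG16 activations, and the sizes (512-dimensional features, 50 PCA components, perplexity 30) are illustrative assumptions, not the values we actually used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Random stand-in for VGG16 last-conv-layer activations,
# one flattened/pooled vector per image (sizes are assumptions).
features = rng.normal(size=(120, 512)).astype(np.float32)

# Step 1: PCA to reduce dimensionality before t-SNE.
pca = PCA(n_components=50, random_state=0)
reduced = pca.fit_transform(features)

# Step 2: t-SNE down to 2-D coordinates for plotting / eyeballing clusters.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
coords = tsne.fit_transform(reduced)

print(coords.shape)  # (120, 2)
```

Note that t-SNE only gives a layout of the images it was fitted on; it cannot place a new image afterwards.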
I was also thinking about using embeddings, but then the ‘vocabulary’ would be fixed to my furniture set. I’d rather have something I can run on an arbitrary image (say, a photo of a sofa) and get back similarly textured items from the library.
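To make the "run it on any image" requirement concrete, here is a rough sketch of the lookup I have in mind: compute a feature vector for the query image with the same network, then rank library items by cosine similarity. The vectors below are random stand-ins, and `top_k_similar` is a hypothetical helper, not part of any library. Which features would actually capture texture rather than shape is exactly the open question.

```python
import numpy as np

def top_k_similar(query, library, k=5):
    """Return indices and scores of the k library rows most cosine-similar to query."""
    q = query / np.linalg.norm(query)
    lib = library / np.linalg.norm(library, axis=1, keepdims=True)
    scores = lib @ q                      # cosine similarity per library item
    order = np.argsort(-scores)[:k]       # highest similarity first
    return order, scores[order]

rng = np.random.default_rng(0)

# Stand-in feature vectors for the library (sizes are assumptions).
library = rng.normal(size=(570, 512))

# Stand-in for a new query image: a slightly perturbed copy of item 42.
query = library[42] + 0.01 * rng.normal(size=512)

idx, scores = top_k_similar(query, library, k=5)
print(idx, scores)
```

Nothing here depends on the query being part of the library, so there is no fixed vocabulary.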
What approach would you use for such a task?
PS. This question is also part of my assignment for Lesson 3.