I am currently a student going through the fast.ai course (I did the first version last year).
I consult for a startup and they are looking to add some machine learning into their product matching algorithm.
The site is frootbat.com, an alcohol price comparison marketplace.
If anyone is interested in working on this problem (paid, of course), please reach out.
Here is an overview of the requirements:
We are a marketplace startup that ingests millions of ‘products’ (a product being a string of text, e.g. Grey Goose Vodka 750ml bottle) that we try to match against our master product data set. Currently we use full-text matching and rules to “clean up” text coming in from external sources, but this approach only gives us a match rate of around 30%.
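To make the current approach concrete, here is a minimal sketch of rule-based clean-up followed by text matching, using only the Python standard library. It is an illustration, not our production code: the `normalize` and `best_match` names, the specific clean-up rules, the 0.8 threshold and the sample product strings are all assumptions, and `difflib` stands in for a real full-text search engine.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Assumed rules: volume spellings and filler words to clean up.
_VOLUME = re.compile(r"(\d+(?:\.\d+)?)\s*(ml|cl|l|litre|liter)s?\b")
_NOISE = {"bottle", "btl"}


def normalize(name: str) -> str:
    """Lowercase, strip accents, express volumes in ml, drop filler words."""
    text = unicodedata.normalize("NFKD", name)
    text = text.encode("ascii", "ignore").decode().lower()

    def to_ml(m: re.Match) -> str:
        value, unit = float(m.group(1)), m.group(2)
        factor = {"ml": 1, "cl": 10}.get(unit, 1000)  # litres otherwise
        return f"{int(round(value * factor))}ml"

    text = _VOLUME.sub(to_ml, text)
    text = re.sub(r"[^a-z0-9 ]+", " ", text)  # punctuation to spaces
    return " ".join(t for t in text.split() if t not in _NOISE)


def best_match(raw: str, master: list[str], threshold: float = 0.8):
    """Return (master_name, score) for the closest master product, or None."""
    if not master:
        return None
    query = normalize(raw)
    scored = [
        (SequenceMatcher(None, query, normalize(m)).ratio(), m) for m in master
    ]
    score, name = max(scored)
    return (name, score) if score >= threshold else None


# Made-up master list for illustration only.
master_products = ["Grey Goose Vodka 750ml", "Absolut Vodka 1L"]
print(best_match("Grey Goose Vodka 75 cl Bottle", master_products))
```

Rules like these only cover the variations someone has thought to write down, which is part of why the match rate stalls. Comparing every incoming string against every master product, as this sketch does, also would not scale to millions of rows without some form of indexing.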
We are looking to use machine learning to improve this match rate while maintaining the quality of the matches. We hope the model will increase our match rate and that it will improve with each new match. Eventually we hope to achieve a match rate of 80-90%.
The current data set contains around 1 million matches (i.e. labelled data), which will form the basis of the training set, and we have around 2.5 million products that still need a match.
The ground truth is the name of the product. The price of the product, however, is collected from the external source and will be used in the model. For example, if a proposed match is made between an external product and a master product, but the price of the external product is a long way outside the normal range for the master product, then this is probably a sign that it is not a match.
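The price check described above could be sketched roughly as follows. This is only one possible reading of "normal range": it assumes we compare the external price against the median of prices already seen for the master product, with a hypothetical `tolerance` of 50%. The function name and the sample prices are made up for illustration.

```python
from statistics import median


def price_is_plausible(candidate_price: float, known_prices: list[float],
                       tolerance: float = 0.5) -> bool:
    """True if candidate_price is within +/- tolerance of the median known price.

    tolerance is a fraction of the median (0.5 means plus or minus 50%).
    """
    if not known_prices:
        return True  # no price history, so price alone cannot reject a match
    typical = median(known_prices)
    return abs(candidate_price - typical) <= tolerance * typical


# Made-up prices for one master product.
seen_prices = [29.99, 31.50, 34.00]
print(price_is_plausible(32.00, seen_prices))   # close to the usual price
print(price_is_plausible(120.00, seen_prices))  # far outside the usual range
```

In a learned model the price would more likely be fed in as a feature (for example the ratio of the external price to the typical master price) rather than applied as a hard filter, so that the model can weigh it against the name similarity.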
Product name and price are what we currently use to make matches, but we also collect a product image, which could potentially be used as well.
If there is any interest from the FastAI community we would love to have a chat. We would be happy to work on hourly rates or on a project basis.