First of all, I would like to say thank you very much to Jeremy and all the fastai contributors for the great resources that you have created and gathered here. I’ve been working on ML projects for some time now, and I want to open up for discussion with all of you an issue that I have frequently seen when starting a new project.
Recently I’ve been more involved in the early stages of ML projects (specifically, ML applications for industrial manufacturing processes), and I find many companies in the early stages of digitization with little or no data at all, so gathering domain expert knowledge and defining assumptions are the key to start acquiring the correct data and solving the problem. Here the problem is not finding the most relevant features in a vast dataset or processing them in various ways, but deciding what data is important to start measuring. I’ve resorted to DMAIC (Six Sigma) methodologies to help me gather that domain expert knowledge, and I put great effort into researching previous work on similar problems, but it is always hard to establish the viability of each project and the key features to measure. I feel like I might be missing something, some great resources dealing with this preliminary stage of applying ML, because it seems to me that it is often the main obstacle to applying ML in industry, since a project usually has to show a return on investment (ROI) before it gets approved.
Thank you very much in advance for your time and for any useful information you may share.