Ethics of selling a classifier trained on open data

machinethink · June 18, 2018, 9:27am

I think the question that needs answering here is: is a machine learning model trained on a particular dataset a derivative work of that dataset? If yes, then the rights holders of the dataset get to decide whether or not you can sell such a classifier.

And if such models are not considered derivative works, could a license on a dataset still legally prevent you from training machine learning models on that dataset?

I recently read an article saying that people are starting to test the legality of these kinds of things. It seems like the lawyers have discovered deep learning too.

(BTW, I’m currently selling a specific implementation of machine learning models (using Metal on iOS), which includes pre-trained weights trained on ImageNet. But in my case the thing I’m selling is not the trained weights but the code that is needed to run them.)