I find [this work](http://safepaths.mit.edu/) being done at MIT promising for contact tracing that protects privacy. The problem is that any new solution relying on network effects needs widespread adoption to be useful, which is tricky to achieve at the speed we currently need.
Another real-life example from my work: we are building models for the insurance world. When studying which features to put into our model, we had features available like veteran status, or whether somebody was married. These features had a high correlation with the target. We decided to remove them (and everything else we felt was unethical), even though this very likely harmed our model’s accuracy.
The interesting point about all these things is that it’s up to you, the person doing the analysis, and your team to do the right thing. Nobody is there watching you, and in many cases you could easily get away with it. So it’s often really a personal effort of self-discipline and the desire to do good, even though there are things you can put in place in your workplace that help systemically.
So I really appreciate us talking about this.
What are some of the problems introduced by automatic candidate ranking via online interviews, where you record an interview and then the video and audio get analyzed by ML models?
There was an interesting Allen Institute paper about using a GAN to debias image data. To train the model, it required labels for the exact characteristics it was trying to avoid biasing against. In general, is it legal/feasible/dangerous to use these labels for training a more ethical model?
The paper (Wang, Zhao, et al 2018/9): https://arxiv.org/pdf/1811.08489.pdf
Mostly reactive. Though some like Salesforce and Google do have dedicated Ethical AI teams that are trying to move towards proactive interventions.
Regarding contact tracing, I think there are a lot of fears that the tools and datasets built to track people’s locations will remain in use even after COVID-19 is over.
Do you think the diversity of the team building a tool/model can predict the efficacy of the product in terms of fairness?
I’m absolutely one of those people – look at the Patriot Act in the USA.
This Wisconsin Sentencing software is HORRIFYING.
Is that using traditional business rules, or an actual ML model?
Aren’t the AI biases transferred from real life biases?
(E.g., people of color being treated differently is an everyday phenomenon! Women too!)
Rachel’s recommended video: 21 fairness definitions and their politics by Arvind Narayanan.
I’m not even sure we know, since it’s proprietary… One step for ethics is openness, and it starts with software.
Interesting. While removing the info, did you think about proxies that may capture the same information? For example, in the US, zip code is a very good proxy for race. A lot of people remove race, but the model is still biased because of proxy variables, while the removal gives a false sense of security.
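One rough way to check for the proxy problem described above is to ask how well a remaining feature predicts the protected attribute you removed. The sketch below is only an illustration with made-up toy data: the function name `proxy_strength` and the values are hypothetical, and it assumes you still have the protected attribute available for auditing.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, protected_values):
    """Compare how well we can guess the protected attribute
    with and without looking at a candidate proxy feature.

    Returns (baseline_accuracy, accuracy_given_feature).
    """
    n = len(protected_values)
    # Baseline: always guess the most common protected value.
    baseline = Counter(protected_values).most_common(1)[0][1] / n

    # With the feature: guess the most common protected value
    # within each feature value (e.g. within each zip code).
    groups = defaultdict(Counter)
    for f, p in zip(feature_values, protected_values):
        groups[f][p] += 1
    correct = sum(c.most_common(1)[0][1] for c in groups.values())
    return baseline, correct / n


# Hypothetical toy data, not real zip codes or demographics.
zip_codes = ["A", "A", "A", "B", "B", "B", "C", "C"]
group = ["x", "x", "x", "y", "y", "y", "x", "y"]

baseline, with_zip = proxy_strength(zip_codes, group)
print(baseline, with_zip)
```

If the second number is much higher than the first, the feature carries a lot of information about the protected attribute, so dropping the protected column alone probably did not remove that information from the model's inputs.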
There is a lot of work on privacy-preserving contact tracing applications, to the point that with so many people working on the same problem, there are plenty of opportunities for comparing designs.
Interesting federations working on this:
- the TCN coalition (including the Zcash foundation),
- MIT’s SafePaths project,
- the Pan-European Privacy-Preserving Proximity Tracing organization, and Cory Doctorow’s writeup.
I disagree with the Wisconsin Supreme Court. This should be illegal. Use of AI models in criminal justice is deeply disturbing.
Are there good resources you’d suggest to dig further into technology use by government (with regards to ethics, compelling legislative/judicial bodies to do the right thing, etc)?
What if, by allowing them to track you through your phone, you are allowed to go back to normal life? Let’s say they allow you to gather with other people who have not been diagnosed or have been cured? I’ve heard about this idea, and although I can think of a hundred things that are wrong with this approach, part of me feels that this could be an interesting idea (if done correctly).
We tried, but to be honest it gets more and more difficult exactly for the reason you say. There might be things embedded in your features that you don’t fully understand.
However, you can spot certain things if you do feature importance after the modeling.
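As a rough sketch of what "feature importance after the modeling" can look like, here is permutation importance written from scratch: shuffle one feature at a time and see how much accuracy drops. This is a swapped-in, generic technique and not necessarily what was used in the work described above; the toy model, data, and names are all hypothetical.

```python
import random


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return the accuracy drop caused by shuffling each feature column.

    predict: function mapping one row (a tuple) to a predicted label.
    rows:    list of tuples, one tuple of feature values per example.
    """
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    base = accuracy(rows)
    drops = []
    for j in range(n_features):
        # Shuffle column j across examples, leaving other columns intact.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drops.append(base - accuracy(shuffled))
    return drops


# Hypothetical toy model: only looks at feature 0, ignores feature 1.
def toy_predict(row):
    return row[0] > 0.5


rows = [(i % 2, (i * 7) % 5) for i in range(20)]
labels = [toy_predict(r) for r in rows]

drops = permutation_importance(toy_predict, rows, labels, n_features=2)
print(drops)
```

A feature with a large drop is one the model leans on heavily. If a supposedly harmless feature turns out to be important, that is a hint to check whether it encodes something sensitive.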
This seems like an intensely deep topic, needing specialised expertise to avoid getting it wrong. If you were building an ML product, would you approach an academic institution for consultation on this? Do you see a data product development triad becoming maybe a quartet, involving an ethics or data privacy expert?
We may not know which biases a model might pick up. We have seen data on bias based on race and gender, but if a machine were biased toward, say, black eyes over brown eyes, we might not even know. Is there any way to solve these issues in general, rather than looking at biases one by one…