I want to share a few more references, based on questions that came up in class last time.
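Fair prediction with disparate impact: A study of bias in recidivism prediction instruments (Chouldechova, 2016): This is the paper Narayanan mentioned, which shows that two notions of fairness (test fairness and equal false positive/false negative rates across groups) are incompatible when the groups have different recidivism rates.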
This is a key chart illustrating the definition (described in the paper as “test fairness”) that COMPAS uses. Here, for each score (1-10) assigned, people are similarly likely to be arrested again, regardless of race. That is, someone with a score of 1 has only about a 20% rate of recidivism.
When the underlying distributions for different populations are different (e.g. Black people are more likely to be arrested), using the test fairness definition will result in different false positive rates for those populations. Here is the distribution/frequency with which white and Black people are assigned each score:
However, we shouldn’t see this incompatibility as meaning that all definitions of fairness are equally appropriate.
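To make the incompatibility concrete, here is a minimal sketch based on the relationship Chouldechova derives between the false positive rate (FPR), the prevalence p (base rate of recidivism), the positive predictive value (PPV), and the false negative rate (FNR): FPR = p/(1-p) × (1-PPV)/PPV × (1-FNR). The function name and all of the numbers below are made up for illustration; they are not taken from the paper or from the COMPAS data.

```python
def false_positive_rate(prevalence, ppv, fnr):
    """FPR implied by a group's prevalence, PPV, and FNR.

    Follows from the definitions: the true positives are (1 - FNR) * p,
    and PPV = TP / (TP + FP), so FP = TP * (1 - PPV) / PPV.
    Dividing FP by the share of true negatives (1 - p) gives the FPR.
    """
    odds = prevalence / (1 - prevalence)
    return odds * ((1 - ppv) / ppv) * (1 - fnr)


# Hypothetical values: both groups get the same PPV (test fairness holds)
# and the same FNR, but the groups have different base rates.
PPV = 0.6
FNR = 0.35

fpr_group_a = false_positive_rate(prevalence=0.5, ppv=PPV, fnr=FNR)
fpr_group_b = false_positive_rate(prevalence=0.4, ppv=PPV, fnr=FNR)

print(f"Group A false positive rate: {fpr_group_a:.3f}")
print(f"Group B false positive rate: {fpr_group_b:.3f}")
```

With the PPV and FNR held equal, the group with the higher base rate necessarily ends up with the higher false positive rate. The two rates only match if the base rates match.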
Two other papers that may be helpful:
Fairness Definitions Explained (Verma and Rubin, 2018): evaluates 20 different definitions of fairness on the same dataset/case study, which makes them easy to compare.
A Survey on Bias and Fairness in Machine Learning (Mehrabi et al., 2019): includes a table linking different problem types to relevant references: