Sharing a couple of links I ought to have remembered earlier, but brains are hard:
Diversity in AI is not your problem, it’s hers by Robert Munro (article): A really in-depth example of gender bias in a dataset that gets extremely specific as to which vectors the bias did and didn’t enter on.
Corpora as medium: on the work of curating a poetic textual dataset by Everest Pipkin (video, 23 minutes): Not precisely germane to the themes of the class, but I found this to be a really touching presentation about the importance of intimately selecting and engaging with your data instead of just grabbing whatever you can.
Also, I mentioned one example of an interesting compound bias to Rachel after class and she encouraged me to share it with everyone. The healthcare industry in the US is gradually shifting from fee-for-service (where insurance reimburses for each individual service rendered) to value-based care (where insurance pays a flat yearly amount for health systems to keep people healthy). There are endless complications on top of it, but the basic movement is a good thing: it aligns health systems' incentives with patients', since preventative care is both cheaper and better for the patient's health.
However, one-size-fits-all reimbursement can be a problem when some places just have sicker patients than others, which leads most payers to risk-adjust their payments. How do you adjust for risk? You use a risk score like HCC, which takes patient diagnoses as inputs. Now, in fee-for-service land, diagnoses are mostly only needed to justify procedures: you code a broken-arm diagnosis to justify the surgery. But there are a lot of diagnoses that don't often lead directly to procedures, and that a doctor has a lot of latitude in deciding whether to discretely code or not: depression, anxiety, things like that.
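To make the mechanics concrete, here's a minimal sketch of an HCC-style score as an additive model: a demographic base rate plus a weight for each condition category that the patient's coded diagnoses map into. The diagnosis-to-category mapping and the weights below are simplified stand-ins, not real CMS values (the real model has dozens of categories, hierarchies, and interaction terms):

```python
# Minimal sketch of an additive HCC-style risk score.
# The category mapping and weights are illustrative, not real CMS values.

# Hypothetical mapping from ICD-10 diagnosis codes to condition categories.
DX_TO_HCC = {
    "E11.9": "HCC19",  # type 2 diabetes without complications
    "F32.9": "HCC59",  # major depressive disorder
    "I50.9": "HCC85",  # heart failure
}

# Hypothetical per-category risk weights.
HCC_WEIGHTS = {"HCC19": 0.105, "HCC59": 0.309, "HCC85": 0.331}

def risk_score(demographic_base: float, diagnoses: list[str]) -> float:
    """Base rate plus one weight per distinct condition category that the
    patient's coded diagnoses map into. Discretely coding a diagnosis
    like depression directly raises the score, and hence the payment."""
    categories = {DX_TO_HCC[dx] for dx in diagnoses if dx in DX_TO_HCC}
    return demographic_base + sum(HCC_WEIGHTS[c] for c in categories)

# Same patient, with and without depression discretely coded:
print(round(risk_score(0.4, ["E11.9"]), 3))           # 0.505
print(round(risk_score(0.4, ["E11.9", "F32.9"]), 3))  # 0.814
```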
The key here is that you don't have a financial incentive to document depression in a fee-for-service system, but you do when you switch to value-based reimbursement, because it raises the patient's risk score. So when people study the "increased incidence of depression" in the US over time, they might actually just be studying the improved coding of depression as it becomes profitable for increasing segments of the country to write it down. This is both an observation bias and a historical bias, and one you can't control for cleanly, since different health systems are switching to value-based reimbursement at different times. It's a really stark example of how some things will *never* be detectable without ground-level knowledge of the data that's actually going in: algorithms alone will never save you.
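If it helps, here's a toy simulation of the confound: true depression prevalence is held flat at 10%, but the share of providers under value-based contracts (and so with an incentive to code depression) grows each year. All of these numbers are invented for illustration:

```python
import random

random.seed(0)

TRUE_PREVALENCE = 0.10    # held constant: no real change in depression
PATIENTS_PER_YEAR = 100_000

# Hypothetical share of providers on value-based contracts, by year.
VALUE_BASED_SHARE = {2015: 0.10, 2017: 0.30, 2019: 0.50, 2021: 0.70}

# Assumed probability that a depressed patient's diagnosis actually
# gets discretely coded, by contract type.
P_CODE_FFS = 0.30         # fee-for-service: little incentive to code it
P_CODE_VBC = 0.85         # value-based: coding it raises the risk score

for year, vbc_share in VALUE_BASED_SHARE.items():
    coded = 0
    for _ in range(PATIENTS_PER_YEAR):
        if random.random() < TRUE_PREVALENCE:  # patient is truly depressed
            on_vbc = random.random() < vbc_share
            p_code = P_CODE_VBC if on_vbc else P_CODE_FFS
            if random.random() < p_code:
                coded += 1
    print(f"{year}: observed prevalence {coded / PATIENTS_PER_YEAR:.1%}")

# Observed prevalence climbs from roughly 3.6% to roughly 6.9% while the
# true rate never moves: the dataset is measuring coding incentives.
```

The "rise" comes entirely from the documentation rate, which is exactly what you'd mistake for an epidemiological trend if you didn't know where the diagnoses came from.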