IBM open-sourced a framework to detect and mitigate bias in machine learning: https://github.com/IBM/AIF360. However, I don’t completely agree with their approaches to mitigating bias: they seem to require a lot of manual work, and some of them work against the very idea of machine learning.
I would expect that carefully choosing the right optimization target (loss function) would help, since it lets you penalize undesired biases alongside your actual objective. If you want a gender-neutral model, you can make that part of the objective and let the optimizer do its job. In the Amazon hiring case, one could aim for a model that is good at predicting the hiring decision while at the same time being bad at predicting the gender, to prevent the model from learning gender-specific features (see the sketch below). I haven’t faced this problem myself, but I would be curious whether somebody has tried such an approach.
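To make this concrete, here is a minimal sketch of the idea in PyTorch, using a gradient-reversal layer as in domain-adversarial training; the closely related technique is usually called adversarial debiasing. All names, shapes, and hyperparameters (`lambd`, the hidden size, the toy data) are made up for illustration, not taken from any real system:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The reversed gradient pushes the encoder to destroy gender signal.
        return -ctx.lambd * grad_output, None

class DebiasedClassifier(nn.Module):
    def __init__(self, n_features, hidden=32, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.hire_head = nn.Linear(hidden, 1)    # main task: hiring decision
        self.gender_head = nn.Linear(hidden, 1)  # adversary: gender

    def forward(self, x):
        h = self.encoder(x)
        hire_logit = self.hire_head(h)
        # The adversary sees the representation through the reversal layer:
        # it still learns to predict gender as well as it can, but the
        # encoder receives the opposite gradient and unlearns those features.
        gender_logit = self.gender_head(GradReverse.apply(h, self.lambd))
        return hire_logit, gender_logit

# Toy training loop on random data; shapes and targets are illustrative only.
model = DebiasedClassifier(n_features=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(256, 10)
y_hire = torch.randint(0, 2, (256, 1)).float()
y_gender = torch.randint(0, 2, (256, 1)).float()

for step in range(100):
    opt.zero_grad()
    hire_logit, gender_logit = model(x)
    # One combined loss: good at hiring, bad (for the encoder) at gender.
    loss = bce(hire_logit, y_hire) + bce(gender_logit, y_gender)
    loss.backward()
    opt.step()
```

The appeal of this setup is exactly the point above: the fairness constraint becomes part of the optimization target, and the optimizer does the work instead of a manual feature audit.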