That breast article is a great example of dealing with high-res images. They had the opportunity to work with a huge dataset to support their good results. Li Shen also published a patch-based approach in 2017 (https://arxiv.org/pdf/1708.09427.pdf) on a smaller dataset from the DREAM Mammography Competition.
I think this approach is interesting when the patch labels are relatively balanced. But that is not quite the situation with mammographic data, which is highly unbalanced (pathology vs. no pathology) at the pixel level.
But my own “2019” preference and intuition for how to better solve this problem with spatially sparse pathology is to use a one-stage detector with a feature pyramid network and focal loss (that is basically the definition of RetinaNet). The main issue is the labeling cost of detection (bounding boxes) compared to classification. I don't think it will be that big an issue in the long term.
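To make the focal loss part concrete, here is a minimal sketch of the binary focal loss for a single prediction, in plain Python. The function names are mine, and alpha=0.25 / gamma=2.0 are the defaults reported in the RetinaNet paper, not values I have tuned on mammograms. The idea is that easy, well-classified background locations get scaled down by (1 - p_t)^gamma, so the rare pathology locations are not drowned out.

```python
import math

def cross_entropy(p, y, eps=1e-7):
    """Plain binary cross-entropy for one prediction.
    p: predicted probability of the positive class, y: 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)       # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p        # probability given to the true class
    return -math.log(p_t)

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).
    alpha and gamma defaults are the ones reported in the RetinaNet paper
    (an assumption here, not tuned for any particular dataset)."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy background location versus a missed lesion (illustrative values).
easy_negative = (0.01, 0)   # confidently and correctly predicted "no pathology"
hard_positive = (0.10, 1)   # pathology present, but the model says 10%

for name, (p, y) in [("easy negative", easy_negative),
                     ("hard positive", hard_positive)]:
    print(name, "CE:", cross_entropy(p, y), "FL:", focal_loss(p, y))
```

With gamma=0 and alpha=0.5 this reduces to half the usual cross-entropy. Raising gamma is what pushes the loss of the many easy negatives toward zero while leaving the hard examples mostly intact.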
Now time for a bold statement. To be honest, I think the RetinaNet architecture (2D, multi-channel 2D->3D, pure 3D), with proper labels, solves the vast majority of the diagnostic problems in radiology.