Being a radiologist, I mostly work on medical images. There are tons of interesting challenges in working with them, but the potential impact is huge. We are still in the very early days of applying deep learning to medical imaging.
Broadly, high resolution, single-channel images, noise, weak labeling, and/or a low number of samples are the most frequent challenges when applying deep learning to medical imaging. What is really exciting for algorithmic DL research in medical imaging is that each problem usually has a different optimal solution.
To answer your specific question about resizing, it depends on the context. If the computer vision problem you are trying to solve doesn’t really need high resolution, then it should be fine to resize. But if you are trying to solve a problem that usually needs high resolution (e.g. identifying microcalcifications on digital mammography), then resizing can completely break the potential performance. Involvement from an interested domain expert (a radiologist) can usually provide a hint toward a useful direction.
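To make the resizing risk concrete, here is a toy numpy sketch (all sizes, values, and the `block_average` helper are made up for illustration): a bright speck only a few pixels wide, standing in for a microcalcification, nearly disappears after aggressive downsampling by block averaging.

```python
import numpy as np

def block_average(img, factor):
    """Downsample a square image by averaging non-overlapping factor x factor blocks."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# 1024x1024 "mammogram": dark background with a tiny 2x2 bright speck
full_res = np.zeros((1024, 1024))
full_res[500:502, 500:502] = 1.0          # the "microcalcification"

low_res = block_average(full_res, 16)      # 64x64 after 16x downsampling

print(full_res.max())   # 1.0   -> easily separable from background
print(low_res.max())    # 4/256 = 0.015625 -> nearly diluted away
```

The 4 bright pixels get averaged over a 256-pixel block, so their contrast drops by almost two orders of magnitude; a real interpolating resize behaves similarly in spirit.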
I totally agree with @jeremy. And, most of the time, if a human expert can’t classify well at low resolution but your model can, suspect an unrelated bias.
Here is a well-known example, with a great machine learning methodology, of training at 224x224 for chest x-rays where a potential unrelated bias in the dataset helped make the predictions: https://arxiv.org/pdf/1711.05225.pdf
Of course, I implicitly meant that if I tried to interpret chest x-rays on 224x224 images, I would spend more time in a lawyer’s office than in my own.
That breast article is a great example of dealing with high-res images. They had the opportunity to work with a huge dataset to support their good results. Li Shen also published a patch-based approach in 2017 (https://arxiv.org/pdf/1708.09427.pdf) on a smaller dataset from the Digital Mammography DREAM Challenge.
When patch labels are relatively balanced, I think this approach is interesting. But that is not quite the situation with mammographic data, which is highly unbalanced (pathology vs. no pathology) at the pixel level.
But my own “2019” preference and intuition for how to better solve this problem of sparse spatial pathology is to use a one-stage detector with a feature pyramid network and focal loss (which is basically the definition of RetinaNet). The main issue is the labeling cost of detection (bounding boxes) compared to classification. I don’t think it will be that big an issue in the long term.
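For reference, the focal loss from the RetinaNet paper (Lin et al., “Focal Loss for Dense Object Detection”) can be sketched in a few lines of numpy; the alpha/gamma defaults follow the paper, while the example probabilities are made up:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).
    p: predicted probability of class 1, y: 0/1 label."""
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# An easy, well-classified background anchor barely contributes...
easy = focal_loss(np.array([0.01]), np.array([0]))   # p_t = 0.99
# ...while a hard, misclassified pathology anchor dominates the loss
hard = focal_loss(np.array([0.10]), np.array([1]))   # p_t = 0.10
print(easy, hard)
```

The (1 - p_t)^gamma factor is what lets a dense detector survive the extreme foreground/background imbalance that sparse pathology creates.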
Now, time for a bold statement. To be honest, I think the RetinaNet architecture (2D, multi-channel 2D->3D, pure 3D), with proper labels, solves the vast majority of diagnostic problems in radiology.
Yes, I remember your love of centroids! I understand the intention to speed up labeling with centroids. But, for a radiologist, drawing a bounding box or a centroid takes almost the same time if the problem accepts some uncertainty in boundary precision.
Centroids make sense for pulmonary micronodules (chest x-ray or CT) since they are naturally spheric in morphology. But many pathological processes are not spheric at all. Labeling a centroid on a fracture, a bowel obstruction, or a complex brain bleed doesn’t make much sense spatially.
In my opinion, defining the boundaries of a sparse pathological process definitely adds to the semantic labeling content used by a DL architecture (RetinaNet) designed to learn from that information.
You remember the whale Kaggle competition with its extremely unbalanced dataset… Metric learning (Siamese networks et al.) was the solution of choice… Do you think metric learning could be a better choice for a patch-based approach like the breast article’s? Instead of using a CNN classifier, why don’t we use Siamese/triplet networks, or even some form of metric learning that lies between one-shot learning and classification, to identify the patches? Like few-shot learning with prototypical networks…
I was extremely excited by the triplet networks and protonets that I learned and used in that competition… I can imagine a lot of use cases with unbalanced classes, especially in medical imaging…
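For readers unfamiliar with the triplet setup mentioned above, here is a minimal numpy sketch of the triplet margin loss; the embeddings and margin are purely illustrative:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull the anchor toward a same-class embedding, push it from another class:
    L = max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])       # anchor patch embedding
p = np.array([0.1, 0.0])       # same-class patch, close to the anchor
n = np.array([3.0, 0.0])       # other-class patch, far away

print(triplet_loss(a, p, n))         # 0.0: margin already satisfied

n_hard = np.array([0.5, 0.0])        # a "hard negative" too close to the anchor
print(triplet_loss(a, p, n_hard))    # positive loss pushes it further away
```

Training on such triplets shapes the embedding space so that rare pathology patches cluster together, which is exactly why the approach suits unbalanced data.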
Yes, Haider, metric learning is definitely a valid hypothesis to improve classification specificity on patches.
For the whale competition, metric learning probably worked very well because there was a low number of labels per whale and a very high number of classes. And each whale was defined by exactly the same visual appearance, just under different projections and lighting conditions. It is a similar problem to face recognition.
With pathological entities in medical imaging, beyond different affine transformations or “lighting” conditions, the visual appearance is not exactly the same even when the diagnosis (class) is the same. And you don’t necessarily have thousands of different classes for a specific modality. So, in my opinion, metric learning can still work and improve over a more classic patch classification model, but probably not as well as in the whale competition.
Perhaps ensembling both the metric and classic classification approaches with a meta-learner would be best… The two approaches are different, and their misclassifications are most likely weakly correlated… Of course, this is only my speculation… Jeremy taught us that to get an answer for whether blah works for blah, the best answer is to try blah.
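To show what such a stacked ensemble could look like, here is a purely speculative numpy sketch: synthetic probabilities stand in for a “classifier” model and a “metric” model, and a tiny hand-rolled logistic regression plays the meta-learner. Every name and number here is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=500)              # synthetic ground-truth labels

# Fake, weakly correlated base-model probabilities: each is noisy around y
p_classifier = np.clip(y + rng.normal(0, 0.4, 500), 0, 1)
p_metric     = np.clip(y + rng.normal(0, 0.4, 500), 0, 1)
X = np.stack([p_classifier, p_metric], axis=1)

# Logistic-regression meta-learner trained with plain gradient descent
w, b = np.zeros(2), 0.0
for _ in range(2000):
    pred = 1 / (1 + np.exp(-(X @ w + b)))
    grad = pred - y                            # dLoss/dz for logistic loss
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

acc_meta = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
acc_base = ((p_classifier > 0.5) == y).mean()
print(acc_base, acc_meta)
```

Because the two noise sources are independent here, the stacked model tends to beat either base model alone; with real, correlated models the gain would of course be smaller.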
There are countless times my intuition says something should work well and then, when I try it, I find it doesn’t…
Eager to try this on the next medical imaging competition