MRNet: Stanford Knee MRI Dataset And Competition

Thank you for sharing the domain knowledge.

How is a “sequence” different from an image?

Which, if any, of the data augmentation techniques discussed by Jeremy do you expect would be valid for this use case, and which would be inappropriate?

  1. In MR lingo, a pulse sequence is how one acquires the images. T1-, T2- or PD-weighted describes the timing of the pulses in the sequence and the effect on the signal at the time of image data acquisition. The data is then reconstructed from the time-frequency domain (K-space) into the image domain via some variant of an inverse fast Fourier transform.

TL;DR A “sequence” often simply describes the manner of acquiring a single stack of images.
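To make that reconstruction step concrete, here’s a toy NumPy sketch of going from (simulated) k-space back to the image domain with an inverse 2D FFT. This is purely illustrative - the MRNet data already comes reconstructed as image volumes, so you won’t need to do this yourself.

import numpy as np

# Toy example only: simulate k-space from a stand-in slice, then invert it.
image = np.random.rand(256, 256)                         # stand-in for one slice
kspace = np.fft.fftshift(np.fft.fft2(image))             # simulated k-space data
recon = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))   # back to the image domain
print(np.allclose(recon, image))                         # True: the round trip recovers the slice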

  1. Typical data augmentation techniques appear to be as effective for medical image data as they are for photographs. However, I’ve never applied perspective warping to medical image data and would love to test it to see how well it works; a rough sketch of a starting augmentation set is below.
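As a starting point, here’s roughly what that augmentation set could look like using fastai v1’s get_transforms. The specific parameter values here are guesses to experiment with, not recommendations from anyone with domain expertise.

from fastai.vision import get_transforms

# A conservative starting point for grayscale MRI slices: small rotations,
# mild zoom, a little warping. Vertical flips are probably not anatomically
# sensible, so they are left off. All values are guesses to tune.
tfms = get_transforms(
    do_flip=True,        # horizontal flips (left/right symmetry is debatable; test it)
    flip_vert=False,
    max_rotate=10.0,
    max_zoom=1.1,
    max_lighting=0.2,
    max_warp=0.1,        # the perspective warping mentioned above; worth validating
    p_affine=0.75,
    p_lighting=0.5,
)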

Abnormal fluid is often associated with pathology. There is normally some fluid in the knee joint capsule, but too much fluid often indicates a problem.

This suggests that we may need a way of detecting the extent of fluid relative to the extent of the capsule, rather than its mere presence. Is this correct? Are combinations of convolution filters, as in ResNet, adequate to that task? If not, are there known effective approaches to the problem?

If it’s easier to point me to papers, feel free :slight_smile:

In our paper on DeepSPINE, we were looking at a fluid-filled sac (the thecal or dural sac, in which the spinal cord and spinal nerves live) and more or less trying to train the network to determine its relative size, which is (with some hand-waving) how stenosis grading works. So I suspect special techniques won’t be necessary, and that the degree of joint fluid will be convolved into some feature(s) the network evaluates.

Note: you won’t see my name on the DeepSPINE paper, because I joined the project after the technical manuscript had already been submitted, so I use “we” a little liberally when I refer to this work.

Interesting. Thank you very much for the education.

One of the first things we’ll want to do is some data exploration. Might you have Jupyter notebooks you could share with the team that we could use to accelerate our exploration of the MRNet data? I expect many of the tasks you and your coauthors did would be useful for getting oriented with the data as well.

What is the format of the image files? I haven’t requested access to the data yet.

File extension is .npy


So, NumPy arrays I guess. Haven’t examined them yet…
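For anyone who hasn’t pulled the data yet, loading a case should be as simple as np.load. The path below is just a guess at the extracted layout, so adjust it to wherever your copy lives.

import numpy as np

# Hypothetical path - point this at your local copy of the extracted dataset.
vol = np.load('MRNet-v1.0/train/sagittal/0000.npy')
print(vol.shape, vol.dtype)  # expect something like (num_slices, 256, 256)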

I am also interested and can contribute domain expertise as a musculoskeletal radiologist.


I provide domain knowledge, as my “expertise” is in Neuroradiology; @agentili provides domain expertise. As such, feel free to correct me if I’ve gone astray… or simply add your preferences, if you find certain sequences to be more useful for certain tasks.

NumPy arrays, I guess.

We usually work with *.dcm files (DICOM format) and use a DICOM viewer - such as OsiriX or RadiAnt - for data exploration.
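Since the MRNet files are plain NumPy arrays rather than DICOM, the usual viewer workflow won’t apply directly, but for reference, reading a .dcm in Python is also straightforward with pydicom. A minimal sketch, assuming pydicom is installed and you have some DICOM file on hand (the path is a placeholder):

import pydicom

# 'example.dcm' is a placeholder path to any DICOM file you have locally.
ds = pydicom.dcmread('example.dcm')
arr = ds.pixel_array                 # image data as a NumPy array
print(arr.shape, ds.Modality)        # e.g. (512, 512) MR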

I can try to put a Jupyter nb together over the next day or so, when I get a little free time.


Perhaps just extracting a to-do list from your DICOM workflow would be useful to us, so we know from your domain expertise the critical things to check.

Here’s hoping that the fresh, if naive, eyes of those of us without domain expertise prove useful as well…

That’s great to know - was this on Windows? Pranav did email back but just recommended trying on a non-Windows machine, so maybe I’ll try it on FloydHub.

Great to see the activity here! I spent a few more hours this afternoon researching and wanted to throw out a few more ideas and concepts, and hopefully we can get going after that.

Possible additional ideas re: architecture:
1 - I saw two papers using super resolution on MRI images in order to get better classification results (2x and 4x resolution enhancements). I believe we will have an updated FastAI super-resolution portion in one of the next classes here, so that might be an interesting pre-processing step to increase our accuracy. (@nswitanek - since you have the images, are they clear, or would super resolution be worth investigating?)

2 - There really is not much in terms of knee MRI and deep learning (vs. tons for brain MRI and deep learning). I mostly found Stanford’s paper from their project entry and one from earlier using a sort of U-Net. I did want to show one image from that paper so we have some initial examples of what we are trying to find:


(a) cartilage softening on the lateral tibial plateau, (b) cartilage fissuring on the medial femoral condyle, (c) focal cartilage defect on the medial femoral condyle, and (d) diffuse cartilage thinning on the lateral femoral condyle that were correctly identified by the cartilage lesion detection system (arrow).

and their architecture (it scored well, surprisingly based on VGG) - this was just for cartilage though, not ACL or general abnormality:


and link:

3 - On the good-news front - one paper tested a CNN pretrained on ImageNet and then fine-tuned on a small set of MRI images vs. a CNN trained from scratch on MRI only. They found the ImageNet-pretrained one outperformed the dedicated MRI-trained one, which is great for us since we’ve all used transfer learning (a rough sketch of this kind of baseline follows after this list).

4 - Segmentation first, or direct classification? Several papers used segmentation and then classification… not sure what is better here.
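For item 3 above, here’s a minimal sketch of what that transfer-learning baseline could look like with a torchvision ResNet, treating each slice (or a 3-slice stack) as a standard 3-channel image. Everything here is an assumption about our setup, not something taken from the papers.

import torch
import torch.nn as nn
from torchvision import models

# Start from ImageNet weights and swap the final layer for a binary output
# (e.g. abnormal vs. normal). Just a starting guess at the setup.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

# A fake batch of 4 "images" (3 channels, 224x224) just to check the shapes.
x = torch.randn(4, 3, 224, 224)
print(model(x).shape)  # torch.Size([4, 2])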

Depending on architecture selection, we could end up with multiple teams/projects, because in the most specialized version we would have:
Super resolution of imagery-> Data augmentation(?) -> Segmentation -> Classification

Or we just have multiple classification systems leveraging the same images but using different priorities, à la @neuradai’s excellent proposal:

I believe in tomorrow’s class or next week’s we’ll be building ResNet from scratch with the latest/greatest FastAI 1.2, so that might give us a first chance to make a small dataset and try out the 3-channel model pretty quickly (a rough sketch of one way to assemble 3-channel inputs is below).
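One cheap way to get a 3-channel input per exam - purely a guess at how we might try the idea, not a settled design - is to take the middle slice from each of the three series and stack them as channels:

import numpy as np

def three_channel_input(axial, coronal, sagittal):
    """Stack the central slice of each series into a (3, H, W) array.
    Assumes all three series share the same in-plane shape (resize first if not)."""
    slices = [vol[vol.shape[0] // 2] for vol in (axial, coronal, sagittal)]
    return np.stack(slices, axis=0)

# Hypothetical usage with three loaded .npy volumes for one exam:
# x = three_channel_input(np.load(ax_path), np.load(cor_path), np.load(sag_path))
# x.shape -> (3, 256, 256)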

btw - these boards keep blocking me from replying more than 2 or 3 times, but did want to say thanks a bunch to @neuradai for the domain knowledge posts (very helpful!) and also thanks to @melonkernel for posting about this competition and starting this project, and @rsrivastava and @tcapelle for joining in…and very excited we have a radiologist interested in helping - thanks to @agentili !

I really hope we can beat out the Stanford team and make some waves for FastAI in doing so :slight_smile:

Oh, and since I had to double-check some of these medical terms, here’s the layout of what sagittal (or as I would term it, side view), coronal (front view), and axial or transverse (top down) are:
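In NumPy terms, if you had a single 3D volume indexed as (slice, row, column), the three planes just correspond to slicing along different axes. Note that MRNet actually ships a separately acquired stack per plane, and which axis maps to which plane depends on how a volume was acquired, so this is only to make the terminology concrete (assuming an axially acquired volume):

import numpy as np

vol = np.random.rand(32, 256, 256)   # toy (depth, height, width) volume

axial_slice    = vol[16, :, :]       # top-down view (the acquired slices)
coronal_slice  = vol[:, 128, :]      # front view, resliced from the stack
sagittal_slice = vol[:, :, 128]      # side view, resliced from the stack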


Here is a paper I found on knee analysis using a CNN: https://arxiv.org/pdf/1703.09856.pdf. But this paper uses X-ray rather than MRI.

So we have a few questions.

  1. How do we handle MRI images?
  2. How does knee MRI differ from brain MRI?
  3. Do we need to use segmentation to determine the boundaries of the various tissues in the knee data?
  4. How do we handle 3D MRI data?

Really nice. Hope you get the guidance you need.

To get to know knee MRI, this picture is very helpful.


I just found this paper from January on LiSHT, a possibly better activation function than ReLU, Swish, etc. It tested better on a number of datasets, and they used ResNet in their experiments, so I’ll do some more testing with it in our class notebooks; this might give us another edge here.
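For reference, my reading of the paper is that LiSHT is just x · tanh(x), so a drop-in PyTorch module is tiny (worth double-checking against the authors’ code):

import torch
import torch.nn as nn

class LiSHT(nn.Module):
    """Linearly Scaled Hyperbolic Tangent: f(x) = x * tanh(x)."""
    def forward(self, x):
        return x * torch.tanh(x)

# Quick sanity check: output is symmetric and non-negative.
act = LiSHT()
print(act(torch.tensor([-2.0, 0.0, 2.0])))  # tensor([1.9281, 0.0000, 1.9281])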

Got a chance to download and briefly review the data this morning. Here’s a screenshot from the EDA notebook I’m putting together, along with the code used to generate the plot below.

from ipywidgets import interactive
from IPython.display import display
import matplotlib.pyplot as plt

plt.style.use('grayscale')

class KneePlot():
    def __init__(self, x, figsize=(10, 10)):
        # x is a 3D NumPy array of shape (num_slices, height, width)
        self.x = x
        self.slice_range = (0, self.x.shape[0] - 1)
        self.resize(figsize)

    def _plot_slice(self, im_slice):
        # Draw a single slice of the volume
        fig, ax = plt.subplots(1, 1, figsize=self.figsize)
        ax.imshow(self.x[im_slice, :, :])
        plt.show()

    def resize(self, figsize):
        # Rebuild the slider widget so the figure renders at the new size
        self.figsize = figsize
        self.interactive_plot = interactive(self._plot_slice, im_slice=self.slice_range)
        self.output = self.interactive_plot.children[-1]
        self.output.layout.height = '{}px'.format(60 * self.figsize[1])

    def show(self):
        display(self.interactive_plot)

With this code, you can generate the interactive plot I have shown, so you can scroll through the images and get a sense of what we’re looking at.
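For anyone following along before the full notebook is up, usage looks roughly like this (the path is a guess at the extracted layout; point it at any case file you downloaded):

import numpy as np

# Hypothetical path - adjust to one of your local .npy case files.
case = np.load('MRNet-v1.0/train/sagittal/0000.npy')
plot = KneePlot(case, figsize=(8, 8))
plot.show()  # renders a slider you can drag to scroll through the slices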

When I have a little more time, I’ll send a link to the full EDA nb.
