MRNet: Stanford Knee MRI Dataset And Competition

Stanford has released a new dataset and competition on knee MRI scans.
Sounds quite interesting.

The MRNet dataset consists of 1,370 knee MRI exams performed at Stanford University Medical Center. The dataset contains 1,104 (80.6%) abnormal exams, with 319 (23.3%) ACL tears and 508 (37.1%) meniscal tears; labels were obtained through manual extraction from clinical reports. The dataset accompanies the publication of the MRNet work here.
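
A quick sanity check on the quoted numbers, which also makes the class imbalance easy to see. This is only arithmetic on the counts given above; the dictionary keys are names made up for illustration.

```python
# Counts quoted in the dataset description above.
total_exams = 1370
counts = {"abnormal": 1104, "acl_tear": 319, "meniscal_tear": 508}

# Recompute each share of the 1,370 exams, rounded to one decimal place.
percentages = {name: round(100 * n / total_exams, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} of {total_exams} exams ({pct}%)")
```

Since roughly four out of five exams are abnormal while fewer than one in four have an ACL tear, the three tasks have quite different positive rates.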


awesome, thanks for posting about this!

I’ve signed up and got the link to the dataset. Is anyone interested in forming a working group to compete in it?
I’d really like to throw all of the part 2 techniques and framework at this set and see if we can beat Andrew Ng’s group :slight_smile: That would be great bragging rights lol.
I recently used FastAI on a cellular histopathology dataset and was able to beat the current paper’s accuracy, but this dataset is clearly much deeper given the 3D-ish nature of the scans. Their model almost looks like half of a 3-layer UNet, as they shared params.
Anyway, I’m interested but would definitely want to collaborate on it. They were weaker on meniscus tears, and while it looks like they used the same model to predict all 3 issues, I wonder if having three independent models, one for each type, would allow higher accuracy.
btw, here’s the link to their overview page:


@LessW2020 . I am interested in joining you in this effort.

Fantastic! Let’s see if we can get a couple other people and then let’s get started.
To get the data set locally, you’ll need to fill out their form and then they’ll email you the link. It’s about 6GB.
I have the link already and will download and browse through it a bit to better understand the basics of the images, how they label and score it, etc tomorrow.
We can start a new thread (or use this one if melonkernel is ok with it).
I can set up a GitHub repo for it.
Let me know if you have some initial ideas on architecture for it after viewing some of the datasets. I’m going to search on arxiv and just see if I can find some papers on similar types of classification and see what they found to work.
Anyway, should be fun and look forward to working with you!

I browsed through arXiv looking for ideas related to this. After going through a lot of different things, here’s the one I think is the clear winner (just published; it was used for MRI brain lesion segmentation and achieved top scores). It’s basically multiple ResNet50s, pretrained on ImageNet (sounds familiar…), but they then downsample and upsample using all the available images to create a more holistic view for the segmentation. We’d then need to add a classifier on top of the output, but they solve (imo) the problem of how to take a series of images without losing the connecting info for the bigger picture.

Multi-branch Convolutional Neural Network for Multiple Sclerosis Lesion Segmentation

(Submitted on 7 Nov 2018 (v1), last revised 8 Apr 2019 (this version, v4))

In this paper, we present an automated approach for segmenting multiple sclerosis (MS) lesions from multi-modal brain magnetic resonance images. Our method is based on a deep end-to-end 2D convolutional neural network (CNN) for slice-based segmentation of 3D volumetric data. The proposed CNN includes a multi-branch downsampling path, which enables the network to encode information from multiple modalities separately. Multi-scale feature fusion blocks are proposed to combine feature maps from different modalities at different stages of the network. Then, multi-scale feature upsampling blocks are introduced to upsize combined feature maps to leverage information from lesion shape and location. We trained and tested the proposed model using orthogonal plane orientations of each 3D modality to exploit the contextual information in all directions. The proposed pipeline is evaluated on two different datasets: a private dataset including 37 MS patients and a publicly available dataset known as the ISBI 2015 longitudinal MS lesion segmentation challenge dataset, consisting of 14 MS patients. Considering the ISBI challenge, at the time of submission, our method was amongst the top performing solutions. On the private dataset, using the same array of performance metrics as in the ISBI challenge, the proposed approach shows high improvements in MS lesion segmentation compared with other publicly available tools.

Other ideas:

1 - ANT-GAN (abnormal-to-normal transform GAN) - this is an older paper (2015) that I guess was just released to arXiv last week or so. Anyway, the concept is pretty neat. Basically they build a GAN that is designed to take MRI images and construct a ‘healthy’ equivalent. In some cases it’s rebuilding an unhealthy MRI into a healthy one, in others just replicating a healthy one.
The created image is then masked onto the actual MRI, so any relevant lesions or issues are forced to the front. They used a VGG, but we could use a ResNet classifier to then evaluate the relevant difference for a classification.

2 - This paper on randomly wired networks basically uses a network generator to try to generate novel networks. Their argument is that most NAS approaches still use hand-crafted limitations, whereas this is truly random and then pruned…they are competitive with ResNet and NAS with fewer computations. The negatives are that this is a pretty long way around to get to a classification CNN, and there is no way to connect the series innately.

The authors of the Multi-branch CNN above state they will release the code to anyone who emails them, so I think that gives us the best start, and I like that it is already using ResNet50, which we have worked with a lot in FastAI…

Thoughts, ideas?


Here’s the Stanford architecture:

and here’s the multi-branch architecture…but I just noted that each of these inputs is one of 3 different types of scanning, each of which has coronal/sagittal/axial images…so we would actually only need one layer/row, rather than 3, of the top section here:

and related, a recent paper on how to augment bio images realistically and thus have more training data…it boosted the scores of their UNet:


I would also be interested

I can also help

Our group used a ResNeXt-50 architecture for stenosis grading on MRI lumbar spine studies. Stenosis grading amounts to a classification task where you’re looking to see how narrow certain parts of the spine are. The segmentation bit with the U-net was just to localize the intervertebral discs and generate nicer stacks of images for the classification task.


Seems worthwhile to request their code. Have you emailed them already?

Thanks for getting us organized.
Have you set up the github repo? We can use that to start coordinating around steps to implement and paths to try.

Yes, I emailed them yesterday - no reply yet but will update when I do hear (had “here” lol) back.

Setting up the GitHub repo today - wasn’t sure what we should name it?

How about:


Sounds good - straightforward and to the point.

GitHub repo made!

I’ll update the readme in a bit with a link to the Stanford page, etc.

Related but in case you are not aware - there’s a hidden bonus if we do well on this challenge…bragging rights.
Specifically - do you recognize this guy?

Andrew Ng is on the Stanford team (and they have the one and only entry so far)…but if we can place above them, that would be pretty cool to say we outranked one of the best known people in AI :slight_smile:

One item though - I got the link to their dataset (they email it to you), and after downloading it twice now, every time it says the zip file is invalid. It’s about 6GB in size. I’ve emailed them about it, but can anyone else post whether they are able to download and open it? (I would post the link, but you have to agree to their research/non-commercial use terms to get it.)

Wow, that’s fantastic! I actually saw your paper while looking for MRI-related papers on arXiv for this project.
Hope you will help out here as that would be great to have your experience!

I’m happy to help. First, a little domain knowledge…

Here’s the description of the data (emphasis added):

Examinations were performed with GE scanners (GE Discovery, GE Healthcare, Waukesha, WI) with standard knee MRI coil and a routine non-contrast knee MRI protocol that included the following sequences: coronal T1 weighted, coronal T2 with fat saturation, sagittal proton density (PD) weighted, sagittal T2 with fat saturation, and axial PD weighted with fat saturation. A total of 775 (56.6%) examinations used a 3.0-T magnetic field; the remaining used a 1.5-T magnetic field.

And here are the tasks:

The leaderboard reports the average AUC of the abnormality detection, ACL tear, and Meniscal tear tasks.
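
To make that metric concrete, here is a minimal sketch of how an "average AUC over three tasks" could be computed locally. It is not the competition's scoring code; the function names and the task keys are hypothetical, and AUC is computed with the plain pairwise definition (the fraction of positive/negative pairs ranked correctly, with ties counting half).

```python
from typing import Dict, Sequence, Tuple


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


def mean_auc(
    task_results: Dict[str, Tuple[Sequence[int], Sequence[float]]]
) -> Tuple[float, Dict[str, float]]:
    """Return the unweighted mean AUC plus the per-task AUCs."""
    per_task = {task: binary_auc(y, s) for task, (y, s) in task_results.items()}
    return sum(per_task.values()) / len(per_task), per_task
```

The pairwise loop is quadratic in the number of cases, which is fine for a validation set of this size; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice in practice.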

In a T1-weighted sequence, fluid will be intermediate-to-dark signal and fat will be bright. Fibrous structures, such as the ACL and meniscus - will be dark on all sequences. T2 and PD with fat saturation are referred to as fluid-sensitive sequences, because fluid stands out as bright signal on these images. Fat is normally bright on T2 and PD, but “fat saturation” saturates that signal - i.e. makes it dark - which makes it easier to see the fluid.

Abnormal fluid is often associated with pathology. There is normally some fluid in the knee joint capsule, but too much fluid often indicates a problem. Inflammation often results in increased fluid in the fat - called fat “stranding” because it looks wispy and strand-like. Menisci and ligaments (like the ACL) should be nice and dark. A focus of bright signal in one of these structures may indicate a tear.

Super-basic knee anatomy:

I would suggest the following sequences for overall abnormality detection: coronal T1, sagittal PD and coronal T2 with fat saturation. The coronal T1 and sagittal PD will give you a nice overview of the anatomy, albeit with slightly lower sensitivity for pathology. The coronal T2 fat-sat provides a good overview screen for pathology due to the relative brightness of fluid.

The highest yield sequences for specifically detecting meniscal and ACL tears will most likely be the coronal T2 with fat-sat, sagittal T2 with fat-sat, and axial PD with fat-sat. Menisci and the ACL are typically better evaluated on the coronal and sagittal sequences, but the axial can provide useful information in certain edge cases. So a 3-channel model architecture might include each of these sequences as one input channel.
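
As a rough illustration of the 3-channel idea, here is a sketch using NumPy. Several things are assumptions for the sake of the example: each series is taken to be an array of shape `(slices, height, width)`, all three series share the same in-plane size, and only the center slice of each series is used (a real model would need a way to use the whole stack). The function and parameter names are hypothetical.

```python
import numpy as np


def center_slice(volume: np.ndarray) -> np.ndarray:
    """Pick the middle slice of a (slices, height, width) volume."""
    return volume[volume.shape[0] // 2]


def stack_sequences(coronal_t2_fs: np.ndarray,
                    sagittal_t2_fs: np.ndarray,
                    axial_pd_fs: np.ndarray) -> np.ndarray:
    """Build a (3, height, width) input with one sequence per channel."""
    slices = [center_slice(v) for v in (coronal_t2_fs, sagittal_t2_fs, axial_pd_fs)]
    # The series may have different slice counts, but the chosen slices
    # must share the same in-plane shape to be stacked as channels.
    if len({s.shape for s in slices}) != 1:
        raise ValueError("all sequences must have the same in-plane shape")
    return np.stack(slices, axis=0)
```

Note that the channels here are different anatomical planes rather than color channels of one image, so a pretrained ImageNet stem would be seeing something quite different from RGB.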


One modality-specific note for MRI:

Signal intensity in MR is essentially arbitrary. These cases are all from the same vendor’s scanners (i.e. GE), so one would hope that the signal would be more consistent from scan to scan. However, this is not always the case, particularly since some of the scans were done at different magnetic field strengths (3 vs. 1.5 Tesla).

This recent paper out of Maciej Mazurowski’s group at Duke Univ. demonstrated a system for normalizing MR signal intensity across subjects for specific sequences. Might be worth a read (and attempt at implementation). There are simpler techniques that attempt to normalize signal in MR images, but this approach might be worth the added cost of implementation and computation.
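
For reference, one of the simpler techniques is a per-volume z-score. The sketch below is that baseline only, not the method from the paper mentioned above, and the function name and `eps` parameter are made up for illustration. It assumes each exam is a NumPy array of raw intensities.

```python
import numpy as np


def zscore_normalize(volume: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale one volume to zero mean and unit standard deviation.

    This removes a per-scan offset and gain, so two scans that differ only
    by a linear change in intensity map to the same values.
    """
    volume = volume.astype(np.float64)
    mean = volume.mean()
    std = volume.std()
    # eps guards against division by zero for a constant volume.
    return (volume - mean) / (std + eps)
```

A z-score only corrects for linear differences between scans; it does nothing about non-linear intensity shifts or differences in tissue contrast between 1.5 T and 3 T, which is where a more involved approach might earn its cost.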


I was able to download and unzip the data. Here’s the tree:

├── train
│ ├── axial
│ ├── coronal
│ └── sagittal
├── train-abnormal.csv
├── train-acl.csv
├── train-meniscus.csv
├── valid
│ ├── axial
│ ├── coronal
│ └── sagittal
├── valid-abnormal.csv
├── valid-acl.csv
└── valid-meniscus.csv
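
Given that layout, a minimal sketch for merging the three label files of one split into a single lookup might look like this. The file names come from the tree above; the CSV format is an assumption, not something confirmed in this thread: the sketch assumes no header row and two columns, a case ID followed by a 0/1 label. `load_labels` and `TASKS` are hypothetical names.

```python
import csv
from pathlib import Path
from typing import Dict

TASKS = ("abnormal", "acl", "meniscus")


def load_labels(root: str, split: str) -> Dict[str, Dict[str, int]]:
    """Merge {split}-abnormal/acl/meniscus.csv into {case_id: {task: label}}.

    ASSUMPTION: each CSV has no header and rows of the form "case_id,label".
    """
    labels: Dict[str, Dict[str, int]] = {}
    for task in TASKS:
        csv_path = Path(root) / f"{split}-{task}.csv"
        with csv_path.open(newline="") as f:
            for row in csv.reader(f):
                if not row:
                    continue  # skip blank lines
                # Keep the ID as a string so any zero padding is preserved.
                case_id, label = row[0].strip(), int(row[1])
                labels.setdefault(case_id, {})[task] = label
    return labels
```

Calling `load_labels(data_dir, "train")` and `load_labels(data_dir, "valid")` would then give one dictionary per split; if the files turn out to have a header row, the first row would need to be skipped.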