Apparently things aren't quite working: no errors, but the network isn't training. It's a busy clinical week for me, so I probably won't get to revisit until sometime this weekend.
@neuradai Thank you for the update.
I didn't get anywhere in terms of unzipping the dataset (if someone can post it, or even a link to a subset, I would appreciate it).
However… I did find something exciting and have tested it out on MNIST today with really great results. I'm rewriting some of the fastai internals to use it for Imagenette next. It requires the loss to be passed into the step function in order to compute how to step, so unlike Adam and friends it doesn't just plug into the current framework.
This is what I am testing:
and this is why it’s exciting:
"We present experiments on the CIFAR and SNLI data sets, where we demonstrate the significant superiority of our method over Adam, Adagrad, as well as the recently proposed BPGrad and AMSGrad. "
More importantly, in my first testing today it just smashed through a subset of MNIST:
I'll test it on Imagenette next and see if this continues; if so, it might be a very nice advantage for us in training.
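To make the "doesn't just plug in" point concrete, here is a minimal sketch of an optimizer whose step() needs the current loss value. The class name and update rule are placeholders of my own, not the method from the paper:

```python
import torch

class LossAwareOptim(torch.optim.Optimizer):
    "Hypothetical optimizer: step() requires the current loss value."
    def __init__(self, params, lr=1e-3):
        super().__init__(params, dict(lr=lr))

    def step(self, loss):  # extra argument, unlike Adam's step()
        # Placeholder rule: scale the update by the current loss.
        for group in self.param_groups:
            scale = group['lr'] / (1.0 + loss.item())
            for p in group['params']:
                if p.grad is not None:
                    p.data.add_(p.grad, alpha=-scale)

# The training loop then has to change from opt.step() to:
#   loss = loss_func(model(xb), yb)
#   loss.backward()
#   opt.step(loss)
```

That extra argument is exactly why it can't be dropped into fastai's stock fit loop without rewriting the part of the loop that calls the optimizer.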
Not sure what your exact problem is with Windows and unzip. In the past I have installed Cygwin on Windows and used its bash terminal to unzip. For what it's worth, I am unzipping the data from the command line (Ubuntu 18.04, Zip v3.0) and it takes a few tens of seconds. Note that there are some warnings during the unzip process (shown below), but it seems I have matched pierreguillou's results. It may be a zip version issue, although when I run file MRNet-v1.0.zip it reports "at least v1.0 to extract", which is a very early version.
Note:
Earlier versions of zip/unzip could not handle files larger than 2 or 4 GB.
The latest sources and executables are here: http
Try zipinfo on the archive.
The (shortened) output from unzip and zipinfo:
dl@DL1:~/mrnet-fastai$ unzip MRNet-v1.0.zip
Archive: MRNet-v1.0.zip
warning [MRNet-v1.0.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 4294967296
(attempting to re-compensate)
creating: MRNet-v1.0/
inflating: MRNet-v1.0/valid-abnormal.csv
creating: MRNet-v1.0/valid/
creating: MRNet-v1.0/valid/axial/
inflating: MRNet-v1.0/valid/axial/1139.npy
inflating: MRNet-v1.0/valid/axial/1138.npy
inflating: MRNet-v1.0/valid/axial/1249.npy
inflating: MRNet-v1.0/valid/axial/1248.npy
inflating: MRNet-v1.0/valid/axial/1207.npy
inflating: MRNet-v1.0/valid/axial/1213.npy
…
inflating: MRNet-v1.0/valid/.DS_Store
creating: __MACOSX/
creating: __MACOSX/MRNet-v1.0/
creating: __MACOSX/MRNet-v1.0/valid/
inflating: __MACOSX/MRNet-v1.0/valid/._.DS_Store
creating: MRNet-v1.0/valid/coronal/
…
inflating: MRNet-v1.0/train/axial/.DS_Store
creating: __MACOSX/MRNet-v1.0/train/
creating: __MACOSX/MRNet-v1.0/train/axial/
inflating: __MACOSX/MRNet-v1.0/train/axial/._.DS_Store
inflating: MRNet-v1.0/train/axial/0593.npy
…
inflating: MRNet-v1.0/train/sagittal/0216.npy
file #2547: bad zipfile offset (local header sig): 1353202
(attempting to re-compensate)
inflating: MRNet-v1.0/train/sagittal/1108.npy
…
dl@DL1:~/mrnet-fastai$ info zipinfo
dl@DL1:~/mrnet-fastai$ zipinfo -h MRNet-v1.0.zip
Archive: MRNet-v1.0.zip
Zip file size: 6087523606 bytes, number of entries: 3784
warning [MRNet-v1.0.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
dl@DL1:~/mrnet-fastai$ du -h MRNet-v1.0
2.2G MRNet-v1.0/train/sagittal
2.1G MRNet-v1.0/train/coronal
2.4G MRNet-v1.0/train/axial
6.6G MRNet-v1.0/train
230M MRNet-v1.0/valid/sagittal
222M MRNet-v1.0/valid/coronal
258M MRNet-v1.0/valid/axial
709M MRNet-v1.0/valid
7.3G MRNet-v1.0
The count of files inflated in MRNet-v1.0:
ls -lR MRNet-v1.0 | wc -l == 3790
ls -laR __MACOSX | wc -l == 46 (mainly empty directories in the .zip file's directory)
zipinfo with the -l option (list in long Unix format) gives:
zipinfo -l MRNet-v1.0 | wc -l
warning [MRNet-v1.0.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
3789
I don't understand the 3790 vs. 3789 difference; perhaps it's not an issue!
Please be more specific about your problem if none of this enlightens you. Cheers
Oh, I am from across the pond (GMT/UTC/BST depending on the time of year), so my sleep and play (as I don't work) are nearly opposite to yours.
I have implemented a case-centered custom ItemList for the MRNet data. See the PR on GitHub.
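For anyone curious what that involves, here is a rough sketch of the shape such an ItemList can take in fastai v1; the class name and details are illustrative guesses, and the PR is the authoritative version:

```python
import numpy as np
import torch
from fastai.vision import ItemList, Image

class MRNetCaseList(ItemList):
    "One item per case: load the .npy volume and wrap its middle slice."
    def get(self, i):
        fn = super().get(i)            # a path like train/axial/0593.npy
        vol = np.load(str(fn))         # shape (n_slices, 256, 256)
        mid = vol[vol.shape[0] // 2].astype(np.float32) / 255.
        t = torch.from_numpy(mid).unsqueeze(0).repeat(3, 1, 1)
        return Image(t)                # 3-channel fastai Image
```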
Great work everyone!
Hey, Less, sorry for missing your request for the data. Do you have the dataset or are you still in need?
I have posted to github two notebooks:
- save_middle_slices_as_images.ipynb
- MRNet_Baseline_Models.ipynb
The first is a batch preprocessing step that takes the .npy array files and saves them as three-channel images in separate directories, using either just the middle slice, or three slices centered on and including the middle slice (possibly skipping one or two slices between them).
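Not the notebook itself, but the core of the single-slice variant is only a few lines; the function name and the uint8 assumption are mine:

```python
import numpy as np
from pathlib import Path
from PIL import Image

def save_middle_slice(npy_path, out_dir):
    vol = np.load(npy_path)                # shape (n_slices, 256, 256)
    mid = vol[vol.shape[0] // 2]           # the middle slice
    rgb = np.stack([mid] * 3, axis=-1).astype(np.uint8)  # grey -> 3 channels
    Image.fromarray(rgb).save(Path(out_dir) / (Path(npy_path).stem + '.png'))
```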
The “baseline models” notebook is a scaffold for fitting a set of models on the images from each plane prepared in the “save middle slices” notebook.
Baseline models can be extended in many ways, including the following:
- Unfreeze the weights and retrain
- Adjust the learning rate, add scheduling
- Use a ResNet variant or another vision model
- Add data augmentation
- Add layers/a model to aggregate predictions from the separate planes (the paper uses logistic regression; see the sketch after this list)
- Add code to process the model metrics files so we can keep track of performance improvements
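As a sketch of that aggregation step (with dummy data standing in for the real per-plane predictions), logistic regression over the three plane probabilities is just:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Dummy stand-ins for per-exam probabilities from the three plane models
# and binary labels for one task; in practice these come from the learners.
rng = np.random.default_rng(0)
preds_axial, preds_coronal, preds_sagittal = rng.random((3, 120))
y = rng.integers(0, 2, 120)

X = np.stack([preds_axial, preds_coronal, preds_sagittal], axis=1)
clf = LogisticRegression().fit(X, y)    # learns one weight per plane
combined = clf.predict_proba(X)[:, 1]   # fused probability per exam
```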
Let me know if you’ve gotten the notebooks running.
Let everyone know if you’re attacking one of the steps above, so we avoid duplication.
For reference, average AUC across tasks with a basic setup is about 0.78 on the validation set, whereas the competition baseline model got 0.92. Lots of room for improvement…
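In case it helps with the metrics-tracking item above, the headline number is just the mean of the three per-task AUCs; with hypothetical label/score dicts:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Dummy validation labels and fused predictions per task (hypothetical names).
rng = np.random.default_rng(0)
labels = {t: rng.integers(0, 2, 120) for t in ('abnormal', 'acl', 'meniscus')}
scores = {t: rng.random(120) for t in ('abnormal', 'acl', 'meniscus')}

aucs = {t: roc_auc_score(labels[t], scores[t]) for t in labels}
print(aucs, 'average:', sum(aucs.values()) / len(aucs))
```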
Also, the “dilations” idea of taking three slices from each scan results in blurry RGB images. As expected, there’s no benefit from the additional information (from 3 instead of 1 slice) without unfreezing the weights.
Basic setup:
- using pretrained AlexNet
- with LR defaults
- trained for a small number of epochs (10)
- on a single slice or three slices from the center of each scan, from each plane
- and using only the plane model with the highest performance for each target label
Unfortunately I am still in need. I've spent over an hour on Windows, and then tried to get it over to FloydHub with no luck either. If it's possible for you to post even a link to a small subset, that would let me start looking at things and get going.
Thanks very much!
I just tested downloading and unzipping the files and it seemed to work.
What I did was this:
- Start my training server instance
- Open a Jupyter notebook (this is not necessary if you can just ssh to it)
- Start a terminal
- Register on the MRNet page to get the download URL
- Go to your data folder
- Run wget http://download.cs.stanford.edu/deep/the-real-filename.zip
- Run unzip the-real-filename.zip

You could run these commands via a notebook if you wish, but I like to use the terminal.
Worked like a champ! Thanks for the explicit instructions, exactly what was needed.
Now that I’ve got actual data can start moving forward with it.
Thanks again Christoffer!
Hi, here is a notebook on Colab to get started with the competition using fastai!
Hope it helps.
https://colab.research.google.com/drive/10GoEbF6FuKtVibXHFc7UjwwdHzX5_pvE
I wonder if it could be a good idea to generate data augmentations by stacking the slice images into a volume, rotating the volume by a random number of degrees, and then re-generating new slices from that rotated volume?
Kind of like this https://youtu.be/f4IPsdTn7c8?t=285 but with code, and then generating slices.
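Purely as a sketch of that idea (scipy's rotate standing in for the video's demo; the function name and angle range are mine):

```python
import numpy as np
from scipy.ndimage import rotate

def rotated_slices(volume, max_deg=15, rng=None):
    "Rotate the full (n_slices, H, W) volume, then cut fresh slices from it."
    rng = rng or np.random.default_rng()
    angle = rng.uniform(-max_deg, max_deg)
    # axes=(0, 1) rotates *across* slices (out of plane), so the central
    # slices of the result are genuinely new views; axes=(1, 2) would only
    # rotate each slice within its own plane.
    rotated = rotate(volume, angle, axes=(0, 1), reshape=False, order=1)
    mid = rotated.shape[0] // 2
    return rotated[mid - 1 : mid + 2]   # three new central slices
```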
I have taken this competition on as part of my Master's thesis, and I have tested various models.
Please let me know if anyone is interested in working with my code.
That's a pretty great idea. I also watched the video.
I implemented it and imported a sample file from the dataset into Photoshop. The main problem I ran into was that the image quality is not good and it is very tough to interpret.
Hi! I am a data science student at UVA trying to work with this knee data set as a basis for other medical applications. I would love to take a look at your code if possible.
Did you get anyone? I did my EDA, but I need more help. Could you help?
Hello sir, could you help me with this dataset?
