Deep learning with medical images


I was listening to @jeremy on TWIML last week, and at the end of the talk he mentions the lack of publicly available medical imaging datasets. Applying deep learning to medical images is my research area, so I am intimately aware of the problem and thought I could contribute what I’ve learned about the practice back to the community.

I wrote a blog post with my thoughts on how to get started with deep learning on medical images, specifically magnetic resonance (MR) and computed tomography (CT) images. I give an overview of the two imaging modalities, suggest several publicly available datasets, discuss some techniques for data wrangling and preprocessing (with example scripts), and finally build a small 3D deep learning model using the fastai API.

It turned out a bit longer than expected, and while there is a lot more information to cover, this should (hopefully) help people get started with applying deep learning to structural MR and CT images. I know there has been some previous discussion on this forum (see here, here, and here), but I’d be happy to answer any questions about the blog post or more general questions about working with medical images.

Just wanted to say thanks to everyone who has helped build the fastai package, it’s awesome!


Hi there! Good write-up :)

I am one of the developers of NiftyNet, which you mention in the post. I would just like to add that we’ve put out a demo of using NiftyNet image readers/writers within PyTorch.

It shows how to get medical data into the PyTorch context and also how to output results in the correct format. As you mentioned, these operations are different in medical imaging compared to the normal computer-vision based approach, and we think our library makes this part of it much easier.

I’m also keen to talk more generally about medical imaging and will be following this thread eagerly.

(Angel Isaac Antonio Brisco) #3

Hi, I’m starting a project for my thesis on pancreas segmentation for pancreatic cancer diagnosis. I’ve been reading some articles about it, but as you say, there are few datasets to practice on and even fewer articles that describe their neural network architecture in detail.
Thanks for the blog posts and for the effort of including examples.
I have a question: how do you deal with the lack of context in a 2D CT slice versus a 3D analysis?


Sorry for the late reply, I’ve been away for the last week.

The fundamental problem with using only 2D slices is that you are throwing away the 3D context. The optimal fix to this problem would be to use the entire 3D volumes instead of slices. If you are stuck using 2D slices (and DNNs) for whatever reason, then a naive solution would be to post-process the resulting segmentation volume using basic image processing techniques (e.g., morphological filters). There are better—but more complicated—methods to address this problem, and I’d take a look at some academic journals/arxiv for those methods.
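To make the morphological post-processing idea concrete, here is a minimal sketch using `scipy.ndimage`. It assumes a binary 3D mask that was stacked from per-slice predictions; the function name `clean_segmentation`, the choice of a closing followed by keeping the largest connected component, and the toy volume are my own illustrative assumptions, not a recommended recipe for any particular dataset.

```python
import numpy as np
from scipy import ndimage


def clean_segmentation(mask, closing_iterations=1):
    """Tidy a binary 3D mask stacked from independent 2D predictions.

    Hypothetical helper for illustration; not part of any library.
    """
    mask = np.asarray(mask, dtype=bool)
    # 6-connected neighbourhood in 3D (face neighbours only).
    structure = ndimage.generate_binary_structure(3, 1)
    # Closing (dilation then erosion) fills small gaps, e.g. a slice
    # where the 2D network missed the structure.
    closed = ndimage.binary_closing(
        mask, structure=structure, iterations=closing_iterations
    )
    # Keep only the largest connected component to drop isolated speckles.
    labeled, n_components = ndimage.label(closed, structure=structure)
    if n_components == 0:
        return closed
    sizes = np.bincount(labeled.ravel())
    sizes[0] = 0  # ignore the background label
    return labeled == sizes.argmax()


# Toy volume: a cube with one missing slice plus a stray voxel.
vol = np.zeros((16, 16, 16), dtype=bool)
vol[4:12, 4:12, 4:12] = True
vol[7] = False        # simulate a slice the 2D model got wrong
vol[1, 1, 1] = True   # simulate an isolated false positive
cleaned = clean_segmentation(vol)
```

Note that a single closing only fills the interior of the missing slice here; the rim of that slice stays empty, which is one reason this counts as a naive fix.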


FWIW, I believe that medical imaging applications using deep neural networks are moving more and more from 2D CNNs to 3D CNNs. Anecdotally, this appears to be the case (from looking at papers coming out at the relevant conferences), although I have no statistics to back this up. However, for a variety of reasons (e.g., computational limitations, limited dataset size), you may prefer to use a 2D network. Interestingly, you may also notice better performance with a 2D network than a 3D network trained on the same data. Choosing a 2D or 3D model falls within the realm of hyperparameter optimization and will be task/dataset-specific.

The main problem with applying 2D methods to 3D data is that you generally want to reconstruct a 3D result from your 2D outputs. This can lead to inconsistency from slice to slice. That is, if you scroll through your reconstructed 3D result, the segmentations may not align in sensible ways from one slice to the next (which is why I previously suggested the use of morphological filters to smooth the segmentation). Note that this will also occur when using 3D patches (e.g., training/testing on 64x64x64 patches from a 256x256x256 voxel volume). The problem occurs because each patch/slice is predicted independently of the others.
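As a rough illustration of why the pieces are independent, this sketch runs a 2D predictor over every slice and restacks the outputs into a volume. `predict_volume_slicewise` and `toy_predict_slice` are hypothetical names; the thresholding function is only a stand-in for a trained 2D network, chosen because its decision depends on each slice's own statistics.

```python
import numpy as np


def predict_volume_slicewise(volume, predict_slice, axis=0):
    """Apply a 2D predictor to every slice along `axis` and restack."""
    slices = np.moveaxis(volume, axis, 0)
    # Each call sees one slice only; nothing is shared between slices.
    predictions = [predict_slice(s) for s in slices]
    return np.moveaxis(np.stack(predictions, axis=0), 0, axis)


def toy_predict_slice(slice_2d):
    """Stand-in for a trained 2D network: threshold at the slice's own mean."""
    return slice_2d > slice_2d.mean()


rng = np.random.default_rng(0)
volume = rng.normal(size=(5, 8, 8))
segmentation = predict_volume_slicewise(volume, toy_predict_slice, axis=0)
```

Because the threshold changes from slice to slice, neighbouring slices can disagree about very similar voxels, which is the kind of inconsistency you see when scrolling through a restacked result.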

My speculation on why 2D models can sometimes outperform 3D models: 1) inadequate or suboptimal data augmentation and 2) differences between training and testing data that are less problematic in 2D setups. The two are connected, but to expand on the second: consider two sets of MR images acquired at 1x1x1 mm^3 and 1x1x3 mm^3. You train a 3D CNN on the first set and test on the second. Even if you interpolate the second to 1x1x1 mm^3, I’d guess that your performance metric (e.g., Dice) will often come out worse than if you trained and tested a 2D CNN on the common 1x1 mm^2 slices. Take that example with a grain of salt, but hopefully it provides some intuition on why a 2D network can outperform a 3D one.

If people have other thoughts, I’d be very interested in hearing them.
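For anyone who wants to try that thought experiment, here is a small sketch of the two pieces involved: interpolating a thick-slice volume onto a 1 mm grid with `scipy.ndimage.zoom`, and computing the Dice coefficient between two binary masks. The helper names, the linear interpolation order, and the convention of returning 1.0 when both masks are empty are my assumptions. Keep in mind that interpolation adds voxels but does not recover detail lost to the 3 mm slice thickness.

```python
import numpy as np
from scipy import ndimage


def resample_to_isotropic(volume, spacing_mm, target_mm=1.0, order=1):
    """Interpolate `volume` so every axis has `target_mm` spacing."""
    zoom_factors = [s / target_mm for s in spacing_mm]
    return ndimage.zoom(volume, zoom=zoom_factors, order=order)


def dice(pred, truth):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, truth).sum() / denom


# Toy 1x1x3 mm volume resampled to 1x1x1 mm: the last axis triples.
thick = np.zeros((8, 8, 4), dtype=np.float32)
thick[2:6, 2:6, 1:3] = 1.0
iso = resample_to_isotropic(thick, spacing_mm=(1.0, 1.0, 3.0))
```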

(Angel Isaac Antonio Brisco) #6

Thanks for your answer. Now at least I’m more aware of the problems I’m going to face with each method. I think I will try a 3D CNN first, and if I run into memory errors I will switch to 2D.


Thanks for the interesting contributions in this topic. I work at a big healthcare company, and in my free time I’m a machine learning enthusiast.
I would like to propose that my company introduce a machine learning system for MR and CT images that works alongside human radiologists to identify cancers.
I could convince them to take on this project if I can bring studies and statistics showing how well ML detects cancer compared to human performance. Are there any papers that compare ML systems and human radiologists at identifying cancer?
If I can start this project, I will be very glad to collaborate with others who use these libraries.

(George Zhang) #8

Hey guys, it is such a pleasure to see other fellow fastai students who are passionate about applying deep learning techniques to healthcare problems.

Below is a post I originally created in the new 2019 Part 2 forum, which is accessible only to the cohort taking the class. I now think that was a mistake, so I decided to move it out here to reach a wider audience.

(George Zhang) #9

Hey @RomRoc, in the thread I referenced above, there are a few papers that you might be interested in.

(Angel Isaac Antonio Brisco) #10

Hi, I have developed two U-Nets for pancreas segmentation in CT scans. One takes a PNG exported from each slice, and the other uses the data directly from the DICOM file. My goal is to build a 3D one and compare their accuracy and usefulness for diagnosing pancreatic cancer. In the next few days I will share the notebook, along with some tricks I learned to make it work on Windows.
I’m very happy to see such a community working in AI and medicine.
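For readers wondering what differs between the PNG and DICOM inputs, here is a hedged sketch of the usual conversion path, not the notebook mentioned above. Stored CT values are mapped to Hounsfield units with the DICOM rescale slope and intercept, and an 8-bit image is produced by clipping to an intensity window. The function names, the default slope/intercept, and the level 40 / width 400 soft-tissue window are assumptions for illustration; real values should come from each file's header.

```python
import numpy as np


def to_hounsfield(stored, slope=1.0, intercept=-1024.0):
    """Map stored CT values to Hounsfield units: HU = stored * slope + intercept."""
    return stored.astype(np.float32) * slope + intercept


def window_to_uint8(hu, level=40.0, width=400.0):
    """Clip HU values to a window and rescale to 0..255 (as a PNG would store)."""
    lo, hi = level - width / 2.0, level + width / 2.0
    scaled = (np.clip(hu, lo, hi) - lo) / (hi - lo)
    return np.round(scaled * 255).astype(np.uint8)


# Toy stored values standing in for one row of a CT slice.
stored = np.array([[0, 1024, 3000]], dtype=np.uint16)
hu = to_hounsfield(stored)      # air, water, dense bone-like value
png_like = window_to_uint8(hu)  # everything outside the window saturates
```

The windowed 8-bit image discards everything outside the chosen range, while a model fed HU values directly sees the full dynamic range, which may be one source of difference between the two networks.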


Thanks a lot @PegasusWithoutWinds, I just started watching that thread to receive notifications. It’s very interesting.
Unfortunately, I didn’t find any comparison between human and ML detection performance on medical images.
It would be important for putting together some slides for my proposal. Unfortunately, people outside the ML field need to see results before they understand the benefits we could obtain.

(George Zhang) #12

Check out Stanford’s MURA competition, and also Stanford’s work on skin cancer.

(Angel Isaac Antonio Brisco) #13

Hi, I have read an article about a neural net that diagnoses pancreatic cancer in 20% more early cases than a human. I would like to help you with your research, including with some practical work; I have learned that practice is a very good way to learn new things.