Deep dive in to lung cancer diagnosis

(Surya K) #82

Thanks for this wonderful intro to the problem and the domain!


(Jeremy Howard (Admin)) #83

@alexandrecc this is such great information - is there any chance you might consider copying it into a Medium post? If not, do you mind if I turn it into a post later?


(Krishna Vishal V ) #84

@jeremy Count me in, I’m very interested.


(Alexandre Cadrin-Chênevert) #85

Yes, sure. Let me find some images to improve the format and I’ll let you know when I post it on Medium. I’ll try to do this before leaving for the RSNA on Saturday.


(Rikiya Yamashita) #86

Hi @alexandrecc,

Nice intro from a radiologist’s perspective!
BTW, I’m coming to RSNA too. It would be very nice to meet up with you and say hello in person. If you don’t mind, please let me know :wink: Let’s enjoy RSNA 2017 :smile:


(Alexandre Cadrin-Chênevert) #87

Blog post inspired by previous forum post available here:


Deep learning with medical images


(Octavio ) #88

Very good post! I’m very interested and halfway through the processing steps!


(segovia) #89

I cloned the grt123 team’s repo and finished the preprocessing. Still trying to understand the details of each piece of code. Kinda overwhelmed by the complexity of their solution…

Learning PyTorch in the meantime.


(Alexandre Cadrin-Chênevert) #90

I recently read more about this collaborative project, which looks promising if someone wants to get involved:

The project could have very high impact if it works as planned. @kmader posted the link to the public GitHub repository earlier in this thread on Sept 16: Deep dive in to lung cancer diagnosis

They are also using the grt123 solution.


(Kerem Turgutlu) #91

I have a question for the experts in this field. I am currently walking through Brad Kenstler’s notebook. The images I am working on right now have modality MR. I’ve looked online but couldn’t find a good explanation for mapping MR intensities -> Hounsfield Units for visualization purposes. I would very much appreciate it if someone could help me out. Maybe I am searching for the wrong thing. Thanks!

I also suspect that the interpretation of intensities differs between CT and MR, which makes it hard to skip the visualization part of the notebook and continue with the actual preprocessing. In particular, ROI thresholds probably differ by organ (lung vs. brain) and by machine (CT vs. MR).


(Jeremy Howard (Admin)) #92

If you ask on Twitter and at-mention me I’ll ask the rad community if they can help. cc @Judywawira @alexandrecc @davecg


(David Gutman) #93

Hounsfield is just CT.

Depending on the MRI sequence it might have intrinsic meaning (eg ADC, quantitative flow sequences, some perfusion metrics), but even those will vary from scanner to scanner.

Demeaning and dividing by std for the volume should be a reasonable way to start, but you should check to make sure even/odd slices aren’t very different (MRI series are sometimes collected “interleaved” and on some scans you will notice alternating intensity levels). Normalizing by slice might avoid this problem, but slices that are nearly empty will be normalized very differently than slices with a lot of tissue (you can see this when viewing images on many PACS systems).
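
A minimal sketch of the normalization and the even/odd slice check described above, assuming the MR volume is already loaded as a NumPy array shaped (slice, height, width). The function names and the `eps` guard are illustrative choices, not taken from any notebook mentioned in this thread, and what counts as a "very different" even/odd ratio is left as a judgment call.

```python
import numpy as np


def normalize_volume(volume, eps=1e-8):
    """Z-score the whole volume: subtract its mean, divide by its std."""
    volume = np.asarray(volume, dtype=np.float32)
    return (volume - volume.mean()) / (volume.std() + eps)


def even_odd_mean_ratio(volume):
    """Mean intensity of even-indexed slices divided by that of odd-indexed slices.

    A ratio far from 1.0 hints at an interleaved acquisition with
    alternating intensity levels.
    """
    volume = np.asarray(volume, dtype=np.float32)
    even_mean = volume[0::2].mean()
    odd_mean = volume[1::2].mean()
    return float(even_mean / odd_mean)


def normalize_per_slice(volume, eps=1e-8):
    """Z-score each slice independently (axis 0 indexes slices).

    Caveat from the post above: nearly empty slices end up scaled very
    differently from slices that contain a lot of tissue.
    """
    volume = np.asarray(volume, dtype=np.float32)
    mean = volume.mean(axis=(1, 2), keepdims=True)
    std = volume.std(axis=(1, 2), keepdims=True)
    return (volume - mean) / (std + eps)


if __name__ == "__main__":
    # Synthetic stand-in for an MR volume; real data would come from DICOM/NIfTI.
    rng = np.random.default_rng(0)
    volume = rng.uniform(50, 150, size=(8, 16, 16)).astype(np.float32)
    volume[1::2] *= 0.5  # simulate alternating intensity on odd slices

    print("even/odd mean ratio:", even_odd_mean_ratio(volume))
    print("normalized shape:", normalize_volume(volume).shape)
```

Checking the ratio before picking between whole-volume and per-slice normalization is one way to decide; the per-slice variant hides the alternation but inherits the empty-slice problem.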

You also usually need bias correction using a tool like N4 for research workflows, and motion correction if you have time series data (eg MCFLIRT from FSL, basically just rigid registration across timepoints).

Some of these tools might not be necessary for deep learning models, and many others could stand to be updated to use the GPU.


(Kerem Turgutlu) #94

This is all really valuable information, thank you so much. We currently have MRI scans for around 300 meningioma patients. Each MRI scan is 124x512x512 (slice, height, width). Our first task is to come up with a model that can auto-contour meningioma tumors. Since we have raw data, we will probably do a lot of preprocessing before feeding it into neural nets, such as normalization, skull removal, and other steps that might be helpful for the task.

I appreciate your help, and if you don’t mind, may I also ask for help with this thread: Lung cancer detection; Convolution features + Gradient boost. Thanks in advance.


(Andy) #95

I’m working on the same task. Following you!


(Fernando Melo) #96

Very interested in this topic. I’m part of the Deep Learning Brasilia (Brazil) group and we’re on lesson 6 - part 1.
Jeremy, congratulations on this initiative. And thanks for the opportunity of the Fastai Deep Learning course. You’re THE GUY!


(David) #97

I am really interested in this topic! Is it too late to help out?


(Jose Quesada) #98

Is this dataset different from the one hosted at


(Imant Daunhawer) #99

I am very enthusiastic about this topic as well and would greatly appreciate hearing an update from Jeremy, since it sounds like the data has been available for more than a year now.

Our group at the University of Basel is currently working on lung cancer diagnosis (based on CT scans and reports) together with the radiology department of the university hospital and I will draw their attention to this opportunity (and kindly ask the radiologists for their help in labelling).

Further, I’d like to point out that it’s very unfortunate that the data from the Kaggle 2017 Data Science Bowl is not available anymore. In this regard, it would be extremely helpful if at least a sample of the NLST data were provided, to get people started with the preprocessing and replication of models.


(john v) #100


I’m currently exploring the CT scan challenge on Kaggle. I’d love to know where this thread went. Did anyone have any success? Did anyone produce a fastai-based CT scan example?



Hey, I work in a medical AI lab at a university and one of our projects is this very topic. Are you still open to collaborations?