I am writing to everyone who has the data from the Kaggle Data Science Bowl 2017. If you still have the data, could I ask you to help seed the official Kaggle torrent files?
I would very much like to practice and learn by going through the top-placing solutions and looking at the data, but there are currently no seeders. I am based in China, and direct download is practically impossible because of the Chinese government’s censorship. It would be great if you could join in. Thanks!
Just commenting to say I’m interested in giving some help in the future! I’m a bioengineer with a PhD in image analysis, now working in deep learning, so this data is kind of the perfect match! I’ll send this to my academic connections, you never know!
Hi all. The challenge of nodule detection seems to be a step beyond a ‘cats vs dogs’ type of situation, where you might only be trying to tell the difference between healthy and unhealthy (any condition) CT scans. In that case, I’m guessing you need some sort of 3D CNN, as I believe radiologists examine CT scans taking the z-axis into account. That is, they don’t look at one slice at a time (like you might do with 2D radiographs), but instead scroll through and look at how objects behave in all 3 dimensions. Does that sound about right? Thanks.
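To make the z-axis point concrete, here is a minimal NumPy sketch, not a CNN and not anyone's competition code. It builds two toy volumes (a compact blob and a tube running through every slice; both are made-up stand-ins for a nodule and a vessel) whose middle slices are identical, then applies a hand-set 3D kernel that looks across neighbouring slices. The helper name `conv3d_valid` and the kernel values are assumptions chosen purely for illustration.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation of a (z, y, x) volume with a small kernel."""
    kz, ky, kx = kernel.shape
    oz = volume.shape[0] - kz + 1
    oy = volume.shape[1] - ky + 1
    ox = volume.shape[2] - kx + 1
    out = np.zeros((oz, oy, ox), dtype=np.float64)
    for z in range(oz):
        for y in range(oy):
            for x in range(ox):
                patch = volume[z:z + kz, y:y + ky, x:x + kx]
                out[z, y, x] = np.sum(patch * kernel)
    return out

# Toy "nodule": a 3x3x3 blob that starts and ends along z.
blob = np.zeros((5, 5, 5))
blob[1:4, 1:4, 1:4] = 1.0

# Toy "vessel": the same 3x3 cross-section, but present in every slice.
tube = np.zeros((5, 5, 5))
tube[:, 1:4, 1:4] = 1.0

# Viewed one slice at a time, the middle slices cannot be told apart.
same_middle_slice = np.array_equal(blob[2], tube[2])

# Hand-set kernel spanning three slices (a second difference along z).
kernel = np.array([-1.0, 2.0, -1.0]).reshape(3, 1, 1)

blob_response = np.abs(conv3d_valid(blob, kernel)).max()
tube_response = np.abs(conv3d_valid(tube, kernel)).max()
print(same_middle_slice, blob_response, tube_response)
```

The kernel responds where the blob begins and ends along z and stays at zero on the tube, which is information a filter restricted to a single slice never sees. A real 3D CNN would learn its kernels rather than have them set by hand.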
I am still potentially interested in collaborating with you, @jeremy, even though I currently have some other active projects. Everything started here last year for me.
If anyone gets through the preprocessing and starts training some models on the Data Science Bowl competition data, please do let us know! That’s the starting point we need to get to before we can start making progress on this.
I am using this kernel to preprocess the files, and it works fine (with a small change, as suggested in the comments). But I am not sure whether I should compute segment_lung_mask for each image. I think the answer is no (because as long as I have the processed images in NumPy array format, I can start running a CNN on them). Just want to confirm before moving on…
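For anyone weighing the same choice, here is a rough sketch of the two kinds of input, written from scratch rather than copied from the kernel. It assumes the volume is already in Hounsfield units, uses an assumed intensity window of -1000 to 400 HU, and takes the lung mask as a boolean array produced by whatever segmentation step you run. The names `normalize_hu` and `apply_lung_mask` are hypothetical. It does not settle whether masking helps; it only shows that both variants are cheap to produce once the arrays exist.

```python
import numpy as np

MIN_HU = -1000.0  # assumed lower bound of the window (roughly air)
MAX_HU = 400.0    # assumed upper bound of the window

def normalize_hu(volume):
    """Scale a Hounsfield-unit volume to [0, 1], clipping values outside the window."""
    scaled = (volume.astype(np.float32) - MIN_HU) / (MAX_HU - MIN_HU)
    return np.clip(scaled, 0.0, 1.0)

def apply_lung_mask(volume, mask, fill=0.0):
    """Keep voxels where mask is True and replace everything else with `fill`."""
    return np.where(mask, volume, fill)

# Tiny made-up scan with shape (z, y, x) = (2, 2, 2), values in HU.
scan = np.array([[[-2000, -1000], [-300, 400]],
                 [[0, 1000], [-650, 50]]], dtype=np.int16)

# Placeholder mask: pretend the second row of each slice is lung tissue.
mask = np.zeros(scan.shape, dtype=bool)
mask[:, 1, :] = True

normalized = normalize_hu(scan)             # unmasked input
masked = apply_lung_mask(normalized, mask)  # masked input
print(normalized.shape, masked.shape)
```

Either array can be fed to a network as-is, so the two variants can be compared empirically on the same model.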
I’m on part 1 v2 now and would definitely be very interested in this project, because I’m also a diagnostic radiologist, and this particular competition was exactly what made me realize the importance of learning DL more practically.
Hi @jeremy, would you please clarify the purpose of this project again? I am very interested, but I’m not quite sure what we want to achieve here.
If the purpose of this project is to use the large dataset you have to train a model and get good performance, then we can just replicate the winner’s model, is that correct? Or are you trying to develop better models with better structures here?
I’d like to improve on the winner’s model, particularly by using more data. But I think replicating their model is a necessary first step, and also really understanding it.