Downloading Luna16 dataset files

It turns out that downloading the Luna16 files is not easy on a remote machine: the Dropbox copy is not available, and the torrent is complicated if you can't run a BitTorrent client there, so you are left with using Google Drive from the command line.
The way I did it was using this package and then the command

drive pull -id [id_1] [id_2] …

where you can list all the file IDs in a row.

Additionally, use 7zip for unzipping.


That’s a great idea. My approach was much less convenient - I opened the network inspector in Firefox as I downloaded each file from Dropbox, and used the “copy as curl” right-click menu option.

Thanks for sharing!

This is what I would normally do, but the Dropbox version was not available for some reason…

That is great and makes the download very easy. Thank you for sharing this tip, @shgidi. I just started my download in a tmux session.

On Ubuntu 16.04 there is a ‘7zip’ utility, but also ‘7z’ for working with 7zip files. I found that ‘7z’ worked with the Luna16 zip files, whereas ‘7zip’ failed because of the file extension.
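If you want to script the extraction step, here is a minimal sketch that wraps the ‘7z’ command in Python. The archive name and output directory in the usage note are placeholders, and the helper names are made up for illustration; the only real thing assumed is that ‘7z’ is installed and on the PATH.

```python
import subprocess


def build_7z_command(archive_path, out_dir):
    """Build the argument list for extracting an archive with 7z.

    'x' extracts with full paths, '-o<dir>' sets the output directory
    (no space after -o), and '-y' answers yes to any prompts.
    """
    return ["7z", "x", archive_path, "-o" + out_dir, "-y"]


def extract(archive_path, out_dir):
    """Run 7z and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_7z_command(archive_path, out_dir), check=True)
```

For example, extract("subset0.zip", "luna16") would unpack a hypothetical subset0.zip into a luna16 directory. Running it inside tmux keeps it going if the SSH session drops.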

I have unzipped the files with ‘7z’ and find I need to process them further to get the images. As noted on the download page, there are 10 directories, to enable 10-fold cross-validation, and inside these are pairs of ‘.mhd’ and ‘.raw’ files. My understanding is that each ‘.raw’ file contains a 512x512xNumberSlices volume. What I don’t know is how to get back the individual slice images for each scan. What I am trying to achieve is to run the notebook from

https://www.kaggle.com/arnavkj95/candidate-generation-and-luna16-preprocessing/notebook/notebook

on my DL box, and I am stuck at getting the output for code cell 2.
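On the question of getting individual slices back out of a ‘.mhd’/‘.raw’ pair: the ‘.mhd’ file is a small text header of "Key = Value" lines, and the ‘.raw’ file holds the voxels, stored slice after slice. In practice people usually read these with SimpleITK (ReadImage followed by GetArrayFromImage). The sketch below uses only the standard library to show what is going on. It assumes an uncompressed 3D volume with one of the few element types listed, and the function names are my own, not from any library.

```python
import array
import os
import sys

# A few MetaImage element types mapped to array typecodes (not exhaustive).
_TYPECODES = {"MET_SHORT": "h", "MET_USHORT": "H", "MET_UCHAR": "B", "MET_FLOAT": "f"}


def read_mhd_header(mhd_path):
    """Parse the 'Key = Value' lines of a .mhd text header into a dict."""
    header = {}
    with open(mhd_path, "r") as f:
        for line in f:
            if "=" in line:
                key, value = line.split("=", 1)
                header[key.strip()] = value.strip()
    return header


def load_volume(mhd_path):
    """Return (voxels, (n_slices, rows, cols)) for an uncompressed 3D scan."""
    header = read_mhd_header(mhd_path)
    if header.get("CompressedData", "False") == "True":
        raise ValueError("compressed data is not handled in this sketch")
    # DimSize lists x (columns), y (rows), z (slices), with x varying fastest.
    cols, rows, n_slices = (int(v) for v in header["DimSize"].split())
    voxels = array.array(_TYPECODES[header["ElementType"]])
    raw_path = os.path.join(os.path.dirname(mhd_path), header["ElementDataFile"])
    with open(raw_path, "rb") as f:
        voxels.fromfile(f, cols * rows * n_slices)
    # Swap bytes if the file's byte order differs from this machine's.
    file_is_msb = header.get("BinaryDataByteOrderMSB", "False") == "True"
    if file_is_msb != (sys.byteorder == "big"):
        voxels.byteswap()
    return voxels, (n_slices, rows, cols)


def get_slice(voxels, shape, k):
    """Return slice k as a list of rows, each row an array of pixel values."""
    _, rows, cols = shape
    start = k * rows * cols
    return [voxels[start + r * cols: start + (r + 1) * cols] for r in range(rows)]
```

With this, load_volume on one scan's ‘.mhd’ path gives the flat voxel buffer and its shape, and get_slice(voxels, shape, k) gives the k-th 2D image. For real work a NumPy array from SimpleITK will be far faster and also carries the spacing and origin metadata.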

OK, I found this, which looks like it will help me:

https://github.com/booz-allen-hamilton/DSB3Tutorial.git