Faimed3d - fastai extension for volumetric medical data

Hi all,

I work as a radiologist and have used fastai for some research projects. Although fastai now supports medical images, I found it still very challenging to apply to 3D data, such as CT or MRI datasets. So, over the last few months, I wrote faimed3d, an extension to fastai which facilitates working with 3D data.
Currently, it is possible to train 3d ResNets and 3d U-Nets using the library. It supports a variety of medical formats including DICOM, DICOM series, NIfTI and more.
I already used the library in some of my research projects and aim to further improve the functionality.
You find the GitHub repository at https://github.com/kbressem/faimed3d and the docs at https://kbressem.github.io/faimed3d/ .
I would be happy about any feedback.


This is really cool. What would be great is a tutorial notebook (maybe something from your research projects, if you can share). I had also started experimenting with extensions to the fastai.medical module, but not tailored to 3D, here


I provide two examples on GitHub: one on classification using the Stanford MRNet dataset and one on segmentation using the Coronacases dataset. Examples for 4D data will come soon.

I’ll also check out your extension, it looks very nice.


Thanks! I see the examples now, cheers for sharing this.

I have updated faimed3d.

Pretrained 3D Models

faimed3d now provides custom implementations of 3D ResNets and 3D EfficientNets, all pretrained on the UCF-101 dataset for transfer learning. Although the models do not surpass the state of the art, they come close, with ResNet3D-101 scoring the equivalent of 17th place on the global leaderboard for models without additional training data.


I implemented a Dynamic-UNet-like version of DeepLab for 3D data, which supports all pretrained video models from torchvision and faimed3d as backbones. DeepLab requires less GPU memory, which is crucial when processing large volumetric data, and performs equally well for segmentation (at least in my experiments).

DICOM Explorer

Viewing images is very important for understanding the data, so faimed3d now provides a simple DICOM viewer implemented as an IPython widget. The viewer allows scrolling through a 3D volume, similar to how a radiologist would, and supports windowing and overlay of segmentation masks.

It is now used as the standard viewer for show_batch_3d.
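For anyone curious what windowing involves under the hood: it is essentially clipping the raw intensities to a center/width range and rescaling for display. Here is a minimal numpy sketch of that transform (illustrative only, not faimed3d's actual implementation; the function name and fake CT volume are made up):

```python
import numpy as np

def apply_window(volume, center, width):
    """Clip a volume of raw intensities (e.g. Hounsfield units) to a
    window defined by center/width and rescale to [0, 1] for display."""
    lo, hi = center - width / 2, center + width / 2
    vol = np.clip(volume.astype(np.float32), lo, hi)
    return (vol - lo) / (hi - lo)

# A typical soft-tissue window (center 40 HU, width 400 HU)
ct = np.random.randint(-1024, 3071, size=(20, 64, 64))  # fake CT volume
windowed = apply_window(ct, center=40, width=400)
```

Scrolling then just means displaying `windowed[i]` for a slice index `i` driven by the widget.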


Hey Keno,
I am a radiologist myself and am working on similar projects. Your library seems very relevant, especially the possibility to use pretrained models. Given the considerable difference between the domains, I am wondering how well the pretrained models (trained on videos) work on medical 3D images?

I have not performed ablations, partly because few large medical datasets are available, especially in 3D. However, I think some initialization from the videos is better than no/random initialization.
Transfer learning from ImageNet seems to work quite well when applied to medical images.

@BresNet I want to try faimed3d. I’m using Colab. I didn’t see in your documentation how to install it.

I have not yet pushed a package to PyPI. You can either clone the repo and make a symlink to the libs, install locally with pip install ., or install from GitHub with pip install git+https://github.com/kbressem/faimed3d. Let me know if you run into bugs; I am happy to fix them.

I ended up getting it to work. If anyone is using Colab, they will need to:
!git clone https://github.com/kbressem/faimed3d
!pip3 install torch torchvision --upgrade
!pip install -q faimed3d/


@BresNet are we able to use PNGs or JPEGs with your library? I have a huge DICOM dataset that is making me run out of RAM even on Colab Pro, and I’ve converted the dataset to PNGs and cut the size by 70%.

No. With PNG/JPEG, you lose the header information from the DICOM and some voxel information due to the conversion to 8-bit. I would suggest you read in the DICOMs and convert them to NIfTI.

dcm = TensorDicom3D.create(fn)
dcm = Resize3D((20, 224, 224))(dcm) # you can resize to e.g. 20 x 224 x 224 to save space. 

Reading the NIfTI from file is also a lot faster than reading the DICOM series.
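To illustrate the voxel-information loss: CT data is typically stored with 12-bit depth, i.e. roughly 4096 distinct Hounsfield values, and squeezing that into the 8-bit range of a PNG collapses many distinct values into one. A small numpy sketch (simulated values, not real DICOM data):

```python
import numpy as np

# Simulated 12-bit CT intensities in Hounsfield units
hu = np.arange(-1024, 3072)            # 4096 distinct values

# Rescale to the 8-bit range, as a PNG conversion would
as_8bit = ((hu - hu.min()) / (hu.max() - hu.min()) * 255).astype(np.uint8)

print(len(np.unique(hu)))       # 4096 distinct values before
print(len(np.unique(as_8bit)))  # only 256 after conversion
```

Roughly 16 different Hounsfield values end up mapped to each 8-bit gray level, which is why converting back is impossible.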

This is amazing. I am also a radiologist keen on medical imaging AI. Thank you for this contribution


I am happy you like it. If you find a bug or missing features, please feel free to raise an issue.

@BresNet Just wondering if it is possible to use the current framework of your extension to load a second channel containing segmentation information?

So essentially where 2dCNNs use RGB channels in [R,G,B](+ positional info) per pixel, to use [intensity, segmentation mask](+positional info) per point.

I’m assuming this would require a custom dataloader to be written, but would this sort of approach work with the framework of fastai + your extension?

This would probably work with just fastai and the DataBlock API.
faimed3d is designed to work with volumetric images, rank-5 tensors, and 3D CNNs, so using it with a 2D CNN would just complicate things.
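For context, "rank 5" here refers to the (batch, channel, depth, height, width) layout that torchvision's video models use; I am assuming faimed3d follows the same convention. A small numpy sketch of what such a batch looks like, with multiple co-registered sequences stacked as channels (shapes are purely illustrative):

```python
import numpy as np

# A batch of 8 single-channel MRI volumes, each 20 slices of 112x112:
# layout is (batch, channel, depth, height, width)
batch = np.zeros((8, 1, 20, 112, 112), dtype=np.float32)

# Stacking four co-registered sequences (e.g. T1, T1wCE, T2, FLAIR)
# along the channel axis gives shape (8, 4, 20, 112, 112)
multi = np.concatenate([batch] * 4, axis=1)
print(batch.ndim, multi.shape)  # 5 (8, 4, 20, 112, 112)
```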

I think I have not explained myself properly sorry, let me try again.

Many MRI datasets contain multiparametric scans, e.g. FLAIR, T1, T1wCE, etc. What I’m wondering is: if I can align these different scans, is it possible to feed that data through the 3D CNNs that are in faimed3d?

So instead of feeding in one MRI type at a time, feed the different scans in at once through the 3D CNNs. The reason I mentioned 2D CNNs is that with colour images we have RGB channels.

So the question is: is it possible to feed different MRI scans in together at once, such that each scan type (FLAIR and T1wCE, for example) goes through the network together?

In your classification example, if the three different axial scans were aligned and on the same plane, would you be able to feed these three different inputs into a CNN at once?

(I was using the example of a mask earlier, but I think what I actually intend to do, having different MRI types as channels, makes more sense anyway.)

Yes, that is possible. I assume you are working on the current Kaggle challenge (RSNA-MICCAI). For this challenge I’ve already preprocessed the data, so all images have identical orientation, spacing, and direction and are easier to read (there are still some small errors in the data, which I aim to correct soon).

The following code would probably work.

import pandas as pd
from faimed3d.all import *  # brings in fastai basics like Path and Learner

DATA_DIR = Path('/path/to/data')

df = pd.read_csv(DATA_DIR/'train_labels.csv')

# dataloaders: 
dls = ImageDataLoaders3D.from_df(df, DATA_DIR/'train', 
                                 fn_col = ['T2w', 'T1wCE', 'T1w', 'FLAIR'], 
                                 label_col = 'MGMT_value', 
                                 item_tfms = ResizeCrop3D((5, 10, 10), (40, 112, 112)), # resize
                                 bs = 8)

# model
model = create_cnn_model_3d(resnet18_3d, 
                            n_in = 4, 
                            n_out = 2, 
                            pretrained = True)

learn = Learner(dls, model, cbs = StackVolumes())

Data: RSNA MICCAI all resampled to axial view | Kaggle


Yes, I was about to update my comment after I found the fn_col arg in the data.py file.

Thanks for the work, I’m finding it super helpful right now!
