Medical Imaging

I have been extremely interested in using fastai’s medical imaging library for a while, so I attempted to write a user guide for notebook 60_medical.imaging.ipynb (and improve my own knowledge of the library in the process).

Notebook blogs:

The SIIM_Slim dataset, included in the fastai datasets, is a great way to start experimenting with DICOMs quickly.

The guide also includes an intro to DICOMs, and I have included a starter notebook that looks at some high-level considerations when modeling and evaluating models for medical diagnosis. I also customized a number of functions, for example:

  • DICOMs record the patient ID, so we can easily check whether the same patient appears in both the train and validation sets, which could make results on the validation set overly optimistic.
  • Checking how many ‘normal’ and ‘disease’ images are in the train/validation sets, to help work with unbalanced datasets (work in progress)
  • Evaluating model performance with metrics such as specificity, sensitivity, PPV and NPV, which are all important when determining how a model is doing.
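
The three checks above can be sketched in plain Python. This is only a minimal illustration, not the code from the notebook: `patient_overlap` and `diagnostic_metrics` are hypothetical helper names, the patient IDs are assumed to have already been read from each file's PatientID attribute, and all IDs, labels and counts are made up.

```python
from collections import Counter


def patient_overlap(train_ids, valid_ids):
    """Return the set of patient IDs that appear in both splits."""
    return set(train_ids) & set(valid_ids)


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    A metric whose denominator is zero is reported as None.
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Made-up values, purely for illustration
leaked = patient_overlap(["p1", "p2", "p3"], ["p3", "p4"])
counts = Counter(["normal", "disease", "normal"])  # class balance of one split
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
```

A non-empty `leaked` set means at least one patient has images on both sides of the split.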

Still in the pipeline:

  • The SIIM_Slim dataset is limited in that it does not include some DICOM attributes that are needed to fully utilize the library, such as Rescale Intercept and Rescale Slope, so this is a work in progress: Getting to know DICOMS
  • Provide a more robust end-to-end notebook
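
For context, Rescale Slope and Rescale Intercept define a linear map from stored pixel values to output units (Hounsfield units for CT): output = stored * slope + intercept. Below is a rough sketch of applying that map when the attributes may be missing, as in SIIM_Slim. `rescale` is a hypothetical helper, `SimpleNamespace` only stands in for a DICOM dataset object, and falling back to slope 1 and intercept 0 is my assumption rather than what the library does.

```python
from types import SimpleNamespace


def rescale(pixels, ds):
    """Apply output = stored * slope + intercept to a flat list of stored values."""
    # Assumption: treat missing attributes as the identity transform
    slope = float(getattr(ds, "RescaleSlope", 1))
    intercept = float(getattr(ds, "RescaleIntercept", 0))
    return [p * slope + intercept for p in pixels]


# Stand-ins for DICOM datasets: one with the rescale attributes, one without
ct_like = SimpleNamespace(RescaleSlope=1, RescaleIntercept=-1024)
no_tags = SimpleNamespace()

hu = rescale([0, 1024, 2048], ct_like)
raw = rescale([0, 1024, 2048], no_tags)
```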

If anyone has worked with this library, it would be great to get some insights so that we can improve the documentation. I have drawn on a number of sources, the best being Jeremy’s Kaggle notebooks, of which 5 are specific to medical imaging, and here


Just finished another blog post about DICOMs, Getting to know DICOMS, and I had to customize the show function so that it correctly displays files with multiple frames.

Some DICOM datasets have more than 1 frame per file. If this is the case, the show function throws a TypeError (Invalid shape).

To fix this, I patched the show function so that it checks whether the .dcm file has more than 1 frame; if it does, you can specify how many frames to view:

# Assumes the fastai medical imaging names (patch, delegates, DcmDataset, ...) are already in scope
@patch
@delegates(show_image, show_images)
# The cmap default is assumed to match the library's own show (plt.cm.bone)
def show(self:DcmDataset, frames=1, scale=True, cmap=plt.cm.bone, min_px=-1100, max_px=None, **kwargs):
    px = (self.windowed(*scale) if isinstance(scale,tuple)
          else self.hist_scaled(min_px=min_px,max_px=max_px,brks=scale) if isinstance(scale,(ndarray,Tensor))
          else self.hist_scaled(min_px=min_px,max_px=max_px) if scale
          else self.scaled_px)
    if px.ndim > 2:
        # Multi-frame file: the first dimension indexes the frames
        print(f'{px.shape[0]} frames per file')
        gh = [px[i] for i in range(frames)]
        show_images(gh, cmap=cmap, **kwargs)
    else:
        print('1 frame per file')
        show_image(px, cmap=cmap, **kwargs)
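
One thing to watch with the patch above: if `frames` is larger than the number of frames in the file, `px[i]` will raise an IndexError. A possible guard is to clamp the count first. This is only a sketch: `pick_frames` is a hypothetical helper, shown here on nested lists, though the same indexing should work on an array or tensor whose first dimension is the frame index.

```python
def pick_frames(stack, frames=1):
    """Return the first `frames` frames of `stack`, never more than it holds."""
    n = min(max(frames, 0), len(stack))  # clamp to the range [0, number of frames]
    return [stack[i] for i in range(n)]


# Three tiny 1x1 "frames", purely for illustration
stack = [[[1]], [[2]], [[3]]]
first_two = pick_frames(stack, 2)
everything = pick_frames(stack, 10)  # asks for more frames than exist
```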

The other thing I was surprised by is that show_images does not currently allow you to pass in a colormap, but a small modification fixed that:

import math

# `subplots` and `show_image` are the fastai helpers, assumed to be in scope
def show_images(ims, nrows=1, ncols=None, titles=None, cmap=None, **kwargs):
    "Show all images `ims` as subplots with `nrows` rows using `titles`"
    if ncols is None: ncols = int(math.ceil(len(ims)/nrows))
    if titles is None: titles = [None]*len(ims)
    axs = subplots(nrows, ncols, **kwargs)[1].flat
    # Pass the colormap through to every subplot
    for im,t,ax in zip(ims, titles, axs): show_image(im, ax=ax, title=t, cmap=cmap)