Stanford MURA (X-Ray) Classification Competition

Waiting for your Discord group link here. :wink: @soco_loco

Hello everyone. At our meeting today we discussed forming a subgroup for Part 2 that will focus on applying the lessons from each week to a medical theme.

For example, in our prep for Lecture 8 and the beginning of Part 2, we discussed the Stanford MURA X-ray competition and how best to approach that project.

The goal next week, after Lecture 8, would be to find an appropriate model to implement, or a problem to solve, with a medical theme so that we can practice the new skills as a team.

The hope is that we can find people from all perspectives and disciplines who are enthusiastic about the medical applications of Fast.AI to collaborate on learning and creating.

Hope to see you there.


Here is my initial exploration notebook, just for understanding the dataset.
Looking forward to working on this with you all.


I’ve uploaded an updated version of the MultiChannelImage data loader, but after taking a close look at the problem and the samples, I think that component is not useful in this case.

I used it in the “Protein Atlas Competition” because there each sample was composed of multiple channels (R, G, B and YELLOW) “linked” together as a single image, so the same data augmentation transformations had to be applied to all the channels.

In this situation we have

  1. multiple images (1…11 AFAIK) per sample
  2. each image is RGB
  3. different resolution/orientation for each image in sample

So I think that here it’s important to have independent data augmentation for each image and the possibility of a variable number of images per sample.
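
A minimal sketch of what those two requirements could look like in PyTorch. The names `augment` and `collate_studies` are hypothetical, the random flip is only a placeholder augmentation, and the images are assumed to have already been resized to a common size:

```python
import random
import torch

def augment(img):
    """Placeholder per-image augmentation (assumption): random horizontal flip."""
    return img.flip(-1) if random.random() < 0.5 else img

def collate_studies(batch):
    """Collate a list of (list_of_image_tensors, label) pairs.

    Every study is padded with zero images up to the longest study in the
    batch; pad_mask is True at the padded positions.
    """
    n_max = max(len(imgs) for imgs, _ in batch)
    c, h, w = batch[0][0][0].shape
    images = torch.zeros(len(batch), n_max, c, h, w)
    pad_mask = torch.ones(len(batch), n_max, dtype=torch.bool)
    for i, (imgs, _) in enumerate(batch):
        for j, img in enumerate(imgs):
            images[i, j] = augment(img)  # each image gets its own random draw
            pad_mask[i, j] = False
    labels = torch.tensor([label for _, label in batch], dtype=torch.float32)
    return images, pad_mask, labels

# Two toy studies: one with a single image, one with three.
random.seed(0)
batch = [
    ([torch.rand(1, 32, 32)], 0),
    ([torch.rand(1, 32, 32) for _ in range(3)], 1),
]
images, pad_mask, labels = collate_studies(batch)
print(images.shape, pad_mask.tolist(), labels.tolist())
```

A function like this could be passed as `collate_fn` to a `torch.utils.data.DataLoader`; the mask lets a downstream model ignore the padded slots.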

Don’t think merging all images into a single image will work well for this. Having an RGB image of frontal, lateral and oblique radiographs overlaid on each other will not be very helpful.

The dataset is also very messy - you will notice some images are inverted due to differences in technique (black on white vs white on black).

There is also a very easy sub problem that might dominate a classification approach without more labeling - many of the “abnormal” xrays have metallic hardware or casts, which are much easier to spot/less clinically relevant than more interesting abnormalities like fractures, tumors, etc.


Images aren’t really RGB, just saved that way for the competition.

Interesting point: converting them to BW can reduce memory usage!

Except that when you use a CNN model pre-trained on ImageNet, you have to convert them back to RGB again. :thinking:

You don’t have to: you can sum the first layer’s weights along the input-channel dimension and change that layer to take a 1-channel input.
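
A minimal sketch of that trick, assuming PyTorch. The helper name `to_single_channel` is hypothetical, and the randomly initialised conv below only stands in for the first layer of a pretrained network:

```python
import torch
import torch.nn as nn

def to_single_channel(conv: nn.Conv2d) -> nn.Conv2d:
    """Build a 1-channel conv whose kernels are the input-channel sum of `conv`'s."""
    new_conv = nn.Conv2d(
        in_channels=1,
        out_channels=conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Conv2d weights have shape (out_channels, in_channels, kH, kW).
        new_conv.weight.copy_(conv.weight.sum(dim=1, keepdim=True))
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv

# Stand-in for a pretrained 3-channel first layer (random weights, an assumption).
torch.manual_seed(0)
rgb_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
gray_conv = to_single_channel(rgb_conv)

gray = torch.randn(2, 1, 64, 64)
rgb = gray.repeat(1, 3, 1, 1)  # the same grayscale image copied into 3 channels
max_diff = (rgb_conv(rgb) - gray_conv(gray)).abs().max().item()
print(gray_conv.weight.shape, max_diff)
```

The two layers agree (up to floating-point error) only when all three input channels carry the same values. Per-channel ImageNet normalisation makes the channels differ, so in that case the summed layer is an approximation rather than an exact replacement.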


Yeah, I know they are equivalent. Just don’t find the performance boost worth the effort of customizing the pretrained network. Just me being lazy.


I recommend trying to push each image in a study through a CNN, stacking the CNN outputs into a sequence, and putting that sequence through a transformer or LSTM. If you have the GPU for it, I would recommend a transformer offhand: you can remove the positional encodings, since intuitively the “order of images” doesn’t matter at all, while still being able to attend to variable-length sequences.

This weekend if I have some extra time I’ll try to put a notebook together illustrating the approach. What sorts of pretrained CNN have you been trying/having success with?

Incidentally if you want to check out my repo here :

You can find a battle-ready implementation of a transformer that has been adapted specifically for situations like this (time series forecasting/classification) as opposed to traditional NLP approaches. The classifier has a modified head compared with the original paper, and there is an example of how to use it.

Edit: to clarify, all you’ll really need is the encoder and head from the existing transformer architecture, with the CNN set as your embedding model.
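
A rough sketch of the CNN-as-embedding plus transformer-encoder idea, assuming PyTorch 1.11 or newer. This is not the code from the repo mentioned above: `StudyClassifier`, its hyperparameters, the tiny stand-in CNN and the mean-pooling head are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class StudyClassifier(nn.Module):
    """Embed each image with a CNN, then mix the set of images with a transformer encoder."""

    def __init__(self, embed_dim=64, n_heads=4, n_layers=2):
        super().__init__()
        # Tiny stand-in CNN (assumption); a pretrained backbone would go here.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, embed_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, dim_feedforward=128, batch_first=True
        )
        # No positional encoding is added anywhere: image order carries no meaning.
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=n_layers, enable_nested_tensor=False
        )
        self.head = nn.Linear(embed_dim, 1)

    def forward(self, images, pad_mask):
        # images: (B, N, C, H, W); pad_mask: (B, N), True where the slot is padding.
        b, n = images.shape[:2]
        feats = self.cnn(images.flatten(0, 1)).reshape(b, n, -1)
        encoded = self.encoder(feats, src_key_padding_mask=pad_mask)
        # Mean-pool over the real (non-padded) images only.
        encoded = encoded.masked_fill(pad_mask.unsqueeze(-1), 0.0)
        counts = (~pad_mask).sum(dim=1, keepdim=True).clamp(min=1)
        return self.head(encoded.sum(dim=1) / counts).squeeze(-1)

torch.manual_seed(0)
model = StudyClassifier().eval()
images = torch.randn(2, 4, 1, 32, 32)
pad_mask = torch.tensor([[False, False, False, False],
                         [False, False, True, True]])
with torch.no_grad():
    logits = model(images, pad_mask)
    # Swap the first two images of every study; the mask is unchanged.
    shuffled = model(images[:, [1, 0, 2, 3]], pad_mask)
print(logits.shape, torch.allclose(logits, shuffled, atol=1e-4))
```

Because there are no positional encodings and the pooling is a mean over the real images, reordering the images within a study leaves the prediction unchanged up to floating-point error, and the padding mask handles studies of different lengths.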


Larry Wall’s Three Virtues:

  1. Laziness : The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don’t have to answer so many questions about it.
  2. Impatience : The anger you feel when the computer is being lazy. This makes you write programs that don’t just react to your needs, but actually anticipate them. Or at least pretend to.
  3. Hubris : The quality that makes you write (and maintain) programs that other people won’t want to say bad things about.

Hey all!
I’m not working on medical images, but I’m going to work on X-ray images of weld seams soon. I thought it might be useful to try to create a big X-ray dataset from various sources (not necessarily only medical). I could imagine that features in X-ray images have some unique properties compared to features from pictures in the visible spectrum. Edges are more blurry, for example, and you always have the effect that parts at a greater depth shine through objects at a lesser depth.

What do you think? Does it make sense to try to create a generalised model for x-ray specific feature detection and then retrain it for specific applications? Or is the generalised vision in modern imagenet models so good that it will outperform anything that can be transferred to the visible spectrum without loss of information?


Hi. I just published my Medium post + Jupyter notebook about the MURA competition.

My goal was to assess how far the standard fastai method could go in the search for better accuracy/kappa in the radiology domain, without any knowledge of radiology.

However, to go beyond a kappa of 0.642 (my score with the standard fastai method), I need a more complete understanding of the field of radiology and more DL experiments, I think.
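
For readers unfamiliar with the metric, here is a small sketch assuming the kappa in question is Cohen’s kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. The function name and the toy labels are illustrative only:

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(y_true)
    # Observed agreement: fraction of positions where the labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy example: 10 studies, 8 of them classified correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(y_true, y_pred)
print(round(kappa, 3))  # p_o = 0.8, p_e = 0.52, so kappa = 0.28 / 0.48, about 0.583
```

The toy example shows why kappa sits below raw accuracy: 80% accuracy corresponds to a kappa of only about 0.58 once chance agreement is discounted.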

By the way, the following PDF is great for this purpose :slight_smile:


Hey @kai, so glad to see you here! I think it is a great idea! Indeed, I myself have been thinking along the same lines. Sometimes I just feel that X-ray images are special enough that they do not work as well with models pre-trained on ImageNet as natural pictures do. I would love to see how an “X-RayNet” pre-trained model could improve performance on a specific X-ray dataset in transfer learning, compared to an ImageNet model.

Wow, did not know these before. I used to do a lot of premature optimization so I have to teach myself to be lazy nowadays.

Ah, please try out the idea and let us know how it does. Would love to see.

Yes it absolutely makes sense. I know at least one group who is working seriously on this kind of project. They will likely release their pretrained weights eventually on a variety of model architectures.

Very interesting! Do you have some more information on this?

Oh! That is exciting! Could you provide some information here?