Stanford MURA (X-Ray) Classification Competition

Hey, fellow fastai students, I am currently working on the Stanford MURA competition. It is basically an upper extremity X-Ray binary classification dataset. If you are interested, I would love to brainstorm with you.

Here are a few things I have discussed with other students about the competition:


  • The basic training unit is not an individual X-Ray image but an individual X-Ray study, which can consist of multiple images, each representing a different view of the same body part. Individual images do not have labels; only studies do.
  • The images inside the dataset have varying shapes, dimensions, and padding sizes. In short, they are not close to being standardized.

Possible approaches:

  • Label each individual image with the label of the study it belongs to, then train the model using individual images as the basic training unit.
    • Pro: Easy to start. ImageDataBunch works out of the box. At inference time, we can take in all the images of the study and aggregate the per-image predictions into a final result.
    • Con: It could be the case that even in positive studies, not all the images look positive, i.e., abnormal. The body part might look abnormal from only one perspective but perfectly normal from all others, so it is effectively impossible for the model to tell that such an individual image is abnormal. As a result, much of the training data will be mere noise.
  • Merge all the images in a study into one single image.
    • Pro: Easy to start. ImageDataBunch works out of the box. Inference is trivial as well, because the basic training unit is now correctly an individual study.
    • Challenges: What would be a good way to merge the images together?
  • Build a custom architecture that actually takes in multiple images as input.
    • Challenges
      • Architectural design
      • Studies have varying numbers of images
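For the third approach, here is a minimal PyTorch sketch of one way to handle a variable image count. Note that `StudyClassifier` and its masking scheme are my own assumptions, not something from the thread, and the tiny conv encoder is just a placeholder for a pretrained backbone. The idea: encode every image with a shared encoder, mask out padding slots, and max-pool the features across a study's images so any image count produces one study-level logit.

```python
import torch
import torch.nn as nn

class StudyClassifier(nn.Module):
    """Hypothetical sketch: shared per-image encoder + max-pool over a study.

    The tiny conv stack below stands in for a pretrained backbone."""

    def __init__(self, feat_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(feat_dim, 1)

    def forward(self, images, mask):
        # images: (batch, max_imgs, 1, H, W) padded per batch
        # mask:   (batch, max_imgs), 1 for real images, 0 for padding
        b, m, c, h, w = images.shape
        feats = self.encoder(images.view(b * m, c, h, w)).view(b, m, -1)
        # Padding slots get -inf so the max-pool ignores them.
        feats = feats.masked_fill(mask.unsqueeze(-1) == 0, float("-inf"))
        pooled = feats.max(dim=1).values
        return self.head(pooled).squeeze(-1)  # one logit per study
```

Max-pooling also matches the intuition from the first approach's con: a study is abnormal if *any* view looks abnormal, so taking the max over views lets one abnormal view dominate.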

General Strategies

  • You can train a single end-to-end deep learning system that takes in an X-Ray image of an arbitrary body part and tells whether it is abnormal.
  • Since at inference time the body part of the image is given, we can instead train a separate CNN model for each body part.
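The second strategy amounts to a simple dispatch at inference time. A sketch, where the `BODY_PARTS` list reflects the seven upper-extremity parts in MURA but the `predict_study` helper and `models` dict are purely illustrative:

```python
# The seven body parts covered by MURA.
BODY_PARTS = ["ELBOW", "FINGER", "FOREARM", "HAND", "HUMERUS", "SHOULDER", "WRIST"]

def predict_study(study_images, body_part, models):
    """Route a study to the specialist model trained for its body part.

    models: dict mapping a body-part name to a callable that takes the
    study's images and returns an abnormality probability.
    """
    if body_part not in models:
        raise KeyError(f"no model trained for body part {body_part!r}")
    return models[body_part](study_images)
```

Since the MURA file paths encode the body part, the routing key is available for free for both training and inference.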


The leaderboard uses Cohen’s Kappa Coefficient, which is a much stricter metric than plain accuracy. An accuracy of 0.825 might correspond to a kappa of only 0.626.
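To see why kappa is stricter, here is a minimal from-scratch sketch of Cohen’s kappa. (The exact 0.825 → 0.626 relationship depends on the label distribution of the hidden test set; the toy numbers below are just illustrative.)

```python
def cohens_kappa(y_true, y_pred):
    """Cohen's kappa for hard predictions: observed agreement corrected for
    the agreement expected by chance given each side's label frequencies."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n  # observed agreement
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n)
              for c in set(y_true) | set(y_pred))          # chance agreement
    return (p_o - p_e) / (1 - p_e)

# On imbalanced labels, a decent-looking accuracy shrinks considerably:
y_true = [0] * 6 + [1] * 4
y_pred = [0] * 8 + [1] * 2   # accuracy 0.8, but kappa ~ 0.545
```

Because kappa subtracts the agreement a majority-class guesser would get for free, it punishes models that lean on class imbalance.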

Kappa is implemented in fastai as the `KappaScore` metric in `fastai.metrics`.

Kaggle Dataset

I just uploaded the dataset to Kaggle for your convenience. However, since you actually need to sign an agreement online to get the dataset, I decided not to make it public so that Stanford won’t sue me :thinking: I am kidding :rofl: I know they won’t, but just in case.

As a result, I now need to invite you to the dataset for you to get access to it. Please reply with your Kaggle username so I can send the invitations.


Thanks @PegasusWithoutWinds! I am very much interested in brainstorming and collaborating with you on this competition.

Hey, @rsrivastava, what time zone are you in? We can have an audio call to get it kickstarted.

I am in the PST timezone. We can do a WhatsApp call.

I am also interested and would like to participate. I am also in PST. I can bring some domain expertise as a musculoskeletal radiologist.

Oh, that would be wonderful! We are dying for a domain expert. What would be a good time for us to give you an update on the current status of the work?

I am available most evenings, 6-10pm PST.

@rsrivastava Would you like to join?

@PegasusWithoutWinds and @agentili Yes, I would like to join. When are we meeting, and how?

I’m interested too.

Ah, you are most welcome to join! How can we reach you?

I would love to contribute if I can help somehow. I don’t have much domain expertise in this field, but I have some knowledge of deep learning.

I am interested, how can I join you?

So glad to see all the enthusiasm out there! Here is a Google Hangout invitation link.

We can use it to start our meeting at 8:30 pm PST on 03/17. Let me know if Google Hangouts does not work for you.


I’m interested too. I’ve worked on a generic “multi image input” DataBunch for the “human protein atlas” competition.
It would need to be tuned to accept grayscale images instead of RGB, and extended to support “missing” images for studies where not all “views” are present.
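Supporting “missing” views usually comes down to the batch collation step. A sketch of one way to do it, assuming each sample is a `(images, label)` pair with `images` shaped `[num_imgs, C, H, W]` (the `pad_collate` name and sample layout are my assumptions, not the actual DataBunch code):

```python
import torch

def pad_collate(batch):
    """Pad each study to the batch's maximum image count and return a mask
    marking which slots hold real images (1.0) vs. zero-padding (0.0)."""
    max_n = max(imgs.shape[0] for imgs, _ in batch)
    padded, masks, labels = [], [], []
    for imgs, label in batch:
        n, c, h, w = imgs.shape
        pad = torch.zeros(max_n - n, c, h, w)
        padded.append(torch.cat([imgs, pad], dim=0))
        masks.append(torch.tensor([1.0] * n + [0.0] * (max_n - n)))
        labels.append(label)
    return torch.stack(padded), torch.stack(masks), torch.tensor(labels)
```

A function like this can be passed as the `collate_fn` of a PyTorch `DataLoader`, and the mask lets the downstream model ignore the padded slots when pooling across views.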


Ah, this is great!

Here are some more notes I took for myself while playing around with the competition. I cannot guarantee their readability, as I did not originally intend them for a wide audience, but you are welcome to read them if you find them interesting.

Stanford MURA Competition.pdf (194.7 KB)


I’ve updated the code to work with 1.0.50.dev0 .


I am already here in the Hangout.

As a reminder, the link to join is:

Just in case any of you are interested in building the whole thing from raw PyTorch, here is a repo that might serve as a starting point: