Hey, fellow fastai students, I am currently working on Stanford MURA Competition. It is basically an upper extremity X-Ray binary classification dataset. If you are interested, I would love to brainstorm with you.
Here are a few things I have discussed with other students about the competition:
Challenges
The basic training unit is not an individual X-Ray image but an individual X-Ray study that could consist of multiple images, each representing a different view on the same body part. An image does not have a label while a study does.
The images inside the datasets have varying shapes, dimensions, and padding sizes. In a word, they are not close to being standardized.
Countermeasures
Label each individual image according to the label of the study they belong to, then train the model using an individual image as basic training unit.
Pro: Easy to start. ImageDataBunch works out of the box. At inference time, we can just take in all the images of the study, then aggregate them to produce a final result.
Con: It could be the case that even in positive studies, not all the images look positive, i.e., abnormal. The body part might only abnormal from one perspective but perfectly normal in all others. So, it is technically impossible for the model to really tell that the individual image is abnormal. As a result, many of the training data will be mere noises.
Merge all the images in a study into one single image.
Pro: Easy to start. ImageDataBunch works out of the box. Inference time is trivial as well because now the basic training unit is correctly an individual study.
Challenges: What would be a good way to merge the images together?
Build a custom architecture that actually takes in multiple images as input.
Challenge
Architectural design
Studies have varying numbers of images
General Strategies
You can train an end-to-end deep learning system that simply takes in an X-Ray image from an arbitrary body part and tells if it is abnormal.
Since in inference time, the body-part of the image is given, we can train individual CNN models for each body part.
ATTENTION
The leaderboard uses Cohen’s Kappa Coefficient, which is a much stricter metrics than plain accuracy. An accuracy of 0.825 could only give you a kappa of 62.6.
I just uploaded the dataset to Kaggle for your convenience. However, since you actually need to sign a document online to get the dataset, I decided not to make it public so that Stanford won’t sue me I am kidding I know they won’t but just in case.
As a result, now I need to invite you to the dataset in order for you to get access to it. Please reply me with your Kaggle username so I can send the invitations.
Oh, that would be wonderful! We are dying for a domain expert. What would be a good time for us to give you an update on the current status of the work?
I’m interested too. I’ve worked on a generic “multi image input” DataBunch for the “human protein atlas” competition.
It should be tuned to accept RGB images instead of GRAY scale and extended to support “missing” images in the case that not all “views” are present.
Here is some more note I take for myself when playing around with the competition. However, I cannot guarantee its readability as originally I did not intend it to be widely readable to others. You are still welcomed to read it if you find it interesting.