The dataset for the RSNA Screening Mammography Breast Cancer Detection competition on Kaggle includes attributes such as machine_id or Laterality. As a beginner, I’d appreciate any guidance or pointers on how these can be incorporated into an image classifier. My googling on the topic is raising more questions than answers.
If the answer is to use multi-label classification, with labels such as “Y 49” to indicate a cancer diagnosis from machine 49, then how do you get a prediction that only considers the cancer diagnosis? Similarly, how do you get a prediction that only considers the cancer diagnosis if single-label classification is used?
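For the combined-label case, one possible reading (a sketch, not a tested competition approach) is to treat each “diagnosis machine” pair as one class and then sum the predicted probabilities over every class whose diagnosis part is “Y”. The `cancer_probability` helper and the probabilities below are made up for illustration.

```python
from collections import defaultdict

def cancer_probability(class_probs):
    """Sum the probabilities of every combined label whose diagnosis part is 'Y'.

    class_probs maps combined labels such as "Y 49" (diagnosis, machine_id)
    to the model's predicted probability for that label.
    """
    totals = defaultdict(float)
    for label, p in class_probs.items():
        diagnosis, _machine = label.split()
        totals[diagnosis] += p
    return totals["Y"]

# Hypothetical softmax output over four combined classes (sums to 1).
probs = {"Y 49": 0.10, "N 49": 0.60, "Y 21": 0.05, "N 21": 0.25}
print(cancer_probability(probs))
```

If the model instead has a separate output per attribute (true multi-label), no summing is needed: you would just read off the cancer output and ignore the rest.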
How should you proceed if you take an ensemble approach, create a separate model for each machine_id, and store the trained learner objects in a list? (This seems to be the most logical approach to me.)
Is it possible to incorporate some form of control structure such as “if … then” or “select case” so that a particular trained learner object is used depending on the machine_id?
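In Python, the “select case” idea is usually written as a dictionary lookup rather than a chain of if statements. Here is a minimal sketch, assuming one trained model per machine_id; `StubLearner` is a hypothetical stand-in for a real trained learner, and its `predict` return value is simplified to a single probability.

```python
class StubLearner:
    """Hypothetical stand-in for a trained learner; returns a fixed probability."""

    def __init__(self, prob):
        self.prob = prob

    def predict(self, image):
        return self.prob

# One learner per machine_id, keyed by the id (ids are illustrative).
learners = {49: StubLearner(0.2), 21: StubLearner(0.7)}

# Model trained on all machines, used when a machine_id was never seen.
fallback = StubLearner(0.5)

def predict_for_machine(machine_id, image):
    """Pick the learner for this machine_id, or the fallback if there is none."""
    learner = learners.get(machine_id, fallback)
    return learner.predict(image)

print(predict_for_machine(49, image=None))
print(predict_for_machine(999, image=None))
```

A dictionary keeps the lookup explicit and makes the unseen-machine case easy to handle. One thing to watch with this design is that each per-machine model only sees a fraction of the training data.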
On Kaggle, I found @jeremy’s notebooks from three years ago very helpful while attempting RSNA Screening Mammography Breast Cancer Detection. They walk through a similar project: RSNA Intracranial Hemorrhage Detection (RSNA Intracranial Hemorrhage Detection | Kaggle).
I have never done this, and I am not sure it can be done directly with fastai. However, search for “multimodal deep learning”, which is the term for learning from mixed kinds of data.
In your case, the underlying idea is that different machines might have slightly different renderings of the same case.
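To make the multimodal idea a bit more concrete, here is a rough sketch of one common pattern (late fusion): take the feature vector a CNN produces for the image, append the encoded metadata, and feed the combined vector to a final classifier. The function name, the 512-dimensional feature size, the machine ids, and the age scaling are all assumptions for illustration; the random vector stands in for real CNN features.

```python
import numpy as np

def fuse_features(image_features, machine_id, age, known_machines, age_scale=100.0):
    """Concatenate image features with one-hot machine_id and scaled age."""
    # One-hot encode the machine; an unknown machine stays all zeros.
    one_hot = np.zeros(len(known_machines), dtype=np.float32)
    if machine_id in known_machines:
        one_hot[known_machines.index(machine_id)] = 1.0
    # Scale age into roughly [0, 1] so it is comparable to the other inputs.
    age_feature = np.array([age / age_scale], dtype=np.float32)
    return np.concatenate([image_features.astype(np.float32), one_hot, age_feature])

known_machines = [21, 29, 49]  # illustrative ids
image_features = np.random.default_rng(0).normal(size=512)  # stand-in for CNN output
fused = fuse_features(image_features, machine_id=49, age=61, known_machines=known_machines)
print(fused.shape)  # (516,)
```

The fused vector would then go into a small fully connected head that is trained together with, or on top of, the image model.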
Hi Vincenzo
Many thanks for the suggestion - much appreciated
I created an approach that did what I was trying to do, but it scored badly for both machine_id and age.
All the best
Julian
If it is just classification and not localization, the silliest, yet not totally absurd, solution could be to “print” the data in some unused corners of the image. This will provide some extra features, but be careful about biases.
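A minimal sketch of the corner trick, assuming 8-bit grayscale images stored as NumPy arrays and assuming the top-left corner is background (with mammograms that may depend on laterality, so check first). The function name, patch size, and value ranges are made up for illustration.

```python
import numpy as np

def stamp_metadata(image, machine_id, age, patch=16, max_machine_id=255, max_age=100):
    """Return a copy of the image with two constant patches in the top-left corner.

    The first patch encodes machine_id and the second encodes age, each
    rescaled to the 0-255 intensity range.
    """
    stamped = image.copy()
    m_val = np.uint8(round(255 * min(machine_id, max_machine_id) / max_machine_id))
    a_val = np.uint8(round(255 * min(age, max_age) / max_age))
    stamped[:patch, :patch] = m_val
    stamped[:patch, patch:2 * patch] = a_val
    return stamped

image = np.zeros((256, 256), dtype=np.uint8)  # blank stand-in for a mammogram
stamped = stamp_metadata(image, machine_id=49, age=61)
print(stamped[0, 0], stamped[0, 16])
```

Note that augmentations such as random crops, flips, or rotations can move or cut off the patches, so they would need to be stamped after augmentation or the augmentations restricted.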