I'm a radiologist in Canada, relatively new to the deep learning world. I'm leading a group working on detecting manifestations of osteoporosis on spine images in the elderly. We have a good labeled dataset, but I think transfer learning from a previously successful model would help. From searching this forum and elsewhere, I get the sense that an ImageNet-trained model probably wouldn't have ideal filters for greyscale medical images, and I'm not finding much else to go on. There is a published paper that used ImageNet converted to greyscale for pretraining, and I think the authors also tried selecting image categories that seemed more similar to the features of radiography, so this will probably be our first approach.
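For concreteness, here is a minimal sketch of the kind of greyscale conversion I have in mind. I don't know which conversion the paper actually used; this assumes the common ITU-R BT.601 luma weighting, and the function names `rgb_to_grey` and `image_to_grey` are my own, purely for illustration.

```python
def rgb_to_grey(pixel):
    """Convert one (R, G, B) pixel with 0-255 channels to a single grey level.

    Assumption: BT.601 luma weights, a common (but not the only) choice.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def image_to_grey(image):
    """Convert an image given as rows of (R, G, B) tuples to rows of grey levels."""
    return [[rgb_to_grey(p) for p in row] for row in image]
```

In practice an image library would do this in one call on whole arrays; the point is only that each colour image collapses to a single channel before pretraining.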
My first question is: does anyone know of any available very large datasets or pretrained models that would be useful for transfer learning in medical radiography? If so, I would love to hear about them.
My second question is: would it be feasible and useful to create our own pretrained model from a similar but much larger dataset with simple labels and then finetune it on the labeled dataset of interest?
My radiology group includes a very busy orthopedic and sports medicine practice with hundreds of thousands (I think 500 000 is probably a conservative estimate) of high-quality digital radiographs of all parts of the body (but primarily extremities). The storage system records which body part each image shows, and patient age is readily accessible. We might be able to mine the electronic patient record for a few other easily obtained labels (which patients were sent to the cast clinic for treatment of a fracture seems like an easy one), but that is a separate system that might be harder to access. I'm wondering whether the convolutional layers from good network architectures trained on easy-to-obtain labels, especially body part and approximate patient age category (which would hopefully capture developmental and degenerative changes), might be useful for all kinds of transfer learning in the field down the road.
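To make the idea concrete, here is a rough sketch of what I am picturing, assuming PyTorch. Everything in it is a placeholder of my own invention: the tiny `SmallBackbone` architecture, the age bin edges, and the function names. A real project would use a proven architecture and carefully chosen bins. The point is only the shape of the workflow: pretrain a shared convolutional backbone on body part and age category, then reuse it with a new head for the small labeled dataset.

```python
import bisect

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical age bin edges in years; 4 edges give 5 categories.
AGE_EDGES = [18, 40, 65, 80]


def age_to_bin(age_years):
    """Map an age in years to a category index 0..len(AGE_EDGES)."""
    return bisect.bisect_right(AGE_EDGES, age_years)


class SmallBackbone(nn.Module):
    """Tiny single-channel convolutional feature extractor (placeholder only)."""

    def __init__(self, feat_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.features(x)


class PretrainNet(nn.Module):
    """Shared backbone with one classification head per easy label."""

    def __init__(self, backbone, feat_dim, n_body_parts, n_age_bins):
        super().__init__()
        self.backbone = backbone
        self.body_part_head = nn.Linear(feat_dim, n_body_parts)
        self.age_head = nn.Linear(feat_dim, n_age_bins)

    def forward(self, x):
        f = self.backbone(x)
        return self.body_part_head(f), self.age_head(f)


def pretrain_step(model, optimizer, images, body_part, age_bin):
    """One optimisation step on the sum of the two cross-entropy losses."""
    model.train()
    optimizer.zero_grad()
    bp_logits, age_logits = model(images)
    loss = F.cross_entropy(bp_logits, body_part) + F.cross_entropy(age_logits, age_bin)
    loss.backward()
    optimizer.step()
    return loss.item()


def make_finetune_model(backbone, feat_dim, n_classes, freeze_backbone=True):
    """Reuse the pretrained backbone under a fresh head for the target task."""
    if freeze_backbone:
        for p in backbone.parameters():
            p.requires_grad = False
    return nn.Sequential(backbone, nn.Linear(feat_dim, n_classes))
```

After pretraining, something like `make_finetune_model(backbone, 64, n_classes=2)` would give a network for the osteoporosis task in which only the new head is trained at first; whether to unfreeze the backbone later is exactly the kind of thing I would want advice on.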
If this seems likely to succeed and be useful to the broader community, I would be willing to spearhead a project to make it happen (obtain the ethics and privacy approvals, secure grant funding for a multi-GPU server and the like, though I would probably need some technical help to make sure things are done at a state-of-the-art level), publish the results, and then make a set of pretrained networks openly available to the world. Does this seem like something others might be interested in?
Thanks for reading,