Notebook: COVID-19 X-ray, 3.7% error rate

Hi everybody,

I hope everyone is doing fine during these terrible days. I wanted to thank everybody for the great contributions, I am really happy to be part of this amazing community.

As I started Course 1 recently (though I have an advanced maths and CS background, which makes it easier), I thought it would be interesting to try detecting COVID from X-rays. Especially since I am Italian and many of my compatriots (many more than reported) are suffering from the disease and can’t be tested.

We, computer scientists and AI researchers, have an obligation to help others out. Unfortunately, I couldn’t find a single database containing all the images I needed, so I had to mix different databases.

It’s a work in progress and I will be happy to hear any criticism, suggestions, or even collaboration ideas.

I think 96% accuracy with 240 images collected from different sources is a pretty good start.

Here you will find the Kaggle notebook



Ciao Michele,

it is definitely a good start for better understanding the things Jeremy teaches in Course 1. Unfortunately, in data science a single metric is not enough to tell you that you are on the right path.
In your experiment you are working with an unbalanced dataset, which is usually not a good thing. My advice is to try with two datasets built out of the original one:
<92 covid_df images, 92 healthy patient images>, and
<92 covid_df images, 92 pneumonia patient images>
If the results are consistent with what you already have, then you will have a further hint that you are doing well :wink:
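A minimal pandas sketch of that subsetting, assuming a hypothetical labels dataframe with `fname` and `label` columns (the column names, counts, and `balanced_subset` helper are illustrative, not from Michele's notebook):

```python
import pandas as pd

# Hypothetical labels dataframe: one row per image, with a 'label' column.
# The class counts here are made up for illustration (92 covid images as in the thread).
df = pd.DataFrame({
    "fname": [f"img_{i}.png" for i in range(300)],
    "label": ["covid"] * 92 + ["healthy"] * 104 + ["pneumonia"] * 104,
})

def balanced_subset(df, pos_label, neg_label, n=92, seed=42):
    """Sample n images from each of two classes to build a balanced dataset."""
    pos = df[df.label == pos_label].sample(n, random_state=seed)
    neg = df[df.label == neg_label].sample(n, random_state=seed)
    # Concatenate and shuffle so the classes are interleaved.
    return pd.concat([pos, neg]).sample(frac=1, random_state=seed)

covid_vs_healthy = balanced_subset(df, "covid", "healthy")
covid_vs_pneumonia = balanced_subset(df, "covid", "pneumonia")
print(covid_vs_healthy.label.value_counts().to_dict())
```

Training one model per balanced pair and comparing the metrics against the unbalanced run is exactly the consistency check described above.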


Hi, Michele.

Great work and a very good first step. I am also working on this dataset. Some comments:

As some have pointed out before in the original COVID image repository, the pneumonia chest X-ray dataset is not a very good source for cases other than COVID-19, because it contains only children. So the model may be learning to differentiate children’s chests from older people’s chests. You could use the Kaggle RSNA dataset, as they suggested. Also, there are some patients with more than one X-ray in the COVID-19 dataset. If the same patient is present in both the train and test sets, you may have a data leak. I suggest you select as test images 12 images from patients that have only one X-ray, to prevent that.
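A small sketch of that patient-level split, assuming a hypothetical metadata dataframe with a `patient_id` column (the column names and toy data are assumptions, not the actual dataset schema):

```python
import pandas as pd

# Hypothetical metadata: each image row carries a patient_id;
# some patients (A and D here) have multiple X-rays.
meta = pd.DataFrame({
    "fname": ["a1.png", "a2.png", "b1.png", "c1.png", "d1.png", "d2.png"],
    "patient_id": ["A", "A", "B", "C", "D", "D"],
})

# Count images per patient, then draw test images only from patients
# that appear exactly once, so no patient can straddle the split.
counts = meta.patient_id.value_counts()
single = meta[meta.patient_id.map(counts) == 1]
test_set = single.sample(n=2, random_state=0)
train_set = meta.drop(test_set.index)

# Sanity check: no patient appears in both sets.
leak = set(train_set.patient_id) & set(test_set.patient_id)
```

With the real dataset you would sample `n=12` from `single`, as suggested above.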

Here is the place where you can download the RSNA Dataset:

Hope I could help.


Thanks a lot for your answer, Fabrizio. I already tried with <92 covid_df images, 92 healthy patient images>; the results were in line with the current ones. I will try with <92 covid_df images, 92 pneumonia patient images>. Can you tell me why, or point me to the course where I can understand why, it is important to assess the model with multiple metrics?

Thanks, this makes a lot of sense. I should have focused more on the data preparation part.
By the way, did you take a look at the MIT article?
They say they will provide a dataset of 5,941 images taken from 2,839 patients
with various lung conditions!

You’re welcome! About your question: I’d say that different metrics are like different lenses through which you can look at your results. We should always be cautious in ML (actually, being critical of our results sounds better): we should have a clear idea of why we are getting those results, in terms of data quality and model capacity. These concepts need time and practice to make sense, though. So I can point you to a nice book you have to (:smile:) read, along with the fastai book. It gives you a good overview of many topics, and the author is a very good one:

Hands-On Machine Learning

you can use the free trial to read it online.

Thanks for your answer. I’ve already taken a look at the book and it looks great!

You’ll need the precision, recall, and AUC metrics to get a better understanding of how your model is performing. Sensitivity, specificity, PPV, and NPV are the metrics used to assess test kits in medicine; accuracy and error rate are not good metrics for developing one.
A bad test kit that struggles to detect the disease in question will still have about 90% accuracy if 1,000 people are tested and just 100 have the disease: predicting "negative" for everyone already gets 900 out of 1,000 right. Read more about them.
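The arithmetic behind that example can be worked through in a few lines of plain Python (the "test kit" here is a deliberately useless classifier that always predicts negative, matching the scenario above):

```python
# Toy scenario from the post: 1,000 people tested, 100 actually have
# the disease, and a useless "test" that always says negative.
y_true = [1] * 100 + [0] * 900
y_pred = [0] * 1000

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)           # 0.9 -- looks great
sensitivity = tp / (tp + fn)                 # recall: 0.0 -- the test finds no one
specificity = tn / (tn + fp)                 # 1.0 -- trivially perfect on negatives
ppv = tp / (tp + fp) if (tp + fp) else 0.0   # precision: undefined, treated as 0
npv = tn / (tn + fn)                         # 0.9

print(accuracy, sensitivity, specificity)    # 0.9 0.0 1.0
```

The 90% accuracy comes entirely from the class imbalance; sensitivity immediately exposes that the test detects nobody, which is why accuracy alone is misleading here.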