How to split data at patient level instead of image level?

sky1ove · October 18, 2020, 9:18pm

Hi,

I trained a classification model on patients' DICOM files. Each patient has a list of image files. During training, I found the validation accuracy was suspiciously high (up to 99%). I realized this is because the data is split by image instead of by patient, so images from the same patient end up in both the training and validation sets, and the model is effectively cheating.

How can I split the data at the patient level instead of the image level? Thanks!

amritv · October 19, 2020, 1:06am

You could have a look at this issue: https://github.com/fastai/fastai/issues/2724
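
For reference, here is a minimal sketch of the general idea in plain Python, not taken from the linked issue: shuffle the unique patient IDs rather than the images, hold out a fraction of the patients, and assign every image to whichever side its patient landed on. The function name `split_by_patient`, the `valid_pct` and `seed` parameters, and the sample IDs are all hypothetical; it assumes you can build a list with one patient ID per image (for example from the DICOM PatientID field) in the same order as your image list.

```python
import random

def split_by_patient(patient_ids, valid_pct=0.2, seed=42):
    """Return (train_idxs, valid_idxs) such that no patient appears in both."""
    # Shuffle the unique patients, not the individual images.
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of the patients (at least one) for validation.
    n_valid = max(1, round(len(patients) * valid_pct))
    valid_patients = set(patients[:n_valid])

    # Every image follows its patient into the train or validation side.
    train_idxs = [i for i, pid in enumerate(patient_ids) if pid not in valid_patients]
    valid_idxs = [i for i, pid in enumerate(patient_ids) if pid in valid_patients]
    return train_idxs, valid_idxs

# Hypothetical data: one patient ID per image, aligned with the image list.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
train_idxs, valid_idxs = split_by_patient(patient_ids, valid_pct=0.33)
```

The resulting `valid_idxs` could then be handed to an index-based splitter such as fastai's `IndexSplitter`. Note that `valid_pct` here is a fraction of patients, so the fraction of images in the validation set will differ if patients have different numbers of images. If scikit-learn is available, `GroupShuffleSplit` with `groups=patient_ids` does the same kind of grouped split.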