Interpreting dice score from 10% vs 100% training data

If you get a 0.90 Dice score from 10% of the training data, does it mean 100% of the data should give the same result?

I applied U-Net to a segmentation problem using 10% of the training data and reached a solid Dice score (0.90) after 10 epochs. Is this a good hint that training on the full dataset should give a similar or better Dice score, or is that hard to predict?

Getting a 0.9 Dice score when training on 10% of the data doesn't guarantee that training on 100% will give the same result (another 0.9). The remaining 90% of the data may contain patterns that never appeared in the 10% you used to train your model.

What you could do is run cross-validation (CV) with stratified folds to get a clearer idea of how your model is performing.
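
Here is a rough sketch of what that could look like in plain Python, just to make the idea concrete. It assumes you can give each image a stratum label (for example, whether its mask is empty, or a bin of its foreground fraction; that choice is mine, not something from your setup), and that `train_and_score` is a hypothetical function you write yourself that trains U-Net on the training indices and returns the mean Dice score on the validation indices.

```python
import random
from statistics import mean, stdev


def dice(pred, target, eps=1e-7):
    """Dice score for two flat binary masks given as sequences of 0/1."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2 * intersection + eps) / (sum(pred) + sum(target) + eps)


def stratified_folds(strata, n_folds=5, seed=0):
    """Split sample indices into folds, spreading each stratum across folds."""
    rng = random.Random(seed)
    by_stratum = {}
    for idx, label in enumerate(strata):
        by_stratum.setdefault(label, []).append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_stratum.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this stratum round-robin over the folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def cross_validate(strata, train_and_score, n_folds=5, seed=0):
    """Run stratified CV; return (mean, standard deviation) of fold scores.

    train_and_score(train_idx, val_idx) is a hypothetical callable that
    trains a fresh model and returns its validation Dice score.
    """
    folds = stratified_folds(strata, n_folds, seed)
    scores = []
    for k, val_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        scores.append(train_and_score(train_idx, val_idx))
    return mean(scores), stdev(scores)
```

If the spread of the fold scores is small, the 0.90 is less likely to be an artifact of one lucky split. It still doesn't tell you what score you would get after training on all of the data.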
