Adding weighted sampler

ilovescience · August 6, 2019, 12:31am

Another way to work it is to do this:

learn = cnn_learner(data,models.resnet50,metrics=[accuracy],callback_fns=[OverSamplingCallback])

I tested it on MNIST over here and showed oversampling improved results on an imbalanced version of the dataset, but of course was still worse than training on the original MNIST.

I think this has more to do with the fact that accuracy is a bad metric for unbalanced datasets. For example, if a dataset is 80% class 1 and 20% class 0, if it predicts class 1 always, accuracy is already up at 80%, but if it is oversampled, then the same approach will only yield 50% accuracy. It might be necessary to use a different metric. For example, you could use an F1 metric.