Oversampling gives worse results

I have this particular data distribution.

To improve my training results I thought I would use the OverSamplingCallback to re-balance the distribution. But when I use it, I get significantly worse results.

Assuming I’ve chosen good learning rates, and kept all the hyperparameters constant, is there any reason why that would be?

FYI I am using this in a regression problem rather than a classification problem. Would this have any effect?

The following is the Oversampling callback being used:

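# fastai v1: brings Learner, LearnerCallback, np and torch into scope
from fastai.basics import *
from torch.utils.data.sampler import BatchSampler, WeightedRandomSampler

class OverSamplingCallback(LearnerCallback):
    def __init__(self,learn:Learner,weights:torch.Tensor=None):
        super().__init__(learn)  # sets self.learn, which the lines below rely on
        self.labels = self.learn.data.train_dl.dataset.y.items
        _, counts = np.unique(self.labels,return_counts=True)
        # The fallback after `else` was cut off in the post; inverse class
        # frequency, as in the fastai v1 source, is assumed here.
        self.weights = (weights if weights is not None else
                        torch.DoubleTensor((1/counts)[self.labels]))
        self.label_counts = np.bincount([self.learn.data.train_dl.dataset.y[i].data for i in range(len(self.learn.data.train_dl.dataset))])
        self.total_len_oversample = int(self.learn.data.c*np.max(self.label_counts))

    def on_train_begin(self, **kwargs):
        # Replace the batch sampler with one that draws samples with replacement according to self.weights
        self.learn.data.train_dl.dl.batch_sampler = BatchSampler(WeightedRandomSampler(self.weights,self.total_len_oversample), self.learn.data.train_dl.batch_size,False)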
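
To make the weighting concrete, here is a minimal standard-library sketch of what the callback appears to do, assuming its default weights are inverse class frequencies. The helper names `inverse_frequency_weights` and `oversampled_indices` are hypothetical, `random.choices` stands in for `WeightedRandomSampler`, and the 80/20 label list is made up for illustration.

```python
import random
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-sample weight = 1 / (number of samples sharing that label)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]

def oversampled_indices(labels, seed=0):
    """Draw indices with replacement so each label is equally likely per draw."""
    weights = inverse_frequency_weights(labels)
    counts = Counter(labels)
    # Mirrors the callback's epoch length, data.c * max(label_counts),
    # taking data.c to be the number of distinct labels (an assumption).
    n_draws = len(counts) * max(counts.values())
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=n_draws)

labels = [0] * 8 + [1] * 2            # hypothetical 80/20 imbalance
idx = oversampled_indices(labels)
resampled = Counter(labels[i] for i in idx)  # label counts after resampling
```

Note that the rare samples are repeated many times within one epoch, so the training distribution no longer matches the original one.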

Oversampling is usually done on classification data, not regression. I’m curious as to why you want to do this for regression?

The best way would be to look at some of the Kaggle competitions with unbalanced regression problems. There are some, but I can’t remember their names right now.

That’s the bit I wasn’t certain about, actually: whether oversampling is useful for regression problems. For context, I am working on the 2019 diabetic retinopathy competition on Kaggle.

I am currently getting better results treating it as a regression problem rather than as a classification problem, so I wanted to see if I could improve on that using oversampling.

But maybe oversampling is useful only if you are dealing with a classification problem?
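
For what it's worth, nothing in weighted sampling itself requires class labels. One way to try the idea on a regression target is to bin the targets and weight each sample by the inverse population of its bin. The sketch below is only an illustration under that assumption: `binned_weights`, `bin_width` and the target values are hypothetical, and whether this helps the score is an open question.

```python
import random
from collections import Counter

def binned_weights(targets, bin_width=1.0):
    """Weight each sample by 1 / (population of its target bin)."""
    # Bin by rounding to the nearest multiple of bin_width.
    bins = [round(t / bin_width) for t in targets]
    counts = Counter(bins)
    return [1.0 / counts[b] for b in bins]

targets = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 2.0, 4.0]  # hypothetical float targets
weights = binned_weights(targets)

# Draw one epoch of indices with replacement, as a weighted sampler would.
rng = random.Random(0)
sample = rng.choices(range(len(targets)), weights=weights, k=len(targets))
```

The resulting `weights` list could then be handed to a sampler such as PyTorch's `WeightedRandomSampler` in place of class-based weights.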