Effect of unbalanced batches on learning

It appears that get_batches creates batches that may have an uneven number of examples of each class. For instance, in one experiment with a batch size of 4, the first training batch had 1 example of class 1 and 3 examples of class 2. Won’t this make it more difficult to correctly classify class 1 examples and easier to correctly classify class 2 examples?
Thanks very much for your help!

Hi @pdoerschuk, suppose you have an unbalanced dataset (say, 200 examples of class 1 and 50 of class 2) and a batch_size of 25, which gives 10 batches per epoch. If, as you suggest, we put a roughly equal number of each class in every batch, you will run out of class-2 images after just 4 batches.
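
Here is a quick sketch of that counting in plain Python (the numbers are just the illustrative ones above):

```python
# Illustrative only: draw roughly balanced batches from an imbalanced pool
# (200 class-1 examples, 50 class-2 examples, batch_size 25).
class1, class2 = 200, 50
batch_size = 25
per_class = batch_size // 2  # ~12 examples of each class per balanced batch

batches = 0
while class2 >= per_class:
    class1 -= per_class
    class2 -= per_class
    batches += 1

print(batches)  # 4 -- the class-2 pool is exhausted after 4 balanced batches
```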

We use random batches because we can reasonably assume that each batch has a class distribution similar to that of the entire dataset.

Hi, Ravi.
Thanks very much for your response. I agree that it is reasonable to assume that the class distribution in each random batch is similar to the distribution of the entire dataset, and that generating random batches achieves this. However, I still believe that an uneven distribution of examples across classes can affect accuracy. For instance, if 90% of our training examples are in class 0 and 10% are in class 1, a net that always outputs class 0 will achieve 90% accuracy, but that does not mean it is correctly distinguishing between the 2 classes.

The only treatment of this that I have found so far is to include class weights in the loss function; class weights can be passed as a parameter to the fit_generator function. See http://stackoverflow.com/questions/42586475/is-it-possible-to-automatically-infer-the-class-weight-from-flow-from-directory for an example of how to infer the class_weight argument directly from the ImageDataGenerator object, and the comments by cbaziotis at https://github.com/fchollet/keras/issues/5116 for an example that uses a smoothing function. I am experimenting with this now.
Peggy


Wow, I didn’t realize Keras has a class weights option. However, the network tries to lower the loss, which only indirectly increases accuracy; it doesn’t directly optimize accuracy. So even if you have 90% accuracy, you may still have a large loss, which prevents the network from being deceived by the imbalance.
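
To make that concrete, here is a toy illustration with hypothetical numbers: a classifier that always leans toward the majority class, but with low confidence, scores 90% accuracy while its cross-entropy loss stays large, so gradient descent still has a strong signal to improve.

```python
import numpy as np

# Toy setup: 90% of examples are class 0, 10% are class 1.
y_true = np.array([0] * 90 + [1] * 10)

# A lazy classifier that always assigns class 0 probability 0.55.
p_class0 = np.full(100, 0.55)

# Accuracy: always predicting class 0 gets 90% right...
accuracy = np.mean((p_class0 < 0.5).astype(int) == y_true)

# ...but the cross-entropy loss stays high, because the predictions
# are not confident, so the optimizer keeps pushing the model.
p_true = np.where(y_true == 0, p_class0, 1 - p_class0)
loss = -np.mean(np.log(p_true))

print(accuracy)  # 0.9
print(loss)      # ~0.62, far from the near-zero loss of a confident model
```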