Hello,
I’ve stumbled upon a very strange situation where batch_size is the major factor in my model’s validation accuracy. Proof:
model.fit(conv_trn_feat, trn_labels, nb_epoch=1, batch_size=256, validation_data=(conv_val_feat, val_labels))
Train on 2001 samples, validate on 500 samples
Epoch 1/1
2001/2001 [==============================] - 6s - loss: 0.0546 - acc: 0.9845 - val_loss: 1.4992 - val_acc: 0.7700
model.fit(conv_trn_feat, trn_labels, nb_epoch=1, batch_size=128, validation_data=(conv_val_feat, val_labels))
Train on 2001 samples, validate on 500 samples
Epoch 1/1
2001/2001 [==============================] - 9s - loss: 0.0439 - acc: 0.9835 - val_loss: 4.2225 - val_acc: 0.4520
model.fit(conv_trn_feat, trn_labels, nb_epoch=1, batch_size=256, validation_data=(conv_val_feat, val_labels))
Train on 2001 samples, validate on 500 samples
Epoch 1/1
2001/2001 [==============================] - 8s - loss: 0.0291 - acc: 0.9870 - val_loss: 1.4931 - val_acc: 0.7600
model.fit(conv_trn_feat, trn_labels, nb_epoch=1, batch_size=512, validation_data=(conv_val_feat, val_labels))
Train on 2001 samples, validate on 500 samples
Epoch 1/1
2001/2001 [==============================] - 8s - loss: 0.0183 - acc: 0.9940 - val_loss: 0.0129 - val_acc: 0.9960
I can achieve 99.6% validation accuracy in less than 10 epochs of training with batch_size=512, but with batch_size=128 I can’t get the validation accuracy past 48%, even after hundreds of epochs of training and even when I load the same weights that were trained with batch_size=512. In fact, even model.evaluate() gives me numbers in the same ballpark as the ones above, depending on what batch_size I feed it.
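
For concreteness, this is roughly the check I mean (a minimal sketch, using the same already-trained model and the same conv_val_feat / val_labels arrays as above; only batch_size changes):

# model was compiled with metrics=['accuracy'] (acc shows up in the logs),
# so evaluate() returns [loss, acc]
for bs in (128, 256, 512):
    val_loss, val_acc = model.evaluate(conv_val_feat, val_labels, batch_size=bs, verbose=0)
    print('batch_size=%d -> val_loss=%.4f, val_acc=%.4f' % (bs, val_loss, val_acc))

The reported val_loss and val_acc swing wildly with bs, just like in the fit() runs.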
What can I do if I want to deploy this model on something that doesn’t have the RAM to handle batch_size=512?
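
To make the constraint concrete, this is the kind of inference call I’d ideally be running on the target device (hypothetical sketch; batch_size=32 is just a placeholder for whatever fits in its memory), while still getting the ~99.6% behaviour I see at batch_size=512:

# small-batch inference on the deployment box
preds = model.predict(conv_val_feat, batch_size=32, verbose=0)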