fit_generator() vs fit(): different levels of accuracy

When running DogsVCats, I noticed that fit_generator() called by VggBN()'s fit() produces much lower accuracy (train ~80%, val ~50%) than fit() called by VggBN()'s fit_data() (train and val ~98%). I took the advice of https://github.com/fchollet/keras/issues/2389 and switched to shuffle=True in get_batches() (just for training, not validation), and then VggBN.fit() started performing as well as fit_data(). I can’t understand why this would make such a big difference…

Assuming you’re using the directory flow: if you don’t shuffle, each epoch feeds all the images of one class for half the epoch and then all the images of the other class. Every batch then contains only dogs or only cats, so each gradient step optimizes for just that one class instead of for telling the two apart, and the weights keep getting pulled toward whichever class came last.
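To see the effect without training anything, here is a minimal pure-Python sketch that only looks at which labels end up in each batch. It is an illustration, not the Keras or VggBN code: `make_batches` and `count_single_class` are hypothetical helpers, and the 1024 images per class, the batch size of 64, and the class order are assumed numbers picked to keep the arithmetic clean.

```python
import random

def make_batches(labels, batch_size, shuffle, seed=0):
    """Split labels into consecutive batches, optionally shuffling the order first."""
    order = list(range(len(labels)))
    if shuffle:
        random.Random(seed).shuffle(order)
    return [[labels[i] for i in order[start:start + batch_size]]
            for start in range(0, len(order), batch_size)]

def count_single_class(batches):
    """Count the batches that contain only one distinct label."""
    return sum(1 for batch in batches if len(set(batch)) == 1)

# Assumed setup: files are listed one class folder after the other.
labels = ["cat"] * 1024 + ["dog"] * 1024

unshuffled = make_batches(labels, batch_size=64, shuffle=False)
shuffled = make_batches(labels, batch_size=64, shuffle=True)

print("unshuffled single-class batches:", count_single_class(unshuffled), "of", len(unshuffled))
print("shuffled single-class batches:", count_single_class(shuffled), "of", len(shuffled))
```

Without shuffling, every one of the 32 batches is single-class, so no gradient step ever sees both labels at once. With shuffling, a batch of 64 drawn from a balanced set is almost never single-class, which matches the improvement seen after setting shuffle=True for the training batches.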
