So this is my batch_norm fn:
```python
# Keras 1.x (mode= on BatchNormalization only exists in Keras 1)
from keras.layers import Dense, Dropout, Flatten, MaxPooling2D, BatchNormalization

def bn_get_layers(p=0.):
    # conv_layers is assumed to be defined earlier (the pretrained conv part)
    layers = [MaxPooling2D(input_shape=conv_layers[-1].output_shape[1:]),
              Flatten(),
              Dense(4096, activation='relu'),
              Dropout(p),
              BatchNormalization(mode=2),
              Dense(4096, activation='relu'),
              Dropout(p),
              BatchNormalization(mode=2),
              Dense(10, activation='softmax')]
    return layers
```
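For completeness, here's roughly how I wire it up and train it; this is just a sketch, and names like `conv_feat` / `trn_labels` are placeholders for my precomputed conv-layer features and labels:

```python
from keras.models import Sequential
from keras.optimizers import Adam

# Build a model from just the dense layers; since input_shape matches the
# conv output, it can train directly on precomputed conv features.
bn_model = Sequential(bn_get_layers(p=0.5))
bn_model.compile(optimizer=Adam(lr=1e-4), loss='categorical_crossentropy',
                 metrics=['accuracy'])

# conv_feat / conv_val_feat are assumed precomputed conv-part outputs
bn_model.fit(conv_feat, trn_labels, batch_size=64, nb_epoch=1,
             validation_data=(conv_val_feat, val_labels))
```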
Reading your state farm sample notebook, I noticed that axis=1 is only passed to BatchNormalization in the convolutional part, not in the dense layers. Should BatchNormalization(axis=1) only be used on convolutional layers, and if so, why?
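Here's a minimal sketch of the difference as I understand it, assuming Keras 1 with Theano 'th' dim ordering (the layer sizes are made up for illustration):

```python
from keras.layers import Dense, Convolution2D, BatchNormalization

# Conv output with 'th' dim ordering is (batch, channels, rows, cols),
# so axis=1 normalizes per channel: one mean/variance pair per filter.
conv_bn = [Convolution2D(64, 3, 3, input_shape=(3, 224, 224)),
           BatchNormalization(axis=1, mode=2)]

# Dense output is (batch, features); the default axis=-1 already points
# at the feature axis, so no axis argument is needed here.
dense_bn = [Dense(4096, activation='relu', input_dim=4096),
            BatchNormalization(mode=2)]
```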