If I have a model with Dropout layers with p = 0.5, then recreate the model with p = 0, divide the weights of the Dense layers by 2, and run evaluate, should I get exactly the same categorical cross-entropy loss as with the original model?

This is not what my code produces, so I wanted to check whether my understanding is correct. My code does something like what's in the gist here

I guess the sanity check on the dropout changes would be the below? Assuming I made the changes to dropout properly, this should give me True? (I'm getting False at the moment with code like in the gist above.)

fc = model_fc.predict(val_X)
fc_zero = model_fc_p_zero.predict(val_X)
np.allclose(fc, fc_zero)
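For reference, here is the equivalence this check is testing, stripped down to a single dense layer in plain NumPy (my own sketch, not from the gist). With classic, non-inverted dropout applied to the layer's input, the test-time expectation is matched by scaling the incoming kernel by (1 - p); note that only the kernel needs scaling, since dropout acts on the input, not the bias:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))   # a batch of inputs
W = rng.normal(size=(8, 3))   # kernel of a dense layer trained with p = 0.5
b = rng.normal(size=3)        # bias of that layer

p = 0.5
# Test-time output of the p = 0.5 model (classic dropout: scale the kernel by 1 - p)
out_dropout = X @ (W * (1 - p)) + b
# Output of the rebuilt p = 0 model whose kernel was divided by 2 up front
out_halved = X @ (W / 2) + b

print(np.allclose(out_dropout, out_halved))  # True
```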

The weight relationship is as expected, though (at test time the weights of the p = 0.5 model should be halved, thus):

a = model_fc.get_weights()
b = model_fc_p_zero.get_weights()
np.allclose(a[0], 2 * b[0]) # => True
np.allclose(a[1], 2 * b[1]) # => True

I’m not sure about the use of the Lambda function there; it should be fine, but here’s what I did:

def proc_wgts(layer, dropout):
    # Rescale weights trained at p = 0.5 to match the new dropout value
    return [o * (0.5 / (1 - dropout)) for o in layer.get_weights()]

Which I then call using

if dropout != 0.5:
    # Find the last convolutional layer; everything after it is the FC block
    last_conv_idx = [index for index, layer in enumerate(model.layers)
                     if type(layer) is Convolution2D][-1]
    print('Updating vgg model layers after index', last_conv_idx)
    # zip-ing the slice with itself pairs each FC layer with itself,
    # so this rescales each layer's weights in place
    for l1, l2 in zip(model.layers[last_conv_idx+1:], model.layers[last_conv_idx+1:]):
        l1.set_weights(proc_wgts(l2, dropout))
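Since zip-ing a slice with itself just pairs each layer with itself, a single loop does the same job. Here's a Keras-free sketch of that rescaling loop, with stand-in layer objects whose get_weights/set_weights mimic the Keras API (FakeDense is my own name, not from the gist):

```python
class FakeDense:
    """Stand-in for a Keras Dense layer, holding a flat list of weights."""
    def __init__(self, weights):
        self._wgts = list(weights)
    def get_weights(self):
        return list(self._wgts)
    def set_weights(self, wgts):
        self._wgts = list(wgts)

def proc_wgts(layer, dropout):
    # Same scaling as above: map weights trained at p = 0.5 to the new dropout
    return [o * (0.5 / (1 - dropout)) for o in layer.get_weights()]

fc_layers = [FakeDense([1.0, 2.0]), FakeDense([3.0, 4.0])]

# Rescale each FC layer in place (equivalent to the zip-with-itself loop)
for layer in fc_layers:
    layer.set_weights(proc_wgts(layer, dropout=0.0))

print(fc_layers[0].get_weights())  # [0.5, 1.0]
```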

Pretty sure that when you run evaluate on a model with dropout, Keras automatically disables the dropout layers for you; as I understand it, Keras uses "inverted" dropout, scaling activations up by 1/(1 - p) during training, so no weight rescaling is needed at test time at all. (I haven't actually checked the source, but that was my understanding of how dropout layers work.) You can get a sense that this is happening by observing that training loss increases with higher dropout but validation loss doesn't (it might even go down with better generalization): the framework is automatically switching dropout on and off for you between training and testing modes.
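For what it's worth, inverted dropout can be sketched in a few lines of NumPy (my own illustration of the technique, not Keras source): during training, surviving activations are scaled up by 1/(1 - p), so their expectation already matches the untouched test-time activations and nothing needs rescaling afterwards:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5
x = rng.normal(size=10)                    # some layer's activations

# Training mode: drop units, scale the survivors up by 1 / (1 - p)
masks = rng.random(size=(100_000, 10)) > p
train_acts = (masks * x) / (1 - p)

# Test mode: dropout is simply a no-op; no weight rescaling needed
test_acts = x

# Averaged over many masks, training activations match the test-time ones
print(np.allclose(train_acts.mean(axis=0), test_acts, atol=0.05))
```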

So it shouldn’t matter what p is set to for evaluation, only for training.