Been working on this for a few hours and can’t figure out why I’m getting a MemoryError exception when I try to run:
fc_model.fit(train_features_conv, train_labels, nb_epoch=8,
             batch_size=batch_size, validation_data=(val_features_conv, val_labels))
at the bottom of the code below.
I don’t think it is a legit out-of-memory problem (batch_size=4); I suspect it has something to do with how I’m setting the weights for the FC model when Dropout is changed from 0.5 to 0.0. Nevertheless, I can’t figure out what I’m doing wrong.
Any help would be much appreciated, for the sake of my sanity.
Here is the relevant code:
model.load_weights(best_weights_f)
model.summary()
____________________________________________________________________________________________________
> Layer (type) Output Shape Param # Connected to
> ====================================================================================================
> lambda_1 (Lambda) (None, 3, 224, 224) 0 lambda_input_1[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_1 (ZeroPadding2D) (None, 3, 226, 226) 0 lambda_1[0][0]
> ____________________________________________________________________________________________________
> convolution2d_1 (Convolution2D) (None, 64, 224, 224) 1792 zeropadding2d_1[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_2 (ZeroPadding2D) (None, 64, 226, 226) 0 convolution2d_1[0][0]
> ____________________________________________________________________________________________________
> convolution2d_2 (Convolution2D) (None, 64, 224, 224) 36928 zeropadding2d_2[0][0]
> ____________________________________________________________________________________________________
> maxpooling2d_1 (MaxPooling2D) (None, 64, 112, 112) 0 convolution2d_2[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_3 (ZeroPadding2D) (None, 64, 114, 114) 0 maxpooling2d_1[0][0]
> ____________________________________________________________________________________________________
> convolution2d_3 (Convolution2D) (None, 128, 112, 112) 73856 zeropadding2d_3[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_4 (ZeroPadding2D) (None, 128, 114, 114) 0 convolution2d_3[0][0]
> ____________________________________________________________________________________________________
> convolution2d_4 (Convolution2D) (None, 128, 112, 112) 147584 zeropadding2d_4[0][0]
> ____________________________________________________________________________________________________
> maxpooling2d_2 (MaxPooling2D) (None, 128, 56, 56) 0 convolution2d_4[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_5 (ZeroPadding2D) (None, 128, 58, 58) 0 maxpooling2d_2[0][0]
> ____________________________________________________________________________________________________
> convolution2d_5 (Convolution2D) (None, 256, 56, 56) 295168 zeropadding2d_5[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_6 (ZeroPadding2D) (None, 256, 58, 58) 0 convolution2d_5[0][0]
> ____________________________________________________________________________________________________
> convolution2d_6 (Convolution2D) (None, 256, 56, 56) 590080 zeropadding2d_6[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_7 (ZeroPadding2D) (None, 256, 58, 58) 0 convolution2d_6[0][0]
> ____________________________________________________________________________________________________
> convolution2d_7 (Convolution2D) (None, 256, 56, 56) 590080 zeropadding2d_7[0][0]
> ____________________________________________________________________________________________________
> maxpooling2d_3 (MaxPooling2D) (None, 256, 28, 28) 0 convolution2d_7[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_8 (ZeroPadding2D) (None, 256, 30, 30) 0 maxpooling2d_3[0][0]
> ____________________________________________________________________________________________________
> convolution2d_8 (Convolution2D) (None, 512, 28, 28) 1180160 zeropadding2d_8[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_9 (ZeroPadding2D) (None, 512, 30, 30) 0 convolution2d_8[0][0]
> ____________________________________________________________________________________________________
> convolution2d_9 (Convolution2D) (None, 512, 28, 28) 2359808 zeropadding2d_9[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_10 (ZeroPadding2D) (None, 512, 30, 30) 0 convolution2d_9[0][0]
> ____________________________________________________________________________________________________
> convolution2d_10 (Convolution2D) (None, 512, 28, 28) 2359808 zeropadding2d_10[0][0]
> ____________________________________________________________________________________________________
> maxpooling2d_4 (MaxPooling2D) (None, 512, 14, 14) 0 convolution2d_10[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_11 (ZeroPadding2D) (None, 512, 16, 16) 0 maxpooling2d_4[0][0]
> ____________________________________________________________________________________________________
> convolution2d_11 (Convolution2D) (None, 512, 14, 14) 2359808 zeropadding2d_11[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_12 (ZeroPadding2D) (None, 512, 16, 16) 0 convolution2d_11[0][0]
> ____________________________________________________________________________________________________
> convolution2d_12 (Convolution2D) (None, 512, 14, 14) 2359808 zeropadding2d_12[0][0]
> ____________________________________________________________________________________________________
> zeropadding2d_13 (ZeroPadding2D) (None, 512, 16, 16) 0 convolution2d_12[0][0]
> ____________________________________________________________________________________________________
> convolution2d_13 (Convolution2D) (None, 512, 14, 14) 2359808 zeropadding2d_13[0][0]
> ____________________________________________________________________________________________________
> maxpooling2d_5 (MaxPooling2D) (None, 512, 7, 7) 0 convolution2d_13[0][0]
> ____________________________________________________________________________________________________
> flatten_1 (Flatten) (None, 25088) 0 maxpooling2d_5[0][0]
> ____________________________________________________________________________________________________
> dense_1 (Dense) (None, 4096) 102764544 flatten_1[0][0]
> ____________________________________________________________________________________________________
> batchnormalization_1 (BatchNorma (None, 4096) 16384 dense_1[0][0]
> ____________________________________________________________________________________________________
> dropout_1 (Dropout) (None, 4096) 0 batchnormalization_1[0][0]
> ____________________________________________________________________________________________________
> dense_2 (Dense) (None, 4096) 16781312 dropout_1[0][0]
> ____________________________________________________________________________________________________
> batchnormalization_2 (BatchNorma (None, 4096) 16384 dense_2[0][0]
> ____________________________________________________________________________________________________
> dropout_2 (Dropout) (None, 4096) 0 batchnormalization_2[0][0]
> ____________________________________________________________________________________________________
> dense_4 (Dense) (None, 2) 8194 dropout_2[0][0]
> ====================================================================================================
> Total params: 134,301,506
> Trainable params: 8,194
> Non-trainable params: 134,293,312
> ____________________________________________________________________________________________________
last_conv_idx = [i for i,l in enumerate(model.layers) if type(l) == Convolution2D][-1]
conv_layers = model.layers[:last_conv_idx+1]
conv_model = Sequential(conv_layers)
fc_layers = model.layers[last_conv_idx+1:]
train_features_conv = conv_model.predict(train_data, batch_size=batch_size)
val_features_conv = conv_model.predict(val_data, batch_size=batch_size*2)
print(train_features_conv.shape, val_features_conv.shape)
save_array(cache_path+'train_features_conv.dat', train_features_conv)
save_array(cache_path+'val_features_conv.dat', val_features_conv)
train_features_conv = load_array(cache_path+'train_features_conv.dat')
val_features_conv = load_array(cache_path+'val_features_conv.dat')
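For reference, a rough sketch of how much host RAM the cached features could need, which may help judge whether the MemoryError is genuine. It assumes float32 features with the (512, 14, 14) per-sample shape that convolution2d_13 reports in the summary above, and that load_array returns a full in-memory NumPy array. The helper `feature_array_bytes` and the sample count `n_train` are hypothetical and not taken from the code in this post.

```python
def feature_array_bytes(n_samples, feature_shape=(512, 14, 14), bytes_per_value=4):
    """Bytes needed to hold n_samples feature maps (float32 by default) in host RAM."""
    per_sample = bytes_per_value
    for dim in feature_shape:
        per_sample *= dim
    return n_samples * per_sample

# Hypothetical training-set size; substitute train_features_conv.shape[0].
n_train = 20000
gib = feature_array_bytes(n_train) / 2**30
print(f"{n_train} samples -> {gib:.1f} GiB")
```

Under these assumptions the footprint grows with the number of samples, not with batch_size: an array passed to fit has to sit in RAM as a whole, and batch_size only controls how many rows are processed per step.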
def set_layer_weights(layer, new_p, old_p):
    # Returns (does not set) the layer's weights rescaled for a dropout change old_p -> new_p
    scale = (1-old_p)/(1-new_p)
    return [ o * scale for o in layer.get_weights() ]

def build_fc_model(new_p, old_p):
    model = Sequential([
        MaxPooling2D((2, 2), strides=(2, 2), input_shape=conv_layers[-1].output_shape[1:]),
        Flatten(),
        Dense(4096, activation='relu'),
        BatchNormalization(),
        Dropout(new_p),
        Dense(4096, activation='relu'),
        BatchNormalization(),
        Dropout(new_p),
        Dense(2, activation='softmax')
        ])
    # Copy the rescaled weights of the original FC layers into the new model, layer by layer
    for l1,l2 in zip(model.layers, fc_layers):
        l1.set_weights(set_layer_weights(l2, new_p, old_p))
    opt = Adam(lr=0.00001)
    model.compile(opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
fc_model = build_fc_model(0.0, 0.5)
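To make the rescaling concrete, here is a minimal NumPy sketch of what set_layer_weights computes for the call above. `dropout_rescale` is a hypothetical stand-in that takes a plain list of arrays instead of a Keras layer, and the kernel and bias values are made up for illustration.

```python
import numpy as np

def dropout_rescale(weights, new_p, old_p):
    """Multiply every array in `weights` by (1 - old_p) / (1 - new_p)."""
    scale = (1 - old_p) / (1 - new_p)
    return [w * scale for w in weights]

# Made-up stand-ins for what layer.get_weights() might return for a Dense layer.
kernel = np.ones((3, 2), dtype=np.float32)
bias = np.full(2, 0.4, dtype=np.float32)

# Same arguments as build_fc_model(0.0, 0.5): scale = 0.5 / 1.0 = 0.5
rescaled = dropout_rescale([kernel, bias], new_p=0.0, old_p=0.5)
print(rescaled[0][0, 0], rescaled[1][0])
```

Note that the factor is applied to every array in the list, so in the code above the biases and the BatchNormalization parameters are halved along with the Dense kernels. The rescaling itself does not allocate anything large beyond a temporary copy of each layer's weights.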
fc_model.summary()
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
maxpooling2d_6 (MaxPooling2D) (None, 512, 7, 7) 0 maxpooling2d_input_1[0][0]
____________________________________________________________________________________________________
flatten_2 (Flatten) (None, 25088) 0 maxpooling2d_6[0][0]
____________________________________________________________________________________________________
dense_5 (Dense) (None, 4096) 102764544 flatten_2[0][0]
____________________________________________________________________________________________________
batchnormalization_3 (BatchNorma (None, 4096) 16384 dense_5[0][0]
____________________________________________________________________________________________________
dropout_3 (Dropout) (None, 4096) 0 batchnormalization_3[0][0]
____________________________________________________________________________________________________
dense_6 (Dense) (None, 4096) 16781312 dropout_3[0][0]
____________________________________________________________________________________________________
batchnormalization_4 (BatchNorma (None, 4096) 16384 dense_6[0][0]
____________________________________________________________________________________________________
dropout_4 (Dropout) (None, 4096) 0 batchnormalization_4[0][0]
____________________________________________________________________________________________________
dense_7 (Dense) (None, 2) 8194 dropout_4[0][0]
====================================================================================================
Total params: 119,586,818
Trainable params: 119,570,434
Non-trainable params: 16,384
____________________________________________________________________________________________________
fc_model.fit(train_features_conv, train_labels, nb_epoch=8,
             batch_size=batch_size, validation_data=(val_features_conv, val_labels))