What are the reasons for choosing one output dimension for a Dense layer over another?
In the examples so far, the output dimension of each hidden Dense layer seems to be the number of filters in the last convolutional layer multiplied by 8. Is that right? If so, why 8?
I understand that the output dimension of the last Dense layer is the number of classes for the classification task.
Here’s some code I’ve found involving Dense layers:
First example (512 * 8 == 4096):
def FCBlock(self):
    model = self.model
    model.add(Dense(4096, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
[...]
    self.ConvBlock(3, 512)
    model.add(Flatten())
    self.FCBlock()
[...]
Second example (64 * 8 == 512):
def get_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Convolution2D(32,3,3, activation='relu'),
        Convolution2D(32,3,3, activation='relu'),
        MaxPooling2D(),
        Convolution2D(64,3,3, activation='relu'),
        Convolution2D(64,3,3, activation='relu'),
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(10, activation='softmax')
    ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
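As a sanity check on the "times 8" guess, here is a minimal sketch that works out how many values Flatten() passes to Dense(512) in the get_model() example above. It uses plain Python rather than Keras, and it assumes 3x3 convolutions with 'valid' padding and stride 1, plus 2x2 max pooling (which I believe are the defaults for these layers). The helper names conv_out, pool_out and flattened_features are made up for illustration.

```python
def conv_out(size, kernel=3):
    # Spatial size after a 'valid' convolution with stride 1 (assumed default).
    return size - kernel + 1


def pool_out(size, pool=2):
    # Spatial size after non-overlapping max pooling (assumed 2x2 default).
    return size // pool


def flattened_features(input_size, blocks):
    # blocks is a list of (n_filters, n_convs) pairs; each block is
    # n_convs convolutions followed by one max-pooling layer.
    size = input_size
    filters = None
    for n_filters, n_convs in blocks:
        for _ in range(n_convs):
            size = conv_out(size)
        size = pool_out(size)
        filters = n_filters
    # Flatten() yields filters * height * width values per sample.
    return filters * size * size


# get_model(): 28x28 input, two 32-filter convs, pool, two 64-filter convs, pool
mnist_features = flattened_features(28, [(32, 2), (64, 2)])
print(mnist_features)  # 1024
```

Under these assumptions the spatial size goes 28, 26, 24, 12, 10, 8, 4, so Flatten() yields 64 * 4 * 4 = 1024 values and Dense(512) receives 1024 inputs, not 64. That makes me wonder whether the factor of 8 is incidental rather than a rule.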
Thank you.