Black and white images on VGG16

For the first week's assignment I am trying the Facial Expression Recognition Challenge (https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge).
However, I am having trouble applying VGG16 to grayscale images.
I am using Keras 2.0.2 with TensorFlow 1.0.1, working directly with the keras.applications module.
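
For context: the fer2013 images are 48x48 grayscale, while the ImageNet VGG16 weights expect 3-channel input. As far as I understand, flow_from_directory with the default color_mode='rgb' loads a grayscale file as three identical channels anyway; doing the same by hand from the competition CSV would look roughly like this (the path and the gray_to_rgb helper are just mine for illustration):

import numpy as np
import pandas as pd

# fer2013.csv has columns 'emotion', 'pixels', 'Usage'; each 'pixels' entry
# is a string of 48*48 = 2304 space-separated grayscale values.
df = pd.read_csv('fer2013.csv')  # placeholder path

def gray_to_rgb(pixel_string):
    gray = np.array(pixel_string.split(), dtype='uint8').reshape(48, 48)
    # Replicate the single channel to shape (48, 48, 3) to match
    # VGG16's expected 3-channel input.
    return np.repeat(gray[:, :, np.newaxis], 3, axis=2)

X = np.stack([gray_to_rgb(p) for p in df['pixels']])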

Here is my model code.

from keras.applications.vgg16 import VGG16
from keras_tqdm import TQDMNotebookCallback
from keras.preprocessing import image
from keras.layers import Dense, Flatten
from keras.models import Model

# Set constants. You can experiment with no_of_epochs to improve the model.
# They must be defined before get_batches, since batch_size is used as a
# default argument value.
target_size = (48, 48)
batch_size = 32
no_of_epochs = 3

def get_batches(path, gen=image.ImageDataGenerator(rescale=1./255.), shuffle=True, batch_size=batch_size, class_mode='categorical'):
    """
    Takes the path to a directory and generates batches of augmented/normalized
    data. Yields batches indefinitely, in an infinite loop.

    See the Keras documentation: https://keras.io/preprocessing/image/
    """
    return gen.flow_from_directory(path, target_size=target_size,
            class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)

# Convolutional base with the ImageNet weights; drop the fully connected top.
vgg = VGG16(include_top=False, input_shape=(*target_size, 3))

input_layer = vgg.input

# Freeze the pretrained convolutional layers.
for l in vgg.layers:
    l.trainable = False

# New classifier head for the 7 emotion classes.
x = vgg.layers[-1].output
x = Flatten()(x)
x = Dense(1000, activation='relu')(x)
x = Dense(7, activation='softmax')(x)

model = Model(inputs=input_layer, outputs=x)

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
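
For completeness, the training call is essentially the following (the directory paths are placeholders for my local layout):

# Placeholder directory layout: one subfolder per emotion class.
train_batches = get_batches('data/train')
valid_batches = get_batches('data/valid', shuffle=False)

history = model.fit_generator(
    train_batches,
    steps_per_epoch=train_batches.samples // batch_size,
    epochs=no_of_epochs,
    validation_data=valid_batches,
    validation_steps=valid_batches.samples // batch_size,
    callbacks=[TQDMNotebookCallback()])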

I am reusing the get_batches helper provided in the course's Vgg16 class, additionally adding rescale=1./255 (as suggested in https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html; I also tried without it).
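
One thing I was unsure about: the ImageNet VGG16 weights were originally trained with RGB-to-BGR conversion plus channel-mean subtraction (what keras.applications.vgg16.preprocess_input does), not with 1/255 scaling. A variant of the generator along those lines would look like this (vgg_preprocess is my own helper name):

import numpy as np

def vgg_preprocess(x):
    # Mirror keras.applications.vgg16.preprocess_input on a single image:
    # convert RGB -> BGR, then subtract the ImageNet channel means.
    x = x[..., ::-1]
    return x - np.array([103.939, 116.779, 123.68])

gen = image.ImageDataGenerator(preprocessing_function=vgg_preprocess)
batches = get_batches('data/train', gen=gen)  # placeholder path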

After fitting, the model is not learning anything. Here are the metrics for 15 epochs (rounded):

epoch    acc      loss      val_acc   val_loss
1        0.1387   13.883    0.1239    14.122
2        0.1475   13.741    0.1302    14.019
3        0.1504   13.694    0.1372    13.906
4        0.1240   14.119    0.1283    14.050
5        0.1318   13.993    0.1281    14.054
6        0.1406   13.851    0.1293    14.034
7        0.1367   13.914    0.1325    13.982
8        0.1367   13.914    0.1271    14.070
9        0.1484   13.725    0.1328    13.978
10       0.1445   13.789    0.1343    13.954
11       0.1523   13.663    0.1251    14.102
12       0.1582   13.568    0.1357    13.930
13       0.1436   13.804    0.1330    13.974
14       0.1377   13.899    0.1318    13.994
15       0.1406   13.851    0.1308    14.010
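
Unless I am misreading these numbers, accuracy never moves off chance level for 7 classes (1/7 ≈ 0.143). The loss also looks suspicious to me: with Keras' default epsilon of 1e-7, a model that confidently predicts the same single class for everything would average roughly 6/7 × -ln(1e-7) ≈ 13.8 on balanced data, which is about where both loss and val_loss sit.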

I cannot really see what I am doing wrong here. Any ideas?

Thanks.