Statefarm problem - Transfer Learning - shuffle = True/False?

Hi,

In the State Farm problem, I tried transfer learning using Keras's VGG16 model. When I shuffle the training data (shuffle=True), the accuracy I get on the training data is abysmal; with shuffle=False, training accuracy is good. Why does setting shuffle=True in get_batches() on the training data cause such a large difference in accuracy? Shouldn't shuffle=True lead to better accuracy, if anything?

The code I have used is as follows:

import numpy as np
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras.applications import VGG16
from keras.utils.np_utils import to_categorical

model = VGG16(include_top=False, weights='imagenet')

batch_size = 64

datagen = ImageDataGenerator(rescale=1. / 255)

def get_batches(dirname, gen=image.ImageDataGenerator(), shuffle=True, batch_size=batch_size):
    # Build a directory iterator and return it along with the number of batches per epoch
    batch_gen = gen.flow_from_directory(dirname, target_size=(224, 224),
            class_mode='categorical', shuffle=shuffle, batch_size=batch_size)
    num_batch = len(batch_gen)
    return batch_gen, num_batch

# Precompute VGG16 convolutional features and labels for the training set
generator, num_train_batches = get_batches('./data/train', gen=datagen, shuffle=False)

train_labels = to_categorical(generator.classes)

train_data = model.predict_generator(generator, num_train_batches)

# Precompute features and labels for the validation set
generator, num_valid_batches = get_batches('./data/valid', gen=datagen,
                                                shuffle=False, batch_size=batch_size * 2)

validation_labels = to_categorical(generator.classes)
validation_data = model.predict_generator(generator, num_valid_batches)

model = Sequential()
model.add(Flatten(input_shape=train_data.shape[1:]))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam',
              loss='categorical_crossentropy', metrics=['accuracy'])

model.optimizer.lr = 1e-5

model.fit(train_data, train_labels,
          epochs=5,
          batch_size=batch_size,
          validation_data=(validation_data, validation_labels), verbose=1)

When shuffle=False:
Train on 17968 samples, validate on 4456 samples
Epoch 1/5
17968/17968 [==============================] - 4s 224us/step - loss: 1.7476 - acc: 0.4617 - val_loss: 1.4700 - val_acc: 0.6914
Epoch 2/5
17968/17968 [==============================] - 4s 197us/step - loss: 0.7867 - acc: 0.8535 - val_loss: 1.0974 - val_acc: 0.7417
Epoch 3/5
17968/17968 [==============================] - 4s 197us/step - loss: 0.4472 - acc: 0.9335 - val_loss: 0.9330 - val_acc: 0.7554
Epoch 4/5
17968/17968 [==============================] - 4s 199us/step - loss: 0.2926 - acc: 0.9613 - val_loss: 0.8638 - val_acc: 0.7655
Epoch 5/5
17968/17968 [==============================] - 4s 197us/step - loss: 0.2131 - acc: 0.9726 - val_loss: 0.8067 - val_acc: 0.7747

When shuffle=True:
Train on 17968 samples, validate on 4456 samples
Epoch 1/5
17968/17968 [==============================] - 4s 215us/step - loss: 2.3914 - acc: 0.0995 - val_loss: 2.2935 - val_acc: 0.1086
Epoch 2/5
17968/17968 [==============================] - 4s 201us/step - loss: 2.3025 - acc: 0.1113 - val_loss: 2.2982 - val_acc: 0.1241
Epoch 3/5
17968/17968 [==============================] - 4s 199us/step - loss: 2.2991 - acc: 0.1182 - val_loss: 2.2997 - val_acc: 0.0866
Epoch 4/5
17968/17968 [==============================] - 4s 203us/step - loss: 2.2961 - acc: 0.1163 - val_loss: 2.2995 - val_acc: 0.0880
Epoch 5/5
17968/17968 [==============================] - 4s 201us/step - loss: 2.2928 - acc: 0.1229 - val_loss: 2.3032 - val_acc: 0.0911

Not sure, but you might try running the new version of lesson 1. A new version of the course is running right now, and while the videos are not yet public, the materials are accessible. The new course is based on PyTorch rather than Keras. If you really want to stick with Keras, try searching the forums: this is a known issue with using shuffle while precomputing activations. With shuffle=True, predict_generator returns the features in the shuffled batch order, but generator.classes is always in the original directory order, so your labels no longer line up with your features and the classifier is effectively training on random targets. I believe this is discussed in later lessons.
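
In case it helps while you're still on Keras, here is a minimal sketch of the usual workaround, assuming the same directory layout and the get_batches / datagen definitions from your post (base_model, train_features and perm are just placeholder names I've introduced): precompute with shuffle=False so the feature order matches generator.classes, then, if you want randomized order, shuffle the precomputed arrays with one shared permutation (model.fit also shuffles within epochs by default).

import numpy as np
from keras.applications import VGG16
from keras.utils.np_utils import to_categorical

# Keep the VGG16 feature extractor in its own variable so it isn't
# overwritten by the Sequential classifier later on
base_model = VGG16(include_top=False, weights='imagenet')

# Precompute with shuffle=False so the features come back in the same
# order as generator.classes
train_gen, num_train_batches = get_batches('./data/train', gen=datagen, shuffle=False)
train_features = base_model.predict_generator(train_gen, num_train_batches)
train_labels = to_categorical(train_gen.classes)

# Optional: apply one shared permutation so features and labels stay
# aligned while the order is randomized
perm = np.random.permutation(len(train_labels))
train_features = train_features[perm]
train_labels = train_labels[perm]

Do the same for the validation set (also with shuffle=False), and the classifier should train normally.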