How can I release GPU memory held by Keras?

I am training models with k-fold cross-validation (5 folds), using TensorFlow as the backend.

Every time the program starts to train the last model, Keras complains that it is running out of memory. I call gc.collect() after each model is trained. Any idea how to release the GPU memory occupied by Keras?

for i, (train, validate) in enumerate(skf):
	model, im_dim = mc.generate_model(parsed_json["keras_model"], parsed_json["custom_model"], parsed_json["top_model_index"], parsed_json["learning_rate"])
	training_data = _generate_train_data(im_dim, parsed_json["train_dir"], int(parsed_json["train_batch_size"]))
	testing_data = _generate_train_data(im_dim, parsed_json["val_dir"], int(parsed_json["test_batch_size"]))

	checkpoint = ModelCheckpoint(parsed_json["trained_model_name"] + "cross" + str(index) + ".h5", monitor='val_loss', verbose=0, 
                                 save_best_only=True, save_weights_only=False, mode='auto', period=1)
	index = index + 1
	callback_list = [checkpoint]
	start_time = time.time()

	training_history = model.fit_generator(training_data,
                                           nb_epoch = int(parsed_json["nb_epoch"]),
                                           callbacks = callback_list)

	gc.collect()  # cross my fingers and hope gc releases the memory

Is your last model the exact same as previous models? Or is it a different architecture?

Same model, nothing changed. To help gc release it, I added one more line before calling gc.collect():

model = None

Is there a recommended way to force Python to release memory? From what I found on Google, some posts say gc.collect() forces Python's garbage collector to free unreferenced memory, but it does not work in this case. I wonder what is happening. Does Keras allocate memory globally?

Try del model. That should immediately remove the model variable from memory.

Thanks, I tried this:

del model

Still no luck. Does Keras have a memory leak?

Can you copy and paste the error message here?

I was having exactly the same issue. I solved it by also deleting the training_history variable. Here is my piece of code that doesn't leak memory. It looks like training_history keeps some pointer to GPU memory.

for fold, (train_clusters_index, valid_clusters_index) in enumerate(kf):
	model = create_model(TRAINABLE_VGG, include_top=INCLUDE_TOP)

	train_fold_id = folds.index[folds.isin(clusters[train_clusters_index])]
	valid_fold_id = folds.index[folds.isin(clusters[valid_clusters_index])]

	train_index = np.where(np.in1d(train_id, train_fold_id))[0]
	valid_index = np.where(np.in1d(train_id, valid_fold_id))[0]

	if SAMPLE < 1:
		train_index = subsample(train_index, SAMPLE)
		valid_index = subsample(valid_index, SAMPLE)

	X_train = train_data[train_index]
	Y_train = train_target[train_index]
	X_valid = train_data[valid_index]
	Y_valid = train_target[valid_index]

	base_model = Y_train.mean(axis=0)
	train_baseline = log_loss(Y_train, np.tile(base_model, len(Y_train)).reshape(-1, (len(base_model))))
	valid_baseline = log_loss(Y_valid, np.tile(base_model, len(Y_valid)).reshape(-1, (len(base_model))))

	num_fold += 1
	print('Start KFold number {} from {}'.format(num_fold, num_folds))
	print('Split train:', len(X_train), 'Baseline: %7.5f' % train_baseline)
	print('Split valid:', len(X_valid), 'Baseline: %7.5f' % valid_baseline)

	model_file = join(MODEL_FOLDER, 'nn_fold_%02d.model' % fold)
	callbacks = [
		EarlyStopping(monitor='val_loss', patience=5, verbose=0),
		ModelCheckpoint(model_file, save_best_only=True, save_weights_only=True),
	]
	# fits the model on batches with real-time data augmentation:
	history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
			samples_per_epoch=len(X_train), nb_epoch=nb_epoch,
			validation_data=(X_valid, Y_valid),
			callbacks=callbacks)

	# Load best epoch (weights saved by ModelCheckpoint above)
	model.load_weights(model_file)

	predictions_valid = model.predict(X_valid, batch_size=batch_size, verbose=2)
	score = log_loss(Y_valid, predictions_valid)
	print('Score log_loss: ', score)
	sum_score += score*len(valid_index)

	# Store valid predictions
	for i in range(len(valid_index)):
		yfull_train[valid_index[i]] = predictions_valid[i]

	print('Start test KFold number {} from {}'.format(num_fold, num_folds))
	test_prediction = model.predict(test_data, batch_size=batch_size, verbose=2)
	del history
	del model

score = sum_score/len(train_data)
print("Log_loss train independent avg: ", score)
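
One plausible reason deleting the history variable helps: as far as I know, every Keras callback, including the History object returned by fit_generator, keeps a reference to the model in its model attribute, so del model alone removes only one name. The sketch below reproduces that effect with plain Python stand-ins. FakeModel and FakeHistory are hypothetical classes, not Keras types, and weakref is used only to observe when the object is really freed.

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a Keras model that owns GPU buffers."""


class FakeHistory:
    """Hypothetical stand-in for the History object returned by fit()."""

    def __init__(self, model):
        self.model = model  # a back-reference keeps the model alive


def model_is_alive(delete_history):
    """Return True if the model object survives `del model` + gc.collect()."""
    model = FakeModel()
    history = FakeHistory(model)
    probe = weakref.ref(model)  # lets us observe the object without owning it

    del model  # removes one name, not necessarily the object
    if delete_history:
        del history  # drops the last remaining reference
    gc.collect()
    return probe() is not None


print(model_is_alive(delete_history=False))  # True: history still holds the model
print(model_is_alive(delete_history=True))   # False: nothing references it any more
```

The same reasoning applies to any other variable that points at the model or at objects holding it, which may be why results in this thread vary from setup to setup.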

I had a similar issue with Theano. I am not sure how it relates to TensorFlow, or whether it could be something similar.

I had the problem when using the libgpuarray backend. When I changed the device configuration in .theanorc from cuda to gpu, Keras and Theano released memory when I called gc.collect().

Someone said elsewhere that Theano can be slow to release references and recommended calling gc.collect() a couple of times:

for i in range(3): gc.collect()
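
For background on why gc.collect() can matter at all: CPython frees most objects as soon as their reference count reaches zero, but objects caught in a reference cycle have to wait for the cyclic collector. Here is a minimal sketch with a hypothetical Node class (nothing Theano- or Keras-specific). The value returned by gc.collect() is the number of unreachable objects it found.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that reference each other, then drop our names."""
    a, b = Node(), Node()
    a.other, b.other = b, a  # reference cycle: refcounts never reach zero
    return weakref.ref(a)


gc.disable()  # keep the automatic collector out of the demonstration
gc.collect()  # start from a clean state
probe = make_cycle()
print(probe() is not None)  # True: the cycle is still in memory
found = gc.collect()        # explicitly run the cyclic collector
print(found > 0)            # True: it found unreachable objects
print(probe() is None)      # True: the cycle has been freed
gc.enable()
```

This only shows the Python side. Whether the backend then hands the GPU memory back is a separate question.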


I ran into similar problems. I found that deleting the model variable and calling gc.collect() helped. The behaviour of GPU memory getting freed still seems somewhat random. At times I had to call gc.collect() 12-15 times to free memory.

I used these as well to help with this issue:

fuser -v /dev/nvidia*
sudo pkill -f ipykernel
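
To check whether any of these steps actually released memory, it may help to read the numbers from nvidia-smi before and after each run. The sketch below assumes nvidia-smi is on the PATH and uses its --query-gpu CSV output. parse_gpu_memory and query_gpu_memory are hypothetical helper names.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' lines (MiB, one line per GPU) into int tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used/total memory per GPU (assumes it is installed)."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_gpu_memory(out)


# Parsing a sample of the expected output format, so no GPU is needed here:
sample = "10240, 11178\n3, 11178\n"
print(parse_gpu_memory(sample))  # [(10240, 11178), (3, 11178)]
```

Calling query_gpu_memory() at the end of every fold would show whether used memory keeps climbing.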


from keras import backend as K

# after you are done with the model
K.clear_session()

This solves the memory issue.


Thanks a lot.


Could you please tell me when and where I should include the above line? After model.compile() or after model.predict()?

Yadu, first, don't add any K.clear_session() line; just train your model. Then, when training has finished and you are going to build a new model, run K.clear_session() to clear the memory. Alternatively, you can give each model its own graph:

import tensorflow as tf

model1 = ...  # load/define the first model
graph1 = tf.get_default_graph()

model2 = ...  # load/define the second model
graph2 = tf.get_default_graph()

with graph1.as_default():
    model1.predict(X)  # or training

with graph2.as_default():
    model2.predict(X)  # or training
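
Applied to the cross-validation loop from the original question, this advice would put the clear_session() call at the end of each fold, after the last use of that fold's model. A minimal sketch follows. run_folds, build_model and train_and_score are hypothetical names, and the cleanup callable is passed in so the loop structure can be tried without a GPU.

```python
import gc


def run_folds(n_folds, build_model, train_and_score, clear_session):
    """Train one fresh model per fold and release it before the next fold."""
    scores = []
    for fold in range(n_folds):
        model = build_model()
        scores.append(train_and_score(model, fold))
        # Drop our reference first, then let the backend discard its graph
        # and let Python collect whatever is left.
        del model
        clear_session()
        gc.collect()
    return scores


# Stand-ins so the sketch runs anywhere; replace them with real Keras code.
calls = []
scores = run_folds(
    n_folds=5,
    build_model=lambda: object(),
    train_and_score=lambda model, fold: fold * 0.1,
    clear_session=lambda: calls.append("cleared"),
)
print(len(scores), len(calls))  # 5 5
```

With Keras on the TensorFlow backend, the cleanup argument would presumably be K.clear_session, with K imported from keras.backend as in the post above.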

Thanks :)


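This prevents TensorFlow from using up the whole GPU:

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of all at once
sess = tf.Session(config=config)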

This code helped me overcome the problem of GPU memory not being released after the process is over. Run it at the start of your program. Thanks.

I had the same problem. I created this function, which now works every time. It releases the GPU memory. I call it before each run.

from keras.backend.tensorflow_backend import set_session
from keras.backend.tensorflow_backend import clear_session
from keras.backend.tensorflow_backend import get_session
import gc
import tensorflow

# Reset Keras Session
def reset_keras():
    global classifier  # needed so that `del` targets the global name
    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()

    try:
        del classifier # this is from global space - change this as you need
    except NameError:
        pass  # nothing to delete on the first run

    print(gc.collect()) # if it's done something you should see a number being outputted

    # use the same config as you used to create the session
    config = tensorflow.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tensorflow.Session(config=config))

I had the same issue. For me, the solution was to delete the training input array, the test input array and the model, then clear the Keras session and call gc.collect().

import gc
from keras import backend as bek

del X
del XT
del model
bek.clear_session()
gc.collect()

Run the deletes and these calls after training the model.