How can I release GPU memory in Keras?


Training models with k-fold cross-validation (5 folds), using TensorFlow as the back end.

Every time the program starts to train the last model, Keras complains that it is running out of memory. I call gc after every model is trained. Any idea how to release the GPU memory occupied by Keras?

for i, (train, validate) in enumerate(skf):
    model, im_dim = mc.generate_model(parsed_json["keras_model"], parsed_json["custom_model"], parsed_json["top_model_index"], parsed_json["learning_rate"])
    training_data = _generate_train_data(im_dim, parsed_json["train_dir"], int(parsed_json["train_batch_size"]))
    testing_data = _generate_train_data(im_dim, parsed_json["val_dir"], int(parsed_json["test_batch_size"]))

    checkpoint = ModelCheckpoint(parsed_json["trained_model_name"] + "cross" + str(i) + ".h5", monitor='val_loss', verbose=0,
                                 save_best_only=True, save_weights_only=False, mode='auto', period=1)
    callback_list = [checkpoint]
    start_time = time.time()

    training_history = model.fit_generator(training_data,
                                           nb_epoch=int(parsed_json["nb_epoch"]),
                                           callbacks=callback_list)

    gc.collect()  # cross my fingers and hope gc can release the memory


Is your last model the exact same as previous models? Or is it a different architecture?


Same model, nothing changed. To help gc release it, I added one more line before calling it:

model = None

Is there a recommended way to force Python to release memory? I searched on Google; some posts say gc.collect() forces Python's gc to release unreferenced memory, but it does not work in this case. I wonder what is happening. Does Keras allocate memory globally?
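On the pure-Python side of this question, gc.collect() only matters for objects caught in reference cycles; an ordinary del already frees an object that has no other references. A minimal sketch (no Keras involved; FakeModel is a hypothetical stand-in) that makes the collection observable with a weak reference:

```python
import gc
import weakref

class FakeModel:
    """Stand-in for a large object such as a Keras model."""
    pass

model = FakeModel()
observer = weakref.ref(model)  # weak reference: does not keep the object alive

del model     # drop the only strong reference; CPython frees it immediately
gc.collect()  # only needed for objects trapped in reference cycles

print(observer() is None)  # True: the object has been collected
```

Note this only demonstrates Python-level collection; GPU memory held by the TensorFlow runtime is a separate pool and is not returned to the OS this way.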


Try del model. That should immediately remove the model variable from memory.


Thanks, I tried this:

del model

Still no luck. Does Keras have a memory leak issue?


Can you copy and paste the error message here?

(Vitor A Batista) #7

I was having exactly the same issue. I solved it by also deleting the training_history variable. Here is my piece of code that doesn't have a memory leak. It looks like training_history keeps a pointer to GPU memory.

for fold, (train_clusters_index, valid_clusters_index) in enumerate(kf):
	model = create_model(TRAINABLE_VGG, include_top=INCLUDE_TOP)

	train_fold_id = folds.index[folds.isin(clusters[train_clusters_index])]
	valid_fold_id = folds.index[folds.isin(clusters[valid_clusters_index])]

	train_index = np.where(np.in1d(train_id, train_fold_id))[0]
	valid_index = np.where(np.in1d(train_id, valid_fold_id))[0]

	if SAMPLE < 1:
		train_index = subsample(train_index, SAMPLE)
		valid_index = subsample(valid_index, SAMPLE)

	X_train = train_data[train_index]
	Y_train = train_target[train_index]
	X_valid = train_data[valid_index]
	Y_valid = train_target[valid_index]

	base_model = Y_train.mean(axis=0)
	train_baseline = log_loss(Y_train, np.tile(base_model, len(Y_train)).reshape(-1, (len(base_model))))
	valid_baseline = log_loss(Y_valid, np.tile(base_model, len(Y_valid)).reshape(-1, (len(base_model))))

	num_fold += 1
	print('Start KFold number {} from {}'.format(num_fold, num_folds))
	print('Split train:', len(X_train), 'Baseline: %7.5f' % train_baseline)
	print('Split valid:', len(X_valid), 'Baseline: %7.5f' % valid_baseline)

	model_file = join(MODEL_FOLDER, 'nn_fold_%02d.model' % fold)
	callbacks = [
		EarlyStopping(monitor='val_loss', patience=5, verbose=0),
		ModelCheckpoint(model_file, save_best_only=True, save_weights_only=True)
	]

	# fits the model on batches with real-time data augmentation:
	history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
			samples_per_epoch=len(X_train), nb_epoch=nb_epoch,
			validation_data=(X_valid, Y_valid),
			callbacks=callbacks)

	# Load best epoch
	model.load_weights(model_file)

	predictions_valid = model.predict(X_valid, batch_size=batch_size, verbose=2)
	score = log_loss(Y_valid, predictions_valid)
	print('Score log_loss: ', score)
	sum_score += score*len(valid_index)

	# Store valid predictions
	for i in range(len(valid_index)):
		yfull_train[valid_index[i]] = predictions_valid[i]

	print('Start test KFold number {} from {}'.format(num_fold, num_folds))
	test_prediction = model.predict(test_data, batch_size=batch_size, verbose=2)
	del history
	del model

score = sum_score/len(train_data)
print("Log_loss train independent avg: ", score)

(John Lundberg) #8

I had a similar issue with Theano. I am not sure how it relates to TensorFlow, but it could be something similar.

I had the problem when using the libgpuarray backend. When I changed the device configuration in .theanorc from cuda to gpu, Keras and Theano released memory when I called gc.collect().

Someone said elsewhere that Theano can be slow to release references and recommended calling gc.collect() a couple of times:

for i in range(3): gc.collect()

(prateek2686) #9

I ran into similar problems. I found that deleting the model variable and calling gc.collect() helped. The behaviour of GPU memory being freed still seems somewhat random; at times I had to call gc.collect() 12-15 times before the memory was freed.
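The "del alone is not enough" behaviour described above is consistent with reference cycles: an object that (directly or indirectly) refers back to itself survives del and is only reclaimed by the cycle collector. A small pure-Python illustration, with a hypothetical FakeHistory standing in for the real Keras objects:

```python
import gc
import weakref

class FakeHistory:
    """Stand-in for an object caught in a reference cycle."""
    pass

history = FakeHistory()
history.owner = history          # create a reference cycle

observer = weakref.ref(history)
del history                      # refcounting alone cannot free a cycle
print(observer() is None)        # False: the cycle keeps it alive

gc.collect()                     # the cycle collector reclaims it
print(observer() is None)        # True
```

This is why del followed by gc.collect() frees more than del alone; the need for repeated calls, though, is on the TensorFlow/driver side rather than in Python's collector.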

(siavash mortezavi) #10

I used these as well to help with this issue:

fuser -v /dev/nvidia*
sudo pkill -f ipykernel

(Musa Atlıhan) #11

from keras import backend as K

Running

K.clear_session()

after you are done with the model solves the memory issue.

(Abhishek Jain) #13

Thanks a lot.

(Yadunandan Vivekanand Kini) #14

Could you please tell me when and where I should include the above line? After model.compile() or after model.predict()?

(Musa Atlıhan) #15

Yadu, please first don't add any K.clear_session() line; just train your model. Then, when your training process is finished, if you are going to compose a new model, run K.clear_session() to clear the memory. Or you can give each model its own graph:

import tensorflow as tf

graph1 = tf.Graph()
with graph1.as_default():
    model1 = ...  # load/define the first model here

graph2 = tf.Graph()
with graph2.as_default():
    model2 = ...  # load/define the second model here

with graph1.as_default():
    model1.predict(X)  # or training

with graph2.as_default():
    model2.predict(X)  # or training

(Yadunandan Vivekanand Kini) #16

Thanks :slight_smile:


This prevents TensorFlow from allocating the whole GPU up front:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

This code helped me get past the problem of GPU memory not being released after the process is over. Run it at the start of your program. Thanks.
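For completeness, the same TF1 ConfigProto also supports capping how much memory a session may claim, instead of growing on demand; a config sketch (the 0.4 fraction is an arbitrary example value, not a recommendation):

```python
import tensorflow as tf

config = tf.ConfigProto()
# Let the session claim at most ~40% of each visible GPU's memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.4
sess = tf.Session(config=config)
```

With either option, the memory is still only returned to the driver when the owning process exits, which is why killing stray kernels (as suggested above) is sometimes the only real fix.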