How can I release GPU memory in Keras?


Training models with k-fold cross-validation (5 folds), using TensorFlow as the backend.

Every time the program starts to train the last model, Keras complains that it is running out of memory. I call gc after every model is trained. Any idea how to release the GPU memory occupied by Keras?

import gc
import time

from keras.callbacks import ModelCheckpoint

for i, (train, validate) in enumerate(skf):
    model, im_dim = mc.generate_model(parsed_json["keras_model"], parsed_json["custom_model"], parsed_json["top_model_index"], parsed_json["learning_rate"])
    training_data = _generate_train_data(im_dim, parsed_json["train_dir"], int(parsed_json["train_batch_size"]))
    testing_data = _generate_train_data(im_dim, parsed_json["val_dir"], int(parsed_json["test_batch_size"]))

    # save the best model of each fold, numbered by the fold index
    checkpoint = ModelCheckpoint(parsed_json["trained_model_name"] + "cross" + str(i) + ".h5", monitor='val_loss', verbose=0,
                                 save_best_only=True, save_weights_only=False, mode='auto', period=1)
    callback_list = [checkpoint]
    start_time = time.time()

    training_history = model.fit_generator(training_data,
                                           nb_epoch=int(parsed_json["nb_epoch"]),
                                           callbacks=callback_list)

    gc.collect()  # cross my fingers and hope gc releases the memory


Is your last model the exact same as previous models? Or is it a different architecture?


Same model, nothing changed. To help gc release it, I added one more line before calling it:

model = None

Is there any recommended way to force Python to release memory? I searched on Google; some posts say gc.collect() will force Python's gc to release unreferenced memory, but it does not work in this case. I wonder what is happening. Does Keras allocate memory globally?


Try del model. That should immediately remove the model variable from memory.
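For example, a minimal sketch of the pattern (the toy model and random data here are made up for illustration, not from the original post):

import gc
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(100, 10)
y = np.random.rand(100, 1)

for fold in range(5):
    # build a fresh model for each fold
    model = Sequential([Dense(32, input_dim=10, activation='relu'), Dense(1)])
    model.compile(optimizer='adam', loss='mse')
    model.fit(X, y, nb_epoch=1, verbose=0)

    del model     # drop the last Python reference to the model
    gc.collect()  # then ask the garbage collector to reclaim it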


Thanks, I tried this:

del model

Still no luck. Does Keras have a memory leak issue?


Can you copy and paste the error message here?

(Vitor A Batista) #7

I was having exactly the same issue. I solved it by also deleting the training_history variable. Here is my piece of code that doesn't leak memory. It looks like training_history keeps a pointer to GPU memory.

for fold, (train_clusters_index, valid_clusters_index) in enumerate(kf):
	model = create_model(TRAINABLE_VGG, include_top=INCLUDE_TOP)

	train_fold_id = folds.index[folds.isin(clusters[train_clusters_index])]
	valid_fold_id = folds.index[folds.isin(clusters[valid_clusters_index])]

	train_index = np.where(np.in1d(train_id, train_fold_id))[0]
	valid_index = np.where(np.in1d(train_id, valid_fold_id))[0]

	if SAMPLE < 1:
		train_index = subsample(train_index, SAMPLE)
		valid_index = subsample(valid_index, SAMPLE)

	X_train = train_data[train_index]
	Y_train = train_target[train_index]
	X_valid = train_data[valid_index]
	Y_valid = train_target[valid_index]

	base_model = Y_train.mean(axis=0)
	train_baseline = log_loss(Y_train, np.tile(base_model, len(Y_train)).reshape(-1, (len(base_model))))
	valid_baseline = log_loss(Y_valid, np.tile(base_model, len(Y_valid)).reshape(-1, (len(base_model))))

	num_fold += 1
	print('Start KFold number {} from {}'.format(num_fold, num_folds))
	print('Split train:', len(X_train), 'Baseline: %7.5f' % train_baseline)
	print('Split valid:', len(X_valid), 'Baseline: %7.5f' % valid_baseline)

	model_file = join(MODEL_FOLDER, 'nn_fold_%02d.model' % fold)
	callbacks = [
		EarlyStopping(monitor='val_loss', patience=5, verbose=0),
		ModelCheckpoint(model_file, save_best_only=True, save_weights_only=True)
	]

	# fits the model on batches with real-time data augmentation:
	history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
			samples_per_epoch=len(X_train), nb_epoch=nb_epoch,
			validation_data=(X_valid, Y_valid),
			callbacks=callbacks)


	# Load best epoch (the checkpoint above saved weights only)
	model.load_weights(model_file)

	predictions_valid = model.predict(X_valid, batch_size=batch_size, verbose=2)
	score = log_loss(Y_valid, predictions_valid)
	print('Score log_loss: ', score)
	sum_score += score*len(valid_index)

	# Store valid predictions
	for i in range(len(valid_index)):
		yfull_train[valid_index[i]] = predictions_valid[i]

	print('Start test KFold number {} from {}'.format(num_fold, num_folds))
	test_prediction = model.predict(test_data, batch_size=batch_size, verbose=2)
	del history
	del model

score = sum_score/len(train_data)
print("Log_loss train independent avg: ", score)

(John Lundberg) #8

I had a similar issue with Theano. I am not sure how it relates to TensorFlow, or whether it could be something similar.

I had the problem when using the libgpuarray backend. When I changed the device configuration in .theanorc from cuda to gpu, Keras and Theano released memory when I called gc.collect().
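For reference, that setting lives in the [global] section of ~/.theanorc; a minimal sketch of the change described above (the floatX line is just a common companion setting, not part of the fix):

[global]
device = gpu
floatX = float32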

Someone said elsewhere that Theano can be slow to release the references and recommended calling gc.collect() a couple of times:

for i in range(3): gc.collect()

(prateek2686) #9

I ran into similar problems. I found that deleting the model variable and calling gc.collect() helped. The behaviour of GPU memory getting freed still seems somewhat random; at times I had to call gc.collect() 12-15 times to free the memory.
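Something like this (a sketch, assuming model is the Keras model trained in the current fold; the retry count is arbitrary):

import gc

del model
for _ in range(15):
    # repeated passes sometimes help gc break up lingering reference cycles
    gc.collect()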

(siavash mortezavi) #10

I used these as well to help with this issue:

fuser -v /dev/nvidia*
sudo pkill -f ipykernel
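The fuser line lists the processes currently holding the /dev/nvidia* devices, and the pkill line kills lingering IPython/Jupyter kernel processes, which often keep GPU memory allocated even after a notebook is closed.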

(Musa Atlıhan) #11

from keras import backend as K

K.clear_session()

(after you are done with the model)

solves the memory issue.
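In a cross-validation loop the pattern would look something like this (a minimal sketch with a toy model made up for illustration; after clear_session() the graph is destroyed, so the model must be rebuilt from scratch on the next fold):

import gc
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(100, 10)
y = np.random.rand(100, 1)

for fold in range(5):
    # clear_session() wiped the previous graph, so build a fresh model each fold
    model = Sequential([Dense(32, input_dim=10, activation='relu'), Dense(1)])
    model.compile(optimizer='adam', loss='mse')
    model.fit(X, y, nb_epoch=1, verbose=0)
    # ... evaluate and save what you need here ...

    del model
    K.clear_session()  # drop the TF graph and session, releasing the GPU memory they hold
    gc.collect()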

(Abhishek Jain) #13

Thanks a lot.