How can I release GPU memory in Keras?


#1

I am training models with k-fold cross validation (5 folds), using TensorFlow as the backend.

Every time the program starts to train the last model, Keras complains that it is running out of memory. I call gc after every model is trained; any idea how to release the GPU memory occupied by Keras?

for i, (train, validate) in enumerate(skf):
    model, im_dim = mc.generate_model(parsed_json["keras_model"], parsed_json["custom_model"], parsed_json["top_model_index"], parsed_json["learning_rate"])
    training_data = _generate_train_data(im_dim, parsed_json["train_dir"], int(parsed_json["train_batch_size"]))
    testing_data = _generate_train_data(im_dim, parsed_json["val_dir"], int(parsed_json["test_batch_size"]))

    # use the loop index for the checkpoint filename
    checkpoint = ModelCheckpoint(parsed_json["trained_model_name"] + "cross" + str(i) + ".h5", monitor='val_loss', verbose=0,
                                 save_best_only=True, save_weights_only=False, mode='auto', period=1)
    callback_list = [checkpoint]
    start_time = time.time()

    training_history = model.fit_generator(training_data,
                                           samples_per_epoch=len(training_data.classes),
                                           nb_epoch=int(parsed_json["nb_epoch"]),
                                           validation_data=testing_data,
                                           nb_val_samples=int(parsed_json["nb_val_samples"]),
                                           callbacks=callback_list)

    gc.collect()  # cross my fingers and hope gc can release the memory

#2

Is your last model exactly the same as the previous models, or is it a different architecture?


#3

Same model, nothing changed. To help gc release the memory, I add one more line before I call it:

model = None
gc.collect()

Is there a recommended way to force Python to release memory? I searched on Google; some posts say gc.collect() will force Python's garbage collector to release unreferenced memory, but it does not work in this case. I wonder what is happening; does Keras allocate memory globally?


#4

Try del model. That should immediately remove the model variable from memory.


#5

Thanks, I tried this:

del model
gc.collect()

Still no luck. Does Keras have a memory leak issue?


#6

Can you copy and paste the error message here?


(Vitor A Batista) #7

I was having exactly the same issue. I solved it by also deleting the training_history variable. Here is my piece of code that doesn't leak memory; it looks like the history object keeps a reference to GPU memory.

# fold counter and accumulators used in the loop below
num_fold = 0
sum_score = 0
all_history = []
yfull_train = dict()
yfull_test = []

for fold, (train_clusters_index, valid_clusters_index) in enumerate(kf):
	model = create_model(TRAINABLE_VGG, include_top=INCLUDE_TOP)

	train_fold_id = folds.index[folds.isin(clusters[train_clusters_index])]
	valid_fold_id = folds.index[folds.isin(clusters[valid_clusters_index])]

	train_index = np.where(np.in1d(train_id, train_fold_id))[0]
	valid_index = np.where(np.in1d(train_id, valid_fold_id))[0]

	if SAMPLE < 1:
		train_index = subsample(train_index, SAMPLE)
		valid_index = subsample(valid_index, SAMPLE)

	X_train = train_data[train_index]
	Y_train = train_target[train_index]
	X_valid = train_data[valid_index]
	Y_valid = train_target[valid_index]

	base_model = Y_train.mean(axis=0)
	train_baseline = log_loss(Y_train, np.tile(base_model, len(Y_train)).reshape(-1, (len(base_model))))
	valid_baseline = log_loss(Y_valid, np.tile(base_model, len(Y_valid)).reshape(-1, (len(base_model))))

	num_fold += 1
	print('Start KFold number {} from {}'.format(num_fold, num_folds))
	print('Split train:', len(X_train), 'Baseline: %7.5f' % train_baseline)
	print('Split valid:', len(X_valid), 'Baseline: %7.5f' % valid_baseline)

	model_file = join(MODEL_FOLDER, 'nn_fold_%02d.model' % fold)
	callbacks = [
		EarlyStopping(monitor='val_loss', patience=5, verbose=0),
		ModelCheckpoint(model_file, save_best_only=True, save_weights_only=True)
	]
	# fits the model on batches with real-time data augmentation:
	history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
			samples_per_epoch=len(X_train), nb_epoch=nb_epoch,
			verbose=2,
			validation_data=(X_valid, Y_valid),
			callbacks=callbacks)

	all_history.append(history)

	# Load best epoch
	model.load_weights(model_file)

	predictions_valid = model.predict(X_valid, batch_size=batch_size, verbose=2)
	score = log_loss(Y_valid, predictions_valid)
	print('Score log_loss: ', score)
	sum_score += score*len(valid_index)

	# Store valid predictions
	for i in range(len(valid_index)):
		yfull_train[valid_index[i]] = predictions_valid[i]

	print('Start test KFold number {} from {}'.format(num_fold, num_folds))
	test_prediction = model.predict(test_data, batch_size=batch_size, verbose=2)
	yfull_test.append(test_prediction)
	del history
	del model
	gc.collect()

score = sum_score/len(train_data)
print("Log_loss train independent avg: ", score)

(John Lundberg) #8

I had a similar issue with Theano. I am not sure how it relates to TensorFlow, but it could be something similar.

I had the problem when using the libgpuarray backend; when I changed the device configuration in .theanorc from cuda to gpu, Keras and Theano released memory when I called gc.collect().

Someone said elsewhere that Theano can be slow to release the references and recommended calling gc.collect() a couple of times:

for i in range(3): gc.collect()


(prateek2686) #9

I ran into similar problems. I found that deleting the model variable and calling gc.collect() helped. The behaviour of GPU memory getting freed still seems somewhat random; at times I had to call gc.collect() 12-15 times to free memory.
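If you don't want to guess how many calls are needed, one option (a minimal sketch) is to loop until the collector reports nothing left to free; gc.collect() returns the number of unreachable objects it found:

import gc

# keep collecting until a pass finds no more unreachable objects
while gc.collect() > 0:
    pass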


(siavash mortezavi) #10

I used these as well to help with this issue:

fuser -v /dev/nvidia*    # list the processes currently using the GPU devices
sudo pkill -f ipykernel  # kill stray Jupyter kernels that may still hold GPU memory


(Musa Atlıhan) #11

from keras import backend as K

and then, after you are done with the model,

K.clear_session()

solves the memory issue.
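For example, in a cross-validation loop like the one in the original post, the call goes at the end of each fold. A minimal sketch with a toy model (substitute your own architecture and data):

import gc
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(100, 10)
y = np.random.randint(0, 2, size=(100, 1))

for fold in range(5):
    model = Sequential([Dense(8, activation='relu', input_dim=10),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy')
    model.fit(X, y, nb_epoch=1, verbose=0)

    del model          # drop the Python reference to the model
    K.clear_session()  # destroy the current TF graph and its session
    gc.collect()       # collect anything left behind in reference cycles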


(Abhishek Jain) #13

Thanks a lot.


(Yadunandan Vivekanand Kini) #14

Could you please tell me when and where I should include the above line? After model.compile() or after model.predict()?


(Musa Atlıhan) #15

Yadu, first don't add any K.clear_session() line; just train your model. Then, when your training process is finished and you are going to compose a new model, run K.clear_session() to clear the memory. Or you can give each model its own graph:

import tensorflow as tf

# build each model in its own graph rather than the shared default graph
graph1 = tf.Graph()
with graph1.as_default():
    model1 = ...  # load/define the first model

graph2 = tf.Graph()
with graph2.as_default():
    model2 = ...  # load/define the second model

with graph1.as_default():
    model1.predict(X)  # or training

with graph2.as_default():
    model2.predict(X)  # or training
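In TensorFlow 1.x, every op you create is added to the current default graph, so models built one after another without isolation all accumulate in the same graph. Giving each model its own tf.Graph keeps them separate, and a whole graph can be garbage-collected once nothing references it anymore.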

(Yadunandan Vivekanand Kini) #16

Thanks :slight_smile:


#17

This prevents TensorFlow from allocating the whole GPU up front:

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of all at once
sess = tf.Session(config=config)
set_session(sess)  # tell Keras to use this session

This code helped me overcome the problem of GPU memory not being released after the process is over. Run it at the start of your program. Thanks.


(Jason) #18

I had the same problem. I created this function, which works every time now. It releases the GPU memory. I call it before each run:

import gc
import tensorflow
from keras.backend.tensorflow_backend import set_session
from keras.backend.tensorflow_backend import clear_session
from keras.backend.tensorflow_backend import get_session

# Reset Keras Session
def reset_keras():
    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()

    try:
        del classifier # this is from global space - change this as you need
    except NameError:
        pass

    print(gc.collect()) # if it's done something you should see a number being outputted

    # use the same config as you used to create the session
    config = tensorflow.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tensorflow.Session(config=config))
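For instance, a hypothetical usage (build_model, num_runs, X_train, and y_train are placeholders for your own code):

for run in range(num_runs):
    reset_keras()                 # free GPU memory held over from the previous run
    classifier = build_model()    # build and compile your model here
    classifier.fit(X_train, y_train)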