How can I release GPU memory in Keras?


#1

Training models with k-fold cross-validation (5 folds), using TensorFlow as the backend.

Every time the program starts to train the last model, Keras complains that it is running out of memory. I call gc after every model is trained. Any idea how to release the GPU memory occupied by Keras?

import gc
import time

from keras.callbacks import ModelCheckpoint

for i, (train, validate) in enumerate(skf):
    model, im_dim = mc.generate_model(parsed_json["keras_model"], parsed_json["custom_model"], parsed_json["top_model_index"], parsed_json["learning_rate"])
    training_data = _generate_train_data(im_dim, parsed_json["train_dir"], int(parsed_json["train_batch_size"]))
    testing_data = _generate_train_data(im_dim, parsed_json["val_dir"], int(parsed_json["test_batch_size"]))

    checkpoint = ModelCheckpoint(parsed_json["trained_model_name"] + "cross" + str(i) + ".h5",
                                 monitor='val_loss', verbose=0, save_best_only=True,
                                 save_weights_only=False, mode='auto', period=1)
    callback_list = [checkpoint]
    start_time = time.time()

    training_history = model.fit_generator(training_data,
                                           samples_per_epoch=len(training_data.classes),
                                           nb_epoch=int(parsed_json["nb_epoch"]),
                                           validation_data=testing_data,
                                           nb_val_samples=parsed_json["nb_val_samples"],
                                           callbacks=callback_list)

    gc.collect()  # cross my fingers and hope gc releases the GPU memory

#2

Is your last model exactly the same as the previous models, or is it a different architecture?


#3

Same model, nothing changed. To help gc release the memory, I added one more line before calling it:

model = None
gc.collect()

Is there a recommended way to force Python to release memory? I searched on Google, and some posts say gc.collect() forces Python's garbage collector to release unreferenced memory, but it does not work in this case. I wonder what is happening. Does Keras allocate memory globally?
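One thing worth noting: with the TensorFlow backend it is TensorFlow, not Keras, that holds the GPU memory. By default TensorFlow reserves nearly the whole card up front and does not return memory to the OS for the lifetime of the process, which is why freeing Python objects alone may not appear to help. A sketch of making it allocate on demand instead (TF 1.x-era session config):

import tensorflow as tf
from keras import backend as K

# Let TensorFlow grow its GPU allocation on demand instead of
# reserving (nearly) all memory up front. This limits the initial
# grab; it still does not return memory once allocated.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))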


#4

Try del model. That should immediately remove the model variable from memory.


#5

Thanks, I tried this:

del model
gc.collect()

Still no luck. Does Keras have a memory leak issue?


#6

Can you copy and paste the error message here?


(Vitor A Batista) #7

I was having exactly the same issue. I solved it by also deleting the training_history variable; it looks like training_history keeps a reference to GPU memory, so both the history and the model have to be dropped before gc.collect() is called. Here is my piece of code that does not leak memory:

import gc
from os.path import join

import numpy as np
from sklearn.metrics import log_loss
from keras.callbacks import EarlyStopping, ModelCheckpoint

for fold, (train_clusters_index, valid_clusters_index) in enumerate(kf):
	model = create_model(TRAINABLE_VGG, include_top=INCLUDE_TOP)

	train_fold_id = folds.index[folds.isin(clusters[train_clusters_index])]
	valid_fold_id = folds.index[folds.isin(clusters[valid_clusters_index])]

	train_index = np.where(np.in1d(train_id, train_fold_id))[0]
	valid_index = np.where(np.in1d(train_id, valid_fold_id))[0]

	if SAMPLE < 1:
		train_index = subsample(train_index, SAMPLE)
		valid_index = subsample(valid_index, SAMPLE)

	X_train = train_data[train_index]
	Y_train = train_target[train_index]
	X_valid = train_data[valid_index]
	Y_valid = train_target[valid_index]

	base_model = Y_train.mean(axis=0)
	train_baseline = log_loss(Y_train, np.tile(base_model, len(Y_train)).reshape(-1, (len(base_model))))
	valid_baseline = log_loss(Y_valid, np.tile(base_model, len(Y_valid)).reshape(-1, (len(base_model))))

	num_fold += 1
	print('Start KFold number {} from {}'.format(num_fold, num_folds))
	print('Split train:', len(X_train), 'Baseline: %7.5f' % train_baseline)
	print('Split valid:', len(X_valid), 'Baseline: %7.5f' % valid_baseline)

	model_file = join(MODEL_FOLDER, 'nn_fold_%02d.model' % fold)
	callbacks = [
		EarlyStopping(monitor='val_loss', patience=5, verbose=0),
		ModelCheckpoint(model_file, save_best_only=True, save_weights_only=True)
	]
	# fits the model on batches with real-time data augmentation:
	history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
			samples_per_epoch=len(X_train), nb_epoch=nb_epoch,
			verbose=2,
			validation_data=(X_valid, Y_valid),
			callbacks=callbacks)

	all_history.append(history)

	# Load best epoch
	model.load_weights(model_file)

	predictions_valid = model.predict(X_valid, batch_size=batch_size, verbose=2)
	score = log_loss(Y_valid, predictions_valid)
	print('Score log_loss: ', score)
	sum_score += score*len(valid_index)

	# Store valid predictions
	for i in range(len(valid_index)):
		yfull_train[valid_index[i]] = predictions_valid[i]

	print('Start test KFold number {} from {}'.format(num_fold, num_folds))
	test_prediction = model.predict(test_data, batch_size=batch_size, verbose=2)
	yfull_test.append(test_prediction)
	del history
	del model
	gc.collect()

score = sum_score/len(train_data)
print("Log_loss train independent avg: ", score)

(John Lundberg) #8

I had a similar issue with Theano. I am not sure how it relates to TensorFlow, but it could be something similar.

I had the problem when using the libgpuarray backend. When I changed the device configuration in .theanorc from cuda to gpu, Keras and Theano released memory when I called gc.collect().
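For reference, the change amounts to roughly this in ~/.theanorc (a sketch; device = cuda selects the libgpuarray backend, device = gpu the old CUDA backend):

# Sketch of the .theanorc change described above.
[global]
device = gpu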

Someone said elsewhere that Theano can be slow to release the references and recommended calling gc.collect() a couple of times:

for i in range(3): gc.collect()


(prateek2686) #9

I ran into similar problems. I found that deleting the model variable and calling gc.collect() helped. The behaviour of the GPU memory actually being freed still seems somewhat random; at times I had to call gc.collect() 12-15 times before the memory was released.


(siavash mortezavi) #10

I used these as well to help with this issue:

fuser -v /dev/nvidia*    # list processes still holding the GPU devices
sudo pkill -f ipykernel  # kill lingering Jupyter kernels holding GPU memory


(Musa Atlıhan) #11

from keras import backend as K

(after you are done with the model)

K.clear_session()

solves the memory issue.
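A minimal sketch of how this fits into a cross-validation loop (toy model and random data purely for illustration; Keras 1.x-era API with the TensorFlow backend):

import gc

import numpy as np
from keras import backend as K
from keras.layers import Dense
from keras.models import Sequential

X = np.random.rand(200, 10)
y = np.random.randint(0, 2, size=(200, 1))

for fold in range(5):
    # Build and train a fresh model for this fold.
    model = Sequential([Dense(32, activation='relu', input_dim=10),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy')
    model.fit(X, y, nb_epoch=1, verbose=0)

    # Drop the Python reference, then destroy the backend session so
    # the GPU memory held by this fold's graph is actually released.
    del model
    K.clear_session()
    gc.collect()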


(Abhishek Jain) #13

Thanks a lot.