I am doing the statefarm problem. Some code snippets are as follows:
from keras.applications.vgg16 import VGG16
import numpy as np

# get_batches is a helper that wraps a Keras ImageDataGenerator
test_batches, num_test_batches = get_batches('./data/test', gen=datagen,
                                             shuffle=False, batch_size=batch_size * 2)
vgg_model = VGG16(include_top=False, weights='imagenet')
bottleneck_test_data = vgg_model.predict_generator(test_batches, num_test_batches, verbose=1)
np.savez('bottleneck/VGG16TestData', bottleneck_test_data=bottleneck_test_data)
When doing transfer learning with the VGG16 model, one needs to precompute the bottleneck features for the train, validation and test sets. The output of predict_generator for the entire dataset accumulates in RAM, and since computing it takes a long time, it is better to save it to disk (np.savez writes an .npz archive) for easy retrieval later. But when the dataset is extremely large, RAM is nearly exhausted even before np.savez runs, so saving this way becomes unviable. I know the obvious fix would be to move to a machine with more memory, but is there another alternative?
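One alternative I have seen suggested is to avoid accumulating all the features in memory at once: preallocate an .npy file on disk with np.lib.format.open_memmap and write each batch of predictions directly into it, so only one batch of features is ever resident in RAM. A minimal sketch of the idea, where predict_batch is a hypothetical stand-in for vgg_model.predict on one batch and the shapes are made up for illustration:

```python
import numpy as np

# Hypothetical stand-in for vgg_model.predict(batch): maps a batch of
# images to bottleneck features. Replace with the real model call.
def predict_batch(batch):
    return batch.mean(axis=(1, 2))  # yields shape (n, 3) here, for illustration

num_samples, batch_size = 100, 32
feature_shape = (3,)  # shape of one bottleneck feature (assumed; use the real one)

# Preallocate the output .npy file on disk. Batches are written straight
# into the memory-mapped array, so RAM usage stays at one batch.
out = np.lib.format.open_memmap(
    'bottleneck_test.npy', mode='w+', dtype='float32',
    shape=(num_samples,) + feature_shape)

for start in range(0, num_samples, batch_size):
    # In the real pipeline this batch would come from test_batches
    batch = np.random.rand(min(batch_size, num_samples - start), 8, 8, 3)
    out[start:start + len(batch)] = predict_batch(batch)

out.flush()
del out  # close the memmap; np.load('bottleneck_test.npy') reads it back
```

The saved file is a normal .npy file, so np.load retrieves it (optionally with mmap_mode='r' to keep reads lazy as well). The trade-off is that you must know the total number of samples and the feature shape up front to preallocate the array.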