Fine-tuning VGG taking very long

Yeah, I’m already using htop, and the issue seems to be that the output of all the batches is being stored in memory instead of being returned batch by batch.

Will work on breaking up the calc and see if that helps…

Seems like predict_generator was written for the explicit purpose of not eating up all your memory. I haven’t looked at the implementation, but it should do something like the following (see the sketch after the list):

  • Load a batch of test cases into memory on the GPU.
  • Make and store predictions on the entire batch (the predictions are a relatively small numpy array).
  • Remove all references to the batch, allowing the Python garbage collector to reclaim the memory.
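
Again, I haven’t checked the source, so this is only a sketch of the idea, not the real Keras code (predict_in_batches and n_batches are illustrative names):

import numpy as np

def predict_in_batches(model, generator, n_batches):
    # Sketch of the idea only, not the actual Keras implementation.
    all_preds = []
    for _ in range(n_batches):
        batch, _ = next(generator)             # load one batch into memory
        preds = model.predict_on_batch(batch)  # predictions are a small numpy array
        all_preds.append(preds)
        del batch                              # drop the reference so the GC can reclaim it
    return np.concatenate(all_preds)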

Remember that when you create your batches with Keras’s ImageDataGenerator, you choose the batch_size, and that in turn bounds the GPU and system memory used during the call to predict_generator.
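
For what it’s worth, here’s roughly how I’d wire that up with the Keras 1-style API the course uses (test_path, the 224x224 target size, the batch size of 64, and reading nb_sample off the iterator are my assumptions about your setup):

from keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator()
# batch_size here is what bounds memory use during prediction;
# shuffle=False keeps predictions in directory order.
test_batches = gen.flow_from_directory(test_path, target_size=(224, 224),
                                       batch_size=64, shuffle=False,
                                       class_mode=None)
preds = model.predict_generator(test_batches, test_batches.nb_sample)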

predict takes an entire “batch” in the sense that it runs on everything you hand it in one go, so the whole input has to be in memory at once (think of batch processing in the classic computer-science sense).
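
So if your test set is one giant array, the whole thing (plus all the predictions) has to sit in memory at once. A crude workaround is to split it up yourself, something like this (test_data and the chunk count of 10 are placeholders):

import numpy as np

# Split the big array and predict piece by piece; you could also
# np.save() each piece instead of concatenating, if even the
# predictions are large.
chunks = np.array_split(test_data, 10)
preds = np.concatenate([model.predict(chunk) for chunk in chunks])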

Any chance there’s something wrong with how you installed cuDNN?


Watch this part of the section 10 video if you’re still having issues.

EDIT: Looks like there are some embedding issues while Part 2 is still unofficial. The video can be viewed directly on YouTube; the timestamp is 57:26.


Awesome, thanks! Will give it a go and report back…

I just spent a few days messing with the notebooks from part 1 and part 2 on Google Compute Engine, with both TensorFlow and Theano as backends, playing with a bunch of settings to try to speed things up. Any chance you doubled the batch_size for the validation data? That’s what the course notebook does, and I replicated it when writing mine.

batches = vgg.get_batches(train_path, batch_size=batch_size)
# note the doubled batch size for the validation batches
val_batches = vgg.get_batches(valid_path, batch_size=batch_size*2)

With Theano as the backend, I’d get an out-of-memory error right at the end of an epoch, when Keras/Theano started processing the validation set. With TensorFlow it blew up immediately.

With Theano I was able to get to a batch_size of 128 on a 12 GB Tesla. Doubling that blows things up.
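
If the doubled validation batch size is what’s biting you, the simplest fix is to just not double it (this is the only line that changes):

# use the same batch size for validation instead of doubling it
val_batches = vgg.get_batches(valid_path, batch_size=batch_size)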
