GPU Memory Trouble: Batch size limited to under 16 on a GTX 1080

I used to mine cryptocoins and had a spare GTX 1080 sitting around, so I built a rig with it (Ubuntu 16.04, 32 GB RAM, Intel i5). I made it through lesson 2 and submitted Cats and Dogs Redux, but I keep getting memory errors.

On notebook startup (and my most likely culprit):
It says cuDNN can't find the correct tmp files on the first run. The error goes away on the second run, so I keep assuming it has resolved itself.

Later on, however, I run into memory errors if the batch size is above 16, and further into the notebook if it's above 4. Here is an example with 64. I have tried all of the suggested fixes (messing with optimizer=X) without success.

It works fine if the batch size is 16: the run completes, and GPU utilization sits at 99% throughout.
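
For reference, this is roughly the lesson-2 pattern I'm running, with the batch size set where the batches are created (a sketch using the course's vgg16.py wrapper; names may differ slightly in your copy):

```python
from vgg16 import Vgg16  # the course's wrapper around Keras' VGG16

path = 'data/dogscats/'
batch_size = 16  # 64 hits the memory error on my 1080; 16 completes

vgg = Vgg16()
batches = vgg.get_batches(path + 'train', batch_size=batch_size)
val_batches = vgg.get_batches(path + 'valid', batch_size=batch_size)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)
```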

Other times I get a memory error completely out of the blue.

I have unsuccessfully tried:

  • upgrading/reinstalling cuDNN (when I google the error, that's the most popular suggestion)
  • updating/reinstalling Nvidia drivers

Currently I'm considering just moving on and keeping the batch size under 16. Is there anything else I should try?
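
One thing that does make the problem visible is polling GPU memory from the notebook while a fit runs in another cell (a rough sketch; the nvidia-smi query flags below are the standard ones, so adjust if your driver differs):

```python
import subprocess
import time

# Print used/total GPU memory every few seconds while training runs elsewhere.
for _ in range(10):
    out = subprocess.check_output(
        ['nvidia-smi', '--query-gpu=memory.used,memory.total',
         '--format=csv,noheader'])
    print(out.decode().strip())  # e.g. "7623 MiB, 8113 MiB"
    time.sleep(5)
```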

Your error message indicates that you are not using cuDNN. Have you downloaded it and extracted it into your CUDA install directory? I'm not sure it helps with memory issues, but it will help with performance in general.

You can download it here (choose 5.1):
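
Once it's extracted, you can confirm Theano actually picks it up with something like this (a sketch, assuming the old theano.sandbox.cuda backend the course notebooks use; the newer gpuarray backend has a dnn_present() check instead):

```python
# Run with the GPU active, e.g. THEANO_FLAGS='device=gpu,floatX=float32'
from theano.sandbox.cuda import dnn

print(dnn.dnn_available())  # True once Theano can find and use cuDNN
print(dnn.version())        # the cuDNN version it linked against
```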


Thanks for the help! I got further by installing cuDNN correctly (I had extracted it into the wrong directory), and I can now use a batch size of 32.

However, now I'm getting the following error mentioning cuda or cuda0:

Everything seems to be training, and GPU utilization is at 99%. Is this an error, or is it working correctly?

Google searches make me think it's OK.

Not sure if I'm missing something, but the output you showed is not an error?

I edited the post to clarify. Is the message still an error, or is it working correctly?

It's not an error, just information 🙂
