I have previously been running predictions on a system with dual GTX 1080 Ti GPUs. Predictions used a batch size of 3 with fp16. This configuration fit comfortably within the cards' 11 GB of memory, typically peaking around 9.5 GB.
I have now cloned the software environment of that system onto a new machine with dual RTX 2080 Ti GPUs. When running predictions with the same input data and model as on the older system, I get an out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 1.22 GiB (GPU 1; 11.00 GiB total capacity; 6.97 GiB already allocated; 102.34 MiB free; 1.41 GiB cached)
So it seems there is some interplay between the driver and the new cards that causes memory to be more fragmented, or at least less available, than on the older GPUs.
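For reference, here is a minimal sketch of how per-device memory can be inspected to compare the two machines (PyTorch 1.0.1, as in my environment, exposes `torch.cuda.memory_cached`, which later releases renamed `memory_reserved`):

```python
import torch

def report_gpu_memory():
    """Print allocated vs. cached memory per GPU to gauge fragmentation."""
    for i in range(torch.cuda.device_count()):
        alloc = torch.cuda.memory_allocated(i) / 1024 ** 3
        cached = torch.cuda.memory_cached(i) / 1024 ** 3  # memory_reserved in newer PyTorch
        print("gpu%d: %.2f GiB allocated, %.2f GiB cached" % (i, alloc, cached))

# Releasing cached blocks back to the driver can sometimes free up headroom:
torch.cuda.empty_cache()
```

A large gap between "cached" and "allocated" right before the failing allocation would support the fragmentation theory.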
I’m curious whether anyone else has run across this issue or has suggestions or workarounds.
Current driver is:
| NVIDIA-SMI 418.96 Driver Version: 418.96 CUDA Version: 10.1 |
I’ve already tried installing the latest NVIDIA driver (425.25), but got the same result.
Full configuration info below:
=== Software ===
python        : 3.7.2
fastai        : 1.0.50.post1
fastprogress  : 0.1.20
torch         : 1.0.1
torch cuda    : 10.0 / is available
torch cudnn   : 7401 / is enabled

=== Hardware ===
torch devices : 2
  - gpu0 : GeForce RTX 2080 Ti
  - gpu1 : GeForce RTX 2080 Ti

=== Environment ===
platform  : Windows-10-10.0.16299-SP0
conda env : fastai_v1
python    : C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\python.exe
sys.path  :
  C:\Users\lproc
  C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\python37.zip
  C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\DLLs
  C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\lib
  C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1
  C:\Users\lproc\AppData\Local\Continuum\anaconda3\envs\fastai_v1\lib\site-packages
no nvidia-smi is found