Is there a way to control which GPUs fast.ai allocates memory on in a multi-GPU machine?
What I'm seeing is fast.ai completely ignoring
os.environ["CUDA_VISIBLE_DEVICES"] = "3" and running on GPU 0 anyway, and only partially obeying
torch.cuda.set_device(3): it still takes about 1 GB of memory on GPU 0 even though it's supposedly restricted to GPU 3, where it uses about 8 GB of the 32 GB available. Could this be a problem with CUDA 11?
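For reference, this is roughly what I'm trying (a minimal sketch; the actual model and dataloader setup are omitted, and I've tried the two approaches separately rather than together, though the exact ordering in my real notebook may differ):

```python
import os

# Attempt 1: expose only GPU 3 to the process via the environment variable
os.environ["CUDA_VISIBLE_DEVICES"] = "3"

import torch
from fastai.vision.all import *  # actual learner/dataloader setup omitted

# Attempt 2 (tried on its own, without the env var above):
# torch.cuda.set_device(3)

# Sanity checks on what PyTorch actually sees
print(torch.cuda.device_count())    # I'd expect 1 here if the env var took effect
print(torch.cuda.current_device())  # the device fastai/PyTorch will allocate on
```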
To complicate matters, a colleague of mine running the same CUDA version doesn't have this problem.