@stas I want to follow up on your excellent discussion about GPU memory on the PyTorch forum. In short, I can't quite replicate your result and want to understand where I've gone wrong. Do you have a script/notebook that demonstrates the "use of what is free" behaviour on the first step of building the MNIST model? When I try to replicate it (here is my gist), memory use seems to go beyond what is "free" when I start the learn epoch; that is, PyTorch seems to take whatever memory it needs. Maybe my method of using up the GPU memory isn't right (`torch.ones((n, n)).cuda()`)? Any advice/suggestions are appreciated.
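For reference, here is roughly what I'm doing to pre-fill the GPU (a minimal sketch, not my exact gist; I'm using `torch.cuda.mem_get_info()` only to report the driver's free/total bytes, and `n = 8192` is just an example size):

```python
import torch

def tensor_bytes(n: int) -> int:
    # An n x n float32 tensor occupies n * n * 4 bytes.
    return n * n * 4

if torch.cuda.is_available():
    free_before, total = torch.cuda.mem_get_info()
    n = 8192  # example size; ~256 MiB of float32
    filler = torch.ones((n, n), device="cuda")  # my "use up memory" step
    free_after, _ = torch.cuda.mem_get_info()
    print(f"allocated ~{tensor_bytes(n) / 2**20:.0f} MiB; "
          f"free: {free_before / 2**20:.0f} -> {free_after / 2**20:.0f} MiB")
```

Note that the drop in free memory reported by the driver can be larger than the tensor itself, since PyTorch's caching allocator reserves memory in blocks, which may be part of what I'm mismeasuring.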