06_cuda_cnn_hooks_init.ipynb: Why is the model not re-assigned while the mini-batches are?

immarried · August 13, 2019, 7:00am

In 06_cuda_cnn_hooks_init.ipynb, there is this callback:

class CudaCallback(Callback):
    def begin_fit(self): self.model.cuda()
    def begin_batch(self): self.run.xb,self.run.yb = self.xb.cuda(),self.yb.cuda()

Why isn’t self.model.cuda() assigned to self.run.model or self.run.learn.model, like self.xb.cuda() is to self.run.xb?

TomB · August 13, 2019, 8:46am

Well, it’s sort of a style thing: you could assign the model, since cuda() returns the model itself. But moving a model happens in place, whereas moving a tensor does not, i.e.

>>> import torch
>>> from torch import nn
>>> t = torch.tensor([1.,22.,3.], device='cpu')
>>> c = t.cuda()
>>> c is t
False
>>> m = nn.Linear(1, 1)
>>> cm = m.cuda()
>>> m is cm
True

This reflects the underlying operation. The model object stays the same; only its parameters and buffers are moved to the GPU, so calling m runs on the GPU just like calling cm. In the tensor case, by contrast, t is still a valid CPU tensor and c is a separate copy on the GPU.
So you have to update xb and yb to point at the CUDA tensors, but you don’t have to update model.
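
If you want to check this without a GPU, here is a rough sketch that uses a dtype conversion (.double()) as a stand-in for .cuda(). That substitution is my own assumption for illustration, not something from the notebook, but the tensor-versus-module behaviour is the same: the tensor call gives you a new object, the module call changes the parameters in place and returns the module itself. The variable names are made up for the example, and the CUDA part only runs if a GPU is available.

```python
import torch
from torch import nn

# Tensor conversions return a new tensor; the original is left untouched.
t = torch.tensor([1., 22., 3.])
d = t.double()
tensor_is_same = d is t        # d is a separate float64 tensor
original_dtype = t.dtype       # t keeps its original float32 dtype

# Module conversions update the module's parameters in place and return self.
m = nn.Linear(1, 1)
dm = m.double()
module_is_same = dm is m       # same Python object
param_dtype = m.weight.dtype   # changed, even though m was never re-assigned

# The same pattern holds for device moves when a GPU is present.
if torch.cuda.is_available():
    c = t.cuda()
    assert c is not t and t.device.type == "cpu"
    assert m.cuda() is m and m.weight.device.type == "cuda"
```

So in the callback, self.model.cuda() on its own is enough, while the result of self.xb.cuda() has to be stored somewhere or it is simply lost.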