Lesson 06 multicat: to_cpu

yrodriguezmd · July 26, 2021, 9:59pm

In Lesson 06_multicat, at the start of the development of the Learner code, a to_cpu function is called:
[Screenshot: Screen Shot 2021-07-26 at 5.56.17 PM]

May I ask for the rationale for this please? I’m a DL beginner, coming from a non-CS domain.

Thank you!
Maria

GoofyMango · July 26, 2021, 11:09pm

Hi @yrodriguezmd,

The to_cpu function just moves the x and y tensors to the CPU. PyTorch tensors can live on either the CPU or the GPU, and so can PyTorch models. The model and the tensors you call it on have to be on the same device for the call to work.

We can check if a tensor is on the CPU or GPU by checking its device attribute. cpu means it’s on the CPU and cuda means it’s on the GPU.
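Here is a minimal sketch of the idea in plain PyTorch. The batch, the model, and the batch_to_cpu helper are hypothetical stand-ins for illustration, not the lesson's actual data or fastai's own to_cpu:

```python
import torch
from torch import nn

# Hypothetical stand-ins for a batch and a model (not the lesson's Learner).
x, y = torch.randn(4, 3), torch.randn(4, 1)
model = nn.Linear(3, 1)

print(x.device)                         # a freshly created tensor is on the CPU
print(next(model.parameters()).device)  # a freshly created module is on the CPU too

def batch_to_cpu(batch):
    "Illustrative helper (not fastai's to_cpu): move each tensor in a tuple to the CPU."
    return tuple(t.cpu() for t in batch)

if torch.cuda.is_available():
    x, y = x.cuda(), y.cuda()   # pretend the batch arrived on the GPU
    print(x.device)             # now reports a cuda device

x, y = batch_to_cpu((x, y))     # back on the CPU, matching the model
activs = model(x)               # works because model and input share a device
print(activs.shape)
```

If the model were on the CPU and x were still on the GPU, the call to model(x) would fail with a device mismatch error, which is why the batch is moved first.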

Here are some screenshots breaking down the book example and explaining things more:

Hope this helps!

Brandon

yrodriguezmd · July 26, 2021, 11:39pm

Dear Brandon,

Thank you for the explanation, I have a better idea now on device implications!

Follow-up questions:

1. I do not see any other code that pertains to the location of the model. Does the model always default to the CPU?
2. How is this cpu vs cuda location affected if I change my Colab runtime type (CPU to GPU for the hardware accelerator)?

Thank you!

Maria

GoofyMango · July 27, 2021, 5:29pm

Hi Maria,

I’m not too sure about these ones haha.

1. I’m not sure. At least in that lesson, the model was originally on the CPU. I think it probably gets moved to the GPU when you start training, but I don’t know whether it gets moved back to the CPU after training is complete.
2. My guess is that if you set your runtime to CPU, everything would always be on the CPU. In that case, calling .cuda() might raise an error or just not work. I’m not sure.
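One way to try this out on either runtime type is a small sketch like the following. It only uses plain PyTorch calls; the try_cuda helper is a hypothetical name for illustration, and I'd expect the failure branch to trigger only on a runtime without a GPU:

```python
import torch

# Pick whichever device this runtime actually offers.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("CUDA available:", torch.cuda.is_available())

t = torch.zeros(3).to(device)   # .to(device) works on both runtime types
print(t.device)

def try_cuda(tensor):
    "Hypothetical helper: attempt .cuda() and report the error instead of crashing."
    try:
        return tensor.cuda()
    except (RuntimeError, AssertionError) as e:
        print(".cuda() failed:", e)
        return tensor

moved = try_cuda(torch.zeros(3))
print(moved.device)
```

Running the same cell after switching the Colab runtime between CPU and GPU shows directly how the reported devices change.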

I encourage you to try these things out to see what happens. That’s what I did to solidify my knowledge of the .cpu() and .cuda() methods before answering your earlier question.

Brandon

yrodriguezmd · July 27, 2021, 7:22pm

Thank you!

yrodriguezmd · August 5, 2021, 4:42pm

Found the answer to follow-up question 1:

Lesson 13 (Convolutions) says that, by default, fastai puts data on the GPU when using data blocks.