Model taking 36+ minutes per epoch on my system while the GPU is running at max capacity

I am learning from fastbook alongside Part 1 of the course. I was studying chapter 7 of the book, which covers SOTA models and TTA. I went through the theory and then tried to implement the code. But when I train the model, which is basically training from scratch, each epoch takes more than 36 minutes.

I have attached the time for the model's first epoch and the GPU utilization (so CUDA must be working properly).
My system is an Asus A-17 with an AMD Ryzen 5 4600H and a 1650Ti 4GB (laptop version).

I tried the same thing on Kaggle using the T4x2 GPU, where each epoch took only 3-4 minutes. What might be the issue here, and how can I make the computation faster?

From RAM usage I roughly infer the model to be 12GB.

The 16GB of the T4 fits the whole model.
The 4GB of the 1650Ti means there will be a lot of swapping between GPU and system memory.

Try a performance comparison with a model that easily fits in the smaller VRAM.
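
One way to check whether memory is the bottleneck is to compare the peak GPU memory used during a short run against the card's total. This is only a minimal sketch, assuming PyTorch (which fastai runs on). The names to_gib and gpu_memory_report are hypothetical helpers, not part of fastai, and the function returns None when PyTorch or a CUDA device is not available:

```python
def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024 ** 3


def gpu_memory_report(device=0):
    """Return total and peak-allocated GPU memory in GiB, or None without CUDA.

    Hypothetical helper for illustration only.
    """
    try:
        import torch  # imported lazily so the helper degrades gracefully
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # Total memory on the card, in bytes
    total = torch.cuda.get_device_properties(device).total_memory
    # Peak memory held by tensors in this process since start (or last reset)
    peak = torch.cuda.max_memory_allocated(device)
    return {"total_gib": to_gib(total), "peak_gib": to_gib(peak)}


# Call this after training for a few batches, then compare the two numbers.
print(gpu_memory_report())
```

The peak figure only covers tensors allocated by PyTorch in the current process, so treat it as a rough indicator. If it sits close to the total, a smaller model or a smaller batch size is probably worth trying.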



I switched from xresnet50 to xse_resnet18 and the training speed is now approx 4 minutes per epoch. Thanks for the help.

Runtime Environment > Change Runtime Type > Hardware Acceleration > GPU > Restart notebook
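
After changing the runtime, it may help to confirm from inside the notebook that PyTorch actually sees a GPU. A small sketch, assuming PyTorch is the backend. The function name cuda_status is a hypothetical helper used only for illustration:

```python
def cuda_status():
    """Return a short message describing whether PyTorch can use a CUDA GPU."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "CUDA not available: training will run on the CPU"
    # Name of the first visible CUDA device
    return f"CUDA available: {torch.cuda.get_device_name(0)}"


print(cuda_status())
```

If this reports that CUDA is not available, the slow epochs are most likely coming from CPU training rather than from the model size.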