Have you looked at GPU utilization? Is it low? See my comment here:
If GPU utilization is low while CPU utilization is high, it's much more plausible that the training is CPU-bound.
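
In case it helps, here is a rough sketch of that check in Python. It assumes an NVIDIA GPU with `nvidia-smi` on the PATH. The helper names and the 30% / 80% thresholds are my own illustrative choices, not established cutoffs, so adjust them for your setup.

```python
import subprocess


def parse_gpu_util(text):
    """Parse nvidia-smi output with one utilization percentage per line (one line per GPU)."""
    return [float(line) for line in text.splitlines() if line.strip()]


def read_gpu_util():
    """Query nvidia-smi for the current utilization of each GPU, in percent."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_util(result.stdout)


def looks_cpu_bound(gpu_utils, cpu_util, gpu_low=30.0, cpu_high=80.0):
    """Return True when every GPU is mostly idle while the CPU is busy.

    gpu_low and cpu_high are assumed thresholds for illustration only.
    """
    if not gpu_utils:
        return False
    return max(gpu_utils) < gpu_low and cpu_util > cpu_high
```

While training is running, you could call `looks_cpu_bound(read_gpu_util(), cpu_util)`, with `cpu_util` taken from `top` or from `psutil.cpu_percent(interval=1)` if you have psutil installed. A single reading is noisy, so sample it a few times before drawing a conclusion.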