Solved - Does Windows create a CPU bottleneck?

bdubreu · October 25, 2019, 12:42pm

Hi everybody,

I’m trying to train an EfficientNet on my local RTX 2060 GPU, but a single epoch takes several hours. I suspect multiprocessing is failing, because I remember reading that this can happen on Windows.

Also, here is some info:

As you can see, CPU usage (‘processeur’ in French) is above 50% (it can go up to 100% if I put my laptop in ‘sport’ mode, and then the epoch time decreases), but GPU usage is at 0%…

Several questions:

Can I trust these stats?
If yes, the CPU is apparently the bottleneck. What can I do? Would installing Ubuntu solve this? I’ve had no error telling me multiprocessing wasn’t working, but the hints are strong. What do you think?

Note: my dataset is very large, but I’m using 112px images.
I’ve only recently bought that stuff and I’m trying to get everything running here, so sorry if the question is noobish or something.

Thanks for your time!

sgugger · October 25, 2019, 12:51pm

PyTorch multiprocessing doesn’t really work on Windows. This issue has been reported multiple times, and I’m not sure if or how they fixed it. It definitely works on Ubuntu.
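
For context, one likely reason (an assumption, not something confirmed in this thread): Windows has no `fork`, so Python's `multiprocessing` starts workers with the `spawn` method, which re-imports the main script in every worker. Code that launches workers therefore has to sit behind an `if __name__ == "__main__":` guard, which is awkward in notebooks. Below is a minimal standard-library sketch of that pattern. `square` and `main` are hypothetical names, and a PyTorch `DataLoader` with `num_workers > 0` is assumed to be subject to the same rule.

```python
import multiprocessing as mp


def square(x):
    # Worker functions must live at module level so that spawned
    # processes can import them by name.
    return x * x


def main():
    # "spawn" is the default on Windows; Linux has traditionally used "fork".
    print("start method:", mp.get_start_method())
    with mp.Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # Without this guard, each spawned worker would re-run the
    # process-launching code when it re-imports this script.
    print(main())
```

If workers still misbehave, setting `num_workers=0` on the `DataLoader` keeps loading in the main process. That is slower, but it sidesteps multiprocessing entirely.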

bdubreu · October 25, 2019, 12:57pm

That’s what I thought. So you think that, despite the code showing no error, the CPU is just passing the GPU one image at a time or something, and that this creates a bottleneck?

Thank you for taking the time to answer despite the business (as in busy-ness) around v2!

sgugger · October 25, 2019, 1:03pm

It’s passing them as batches, as the code intends, but the multiprocessing used to create those batches in parallel isn’t really working, so the GPU likely spends most of its time waiting for the next batch.
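
One way to check this rather than guess is to time how long the loop waits for each batch versus how long it spends in the training step. This is a rough sketch under stated assumptions: `profile_loop` and `slow_batches` are hypothetical helpers, `batches` stands in for your data loader, and `step` stands in for your forward/backward pass. On a GPU, `step` should end with `torch.cuda.synchronize()`, otherwise the asynchronous kernels make the compute time look smaller than it is.

```python
import time


def profile_loop(batches, step, max_batches=20):
    """Return (seconds spent waiting for batches, seconds spent in step)."""
    wait = work = 0.0
    it = iter(batches)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)           # time spent here is compute
        t2 = time.perf_counter()
        wait += t1 - t0
        work += t2 - t1
    return wait, work


def slow_batches(n, delay=0.01):
    # Stand-in for a loader that is slow to produce each batch.
    for i in range(n):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    wait, work = profile_loop(slow_batches(5), lambda batch: None)
    print(f"waiting for data: {wait:.3f}s, compute: {work:.3f}s")
```

If the waiting time dominates, the input pipeline is the bottleneck and the GPU is mostly idle, which would match what you are seeing.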

bdubreu · October 25, 2019, 1:09pm

OK, that makes sense. Thank you so much for your time!