I am trying to set up a lab for GPU-intensive computing at my college. The lab has around 60 PCs, all with Nvidia graphics processors. I want to set up a cluster of these PCs for deep learning. I came across scaling using Docker and Kubernetes, but I am not sure that this kind of deployment would be needed in my case. Can anyone point me in the right direction?
You could check this article by Tim Dettmers:
Thanks for helping. The approach in the article is nice, but it uses RDMA drivers, which only support Tesla and Quadro series GPUs. I have GeForce cards. Is there any other way?
AFAIK, no.
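To be clear, that "no" is only about RDMA/GPUDirect on GeForce cards. Multi-node training itself does not require RDMA; it just runs slower over ordinary Ethernet. A minimal sketch of how that could look with PyTorch's `torch.distributed` using the `gloo` backend (which runs over plain TCP) is below; the address, port, and single-process setup are placeholder assumptions for illustration, and on a real cluster each PC would run the script with its own rank:

```python
import os
import torch
import torch.distributed as dist

def init_process(rank: int, world_size: int, master_addr: str, master_port: str) -> None:
    # Gloo communicates over plain TCP, so no RDMA-capable NIC
    # (and no Tesla/Quadro GPU) is required.
    os.environ["MASTER_ADDR"] = master_addr
    os.environ["MASTER_PORT"] = master_port
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)

# Single-process demo so it runs on one machine; in a real deployment,
# world_size would be the number of PCs and each would pass its own rank.
init_process(rank=0, world_size=1, master_addr="127.0.0.1", master_port="29500")

t = torch.ones(3)
# all_reduce is the collective used to sum gradients across workers.
dist.all_reduce(t, op=dist.ReduceOp.SUM)
print(t.tolist())

dist.destroy_process_group()
```

With only one process the all-reduce is a no-op, but the same call is what synchronizes gradients once multiple machines join the process group. Whether this beats 60 independent single-GPU jobs depends on your network; consumer Ethernet is often the bottleneck.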