Hi fellas. I wanted to ask whether any of you have had a chance to experiment with consumer RTX cards and NVLink.
There are plenty of articles comparing performance with and without NVLink, but AFAIK they all cover data parallelism (e.g. PyTorch’s DataParallel), and they show only very modest speedups for NVLink over the PCI Express bus.
However, true model parallelism (where the graph is split across multiple GPUs) would be a more interesting case for NVLink, particularly if the bridge allows the memory to be pooled, so that the model sees a single ‘big’ GPU.
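
To make clear what I mean by model parallelism, here is a minimal PyTorch sketch. The toy model, its layer sizes and the names `TwoStageNet` and `pick_devices` are made up for illustration, and it assumes two CUDA devices (it falls back to the CPU if they are not there). Note that this is manual placement, not pooled memory: each half lives in its own GPU's memory and the activations are copied across explicitly.

```python
import torch
import torch.nn as nn


def pick_devices():
    """Return two devices: separate GPUs if available, otherwise CPU for both."""
    if torch.cuda.is_available() and torch.cuda.device_count() >= 2:
        return torch.device("cuda:0"), torch.device("cuda:1")
    cpu = torch.device("cpu")
    return cpu, cpu


class TwoStageNet(nn.Module):
    """Hypothetical toy model whose two halves live on different devices."""

    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        # First half of the graph is placed on the first device
        self.stage0 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to(dev0)
        # Second half of the graph is placed on the second device
        self.stage1 = nn.Linear(4096, 10).to(dev1)

    def forward(self, x):
        h = self.stage0(x.to(self.dev0))
        # This copy is the inter-GPU transfer, i.e. the step where
        # the interconnect (PCIe or NVLink) would come into play
        return self.stage1(h.to(self.dev1))


dev0, dev1 = pick_devices()
model = TwoStageNet(dev0, dev1)
out = model(torch.randn(8, 1024))
print(out.shape, out.device)
```

In a setup like this, the `h.to(self.dev1)` copy runs on every forward pass (and its gradient on every backward pass), which is why I would expect the link between the cards to matter more here than with DataParallel. Running `nvidia-smi topo -m` should show how the two cards are actually connected.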