Great, thanks! I’ll take a look.
I was playing around earlier today and noticed that training was taking a really long time (using 1 GPU, just to get a sense of things). Data loading seems to be the bottleneck. I went looking around and found someone pointing to the tensorpack DataFlow library as a faster alternative to the PyTorch DataLoader. I think I’m going to try it out and see if it makes a difference.
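For what it's worth, the core idea behind DataFlow is pretty simple: a dataset is just an iterable of "datapoints", and transformations wrap one iterable in another, so pipelines compose cheaply and the parallelism (tensorpack's multiprocess runners) can be bolted on at any stage. Here's a minimal pure-Python sketch of that composition idea — the class names mirror tensorpack's `MapData`/`BatchData` but this is illustrative, not the real library:

```python
# Sketch of the DataFlow idea: datasets are iterables of datapoints,
# and each transformation wraps another iterable. (Illustrative only;
# tensorpack's actual classes live in tensorpack.dataflow.)

class MapData:
    """Apply a function to every datapoint of the wrapped dataflow."""
    def __init__(self, df, func):
        self.df, self.func = df, func

    def __iter__(self):
        for dp in self.df:
            yield self.func(dp)

class BatchData:
    """Group consecutive datapoints into fixed-size batches."""
    def __init__(self, df, batch_size):
        self.df, self.batch_size = df, batch_size

    def __iter__(self):
        batch = []
        for dp in self.df:
            batch.append(dp)
            if len(batch) == self.batch_size:
                yield batch
                batch = []

# Compose a tiny pipeline: square each element, then batch by 4.
source = range(8)
ds = MapData(source, lambda x: x * x)
ds = BatchData(ds, 4)
print(list(ds))  # -> [[0, 1, 4, 9], [16, 25, 36, 49]]
```

Because everything is lazy iteration, nothing is loaded until the training loop pulls from the final stage — which is part of why it can outrun a heavier abstraction when the per-item work is cheap.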