Were you able to solve this? I am getting SIGKILLs when working with a train size > 10 million rows. I know there are open issues on the PyTorch forums that look to be about shared memory allocation when using num_workers > 0.
I tried increasing shmmax, but I am still not able to use multiprocessing when the number of iterations in an epoch is large.
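In case it helps, below is a minimal sketch of one workaround I've seen suggested for shared-memory exhaustion with DataLoader workers: switching torch.multiprocessing to the "file_system" sharing strategy (a documented PyTorch option). The dataset here is a tiny placeholder standing in for my real >10M-row data, and the batch size / worker counts are illustrative, not a confirmed fix:

```python
import torch
import torch.multiprocessing as mp
from torch.utils.data import DataLoader, TensorDataset

# Have workers share tensors through the file system instead of
# shared-memory file descriptors; this is a documented PyTorch
# sharing strategy that may avoid hitting shm limits.
mp.set_sharing_strategy("file_system")

# Placeholder dataset standing in for the real >10M-row training set.
dataset = TensorDataset(torch.randn(1000, 8), torch.randint(0, 2, (1000,)))

loader = DataLoader(
    dataset,
    batch_size=256,
    num_workers=4,            # >0 workers is what triggers the issue
    persistent_workers=True,  # reuse workers across epochs
)

if __name__ == "__main__":
    for epoch in range(2):
        for x, y in loader:
            pass  # training step would go here
```

Has anyone confirmed whether this actually avoids the SIGKILLs on large epochs, or is the process still being killed by the OOM killer regardless of the sharing strategy?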