As I understand it, current best practice for deep learning is to use the largest batch size you can fit into GPU memory. Is there a fast way to see how much memory a DataBunch will take up, without having to change the batch size retroactively after the model is trained?
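The closest I've come is a back-of-envelope estimate of what one batch of input tensors occupies; this is just a sketch of my own (the function name and numbers are made up for illustration), and it's only a lower bound since the model's activations and gradients take far more memory on top of it:

```python
def batch_input_bytes(batch_size, channels, height, width, bytes_per_element=4):
    """Rough bytes needed to hold one batch of float32 image tensors.

    This counts only the input batch itself, not activations,
    gradients, or optimizer state, so it is a lower bound.
    """
    return batch_size * channels * height * width * bytes_per_element

# e.g. a batch of 64 RGB images at 224x224 in float32:
mib = batch_input_bytes(64, 3, 224, 224) / 2**20
print(f"{mib:.2f} MiB")  # ~37 MiB for the inputs alone
```

What I'm hoping for is something that accounts for the whole training footprint, not just the raw input tensors.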