What does the accelerate library do to my model when I train with 2 GPUs?


I just got my Kaggle notebook working on the 2 T4 GPUs using the accelerate library.
What I don’t understand right now is how this is helpful compared to running my model on only 1 GPU.
My mIoU metric isn’t working correctly yet (I haven’t managed to fix it), so I can’t tell whether there is any improvement there.

I ran a couple of experiments with 2000 images in the training dataloader and 250 in the validation dataloader (NVIDIA DALI dataloaders, btw), batch_size=64:
start = time.time()
notebook_launcher(training_loop, args, num_processes=1)
print("Total duration: ", time.time() - start)

Result: 147 seconds; 6.5 GB RAM on the CPU; 9.7 GB on the GPU

start = time.time()
notebook_launcher(training_loop, args, num_processes=2)
print("Total duration: ", time.time() - start)

Result: 310 seconds; 10.5 GB RAM on the CPU; 10 GB on each of the two GPUs

I expected that accelerate might split my model across the 2 GPUs so that I could make it bigger, but that doesn’t seem to be the case…
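For what it’s worth, the memory numbers above are consistent with the model being replicated on each GPU rather than split (data parallelism, which is Accelerate’s default behavior for multi-process launches). Under that assumption, each process sees its own shard of the data, so with the same per-process batch_size the effective batch doubles and the number of optimizer steps per epoch roughly halves. A quick sketch of that arithmetic, using the numbers from my experiment (and assuming batch_size=64 is per-process):

```python
import math

# Numbers from the experiment above
train_images = 2000
batch_size = 64          # assumed to be the per-process batch size

# Single GPU: one process consumes the whole dataset
steps_1gpu = math.ceil(train_images / batch_size)

# Two GPUs under data parallelism: the model is replicated on each
# device, the data is sharded, so the effective batch size doubles
num_processes = 2
effective_batch = batch_size * num_processes
steps_2gpu = math.ceil(train_images / effective_batch)

print(steps_1gpu, steps_2gpu)  # 32 steps/epoch vs 16 steps/epoch
```

So the 2-GPU run does less sequential work per epoch but pays synchronization overhead on every step, which could explain why wall-clock time got worse here rather than better.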