SaveModelCallback in distributed training

I followed the tutorial and converted my notebook to work with multiple GPUs. One thing I noticed is that at the end of every epoch, my model is validated on all the GPUs, so the overall best model might be overwritten by a save from another process. I tried to give each best model a different name by appending the GPU ID to the model name, but it seems that only the model on device 0 is saved. Is there an example of how to save the best model when working with multiple GPUs?
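
To make the question concrete, here is a rough sketch of the behaviour I am after: every process tracks the best validation metric, but only rank 0 writes the checkpoint. This is not the library's implementation. `BestOnRankZero`, `get_rank`, and `save_fn` are hypothetical names made up for illustration, and the sketch assumes the launcher sets a `RANK` environment variable (as `torchrun` does) and that the metric passed in is already the same on every process.

```python
import os


def get_rank() -> int:
    """Return this process's rank (assumption: the launcher sets RANK)."""
    return int(os.environ.get("RANK", "0"))


class BestOnRankZero:
    """Hypothetical helper: track the best metric, save only from rank 0."""

    def __init__(self, save_fn, rank=None, higher_is_better=False):
        self.save_fn = save_fn  # called with the metric when a new best is found
        self.rank = get_rank() if rank is None else rank
        self.higher_is_better = higher_is_better
        self.best = None

    def step(self, metric: float) -> bool:
        """Update the best metric; return True only if this call saved."""
        if self.best is None:
            improved = True
        elif self.higher_is_better:
            improved = metric > self.best
        else:
            improved = metric < self.best
        if improved:
            # Every rank updates its own record so all ranks stay in sync.
            self.best = metric
            if self.rank == 0:
                # Only one process writes, so no checkpoint is overwritten.
                self.save_fn(metric)
        return improved and self.rank == 0


# Usage: a list stands in for the real save call (e.g. writing a checkpoint).
saved = []
tracker = BestOnRankZero(save_fn=saved.append, rank=0)
for val_loss in [0.9, 0.7, 0.8, 0.6]:
    tracker.step(val_loss)
```

With a real model, `save_fn` would wrap whatever save call the training loop uses. If each GPU validates on a different shard, the metric would presumably need to be averaged across processes before calling `step`, which this sketch does not do.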