Multi-GPU Training - Tuning Parameters

If a model is running on multiple GPUs, what parameters can I tune to reduce training time? (A minimal sketch of the kind of setup I mean follows the list below.)

  1. Is it ok to increase the batch size?
  2. Is it ok to clip the gradient?
  3. Is it ok to increase the number of epochs?
  4. Is it ok to increase the learning rate?
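
For reference, here is a minimal sketch of the setup I have in mind. I am assuming PyTorch with `nn.DataParallel`; the model, batch size, learning rate, clipping threshold, and the linear scaling of batch size and learning rate with the GPU count are placeholders for illustration, not things I know to be correct:

```python
import torch
import torch.nn as nn

# Placeholder model for illustration only.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Replicate the model across available GPUs (assumed DataParallel setup).
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model = model.to(device)

num_gpus = max(torch.cuda.device_count(), 1)
base_batch_size = 64   # placeholder value
base_lr = 1e-3         # placeholder value

# Questions 1 and 4: scale the batch size with the GPU count, and scale
# the learning rate linearly with it (the common "linear scaling rule";
# whether this is safe is part of what I am asking).
batch_size = base_batch_size * num_gpus
lr = base_lr * num_gpus

optimizer = torch.optim.SGD(model.parameters(), lr=lr)
criterion = nn.CrossEntropyLoss()

# One training step on random data, with gradient clipping (question 2).
inputs = torch.randn(batch_size, 128, device=device)
targets = torch.randint(0, 10, (batch_size,), device=device)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # question 2
optimizer.step()
```

The linear learning-rate scaling and the `max_norm=1.0` clipping threshold are just common starting points I have seen; part of my question is whether they are reasonable defaults when the batch size grows with the number of GPUs.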