We are relying on PyTorch's default initialization of BatchNorm. From the source code:

    def reset_running_stats(self):
        if self.track_running_stats:
            self.running_mean.zero_()
            self.running_var.fill_(1)
            self.num_batches_tracked.zero_()

    def reset_parameters(self):
        self.reset_running_stats()
        if self.affine:
            init.uniform_(self.weight)
            init.zeros_(self.bias)

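So the running mean is initialized to 0 and the running variance to 1 (when track_running_stats is set). If affine is set, the weight is drawn from a uniform distribution and the bias is set to 0. Since init.uniform_ is called without bounds, it falls back on its defaults (a=0, b=1), so the weight is sampled from U(0, 1) rather than from a fan-in scaled range such as -1/sqrt(n_weights) to 1/sqrt(n_weights).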
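
A quick way to check these defaults is sketched below. It assumes PyTorch is installed; the layer size of 8 and the variable names are arbitrary choices for illustration. The weight of the freshly built layer is deliberately not asserted on, because it depends on the installed version (newer releases, as far as I know, set it to 1 instead of sampling it). The uniform draw from the quoted source is reproduced on a separate tensor instead.

```python
import torch
from torch import nn
from torch.nn import init

torch.manual_seed(0)  # only to make the sampled values reproducible

# Hypothetical layer: 8 features is an arbitrary choice.
bn = nn.BatchNorm1d(8)

# Running statistics and bias after construction.
mean_is_zero = bool(torch.all(bn.running_mean == 0))
var_is_one = bool(torch.all(bn.running_var == 1))
bias_is_zero = bool(torch.all(bn.bias == 0))
batches_tracked = int(bn.num_batches_tracked)

# Reproduce the weight init from the quoted source on a standalone tensor.
# init.uniform_ with no bounds uses its defaults a=0.0, b=1.0.
w = torch.empty(8)
init.uniform_(w)
w_min, w_max = float(w.min()), float(w.max())

print("running_mean all zero:", mean_is_zero)
print("running_var all one:", var_is_one)
print("bias all zero:", bias_is_zero)
print("num_batches_tracked:", batches_tracked)
print("uniform_ sample range:", w_min, w_max)
```

If a specific weight initialization matters for the experiment, it is safer to set it explicitly (for example with init.uniform_ or init.ones_ on bn.weight) than to depend on the library default.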