Implementing a Normalization layer that works with Half Precision?

import torch
import torch.nn as nn

class multi(nn.Module):
    def __init__(self, num):
        super().__init__()
        # parameter stays fp32 unless explicitly converted
        self.num = nn.Parameter(torch.ones(num))
    def forward(self, x):
        return x * self.num
model = nn.Sequential(
    nn.Conv2d(3,10,3),
    multi(1),
    nn.Conv2d(10,10,3)
)
for layer in model:
    if not isinstance(layer, multi):
        layer.half()  # convert everything except the custom layer
print(next(model[0].parameters()).dtype, next(model[1].parameters()).dtype)
model=model.cuda()
test = torch.rand([1, 3, 64, 64], dtype=torch.float16, device="cuda")
print(model(test).dtype)
Error: RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

I made a simple error-throwing example above: multi takes a float16 input but outputs a float32, so the next Conv2d fails. I am trying to edit a normalization layer so that it is fp16-compatible like nn.BatchNorm. BatchNorm works fine, taking in an fp16 and outputting an fp16 even though its parameters are fp32:
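The failure comes from PyTorch's type promotion rules: multiplying an fp16 tensor by an fp32 parameter yields an fp32 result, which then reaches the half-precision conv. A minimal demonstration of the promotion, no model or GPU needed:

```python
import torch

a = torch.rand(2, dtype=torch.float16)   # stands in for the fp16 activations
w = torch.rand(2, dtype=torch.float32)   # stands in for the fp32 parameter
print((a * w).dtype)  # torch.float32 -- the fp16 operand is promoted
```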
test = torch.rand([1, 3, 64, 64], dtype=torch.float16, device="cuda")
nn.BatchNorm2d(3).cuda()(test).dtype  # torch.float16
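One way to mimic that behavior in a custom layer (a sketch, not a definitive fix): keep the parameter in fp32, do the math in fp32 for stability, and cast the result back to the input's dtype before returning. The class name `ScaleFP32` is hypothetical; this assumes the layer is excluded from `.half()` as in the loop above.

```python
import torch
import torch.nn as nn

class ScaleFP32(nn.Module):
    """Hypothetical scale layer: fp32 parameter, output matches input dtype."""
    def __init__(self, num):
        super().__init__()
        self.num = nn.Parameter(torch.ones(num))  # stays fp32

    def forward(self, x):
        # upcast the input, compute in fp32, then cast back to x's dtype
        return (x.float() * self.num).to(x.dtype)

x = torch.rand(1, 3, 4, 4, dtype=torch.float16)
print(ScaleFP32(1)(x).dtype)  # torch.float16
```

With this, an fp16 tensor goes in and an fp16 tensor comes out, so the following half-precision Conv2d no longer sees a dtype mismatch.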
Can provide more information as needed.