Weird behaviour of Learner

Hello, I am trying to implement the Lottery Ticket Hypothesis in PyTorch + fastai. Module backward hooks are still pretty broken, so I went with `register_hook` to freeze the pruned weights by setting their gradients to zero. But even with that, I saw non-zero weights in the pruned positions after pruning.

It was only later that I realized that the model seems to get reinitialized just before training, so the updates to the masked layers are wiped out (still not sure about this), killing a lot of progress. But what's weirder is that even when the incoming gradient is zero, the weights still change, irrespective of how small or large the weight values are. It is almost as if they shrink by some percentage each step. Below is the code.


import torch

def random_mask(model):
    mask_dict = {}
    for name, module in model.named_modules():
        if 'linear' in name:
            size = tuple(module.weight.shape)
            mask_dict[name] = (torch.randint(0, 2, size).bool().to(device='cuda'))
    return mask_dict        

def apply_mask(model, mask_dict):
    for name, module in model.named_modules():
        if 'linear' in name:
            module.weight.data *= mask_dict[name]
            print('module name is:', name, 'and weight size is:', module.weight.size())
            print('corresponding mask shape is:', mask_dict[name].shape)
            #module.weight.register_hook(lambda x: x*mask_dict[name])
    ### Registering a backward hook on each layer
    model.linear1.weight.register_hook(lambda x: x*mask_dict['linear1'])
    model.linear2.weight.register_hook(lambda x: x*mask_dict['linear2'])
    model.linear3.weight.register_hook(lambda x: x*mask_dict['linear3'])
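About the commented-out loop version: I believe it fails because Python closures look up `name` only when the hook fires, so every hook ends up using the last loop value. Binding the mask as a default argument captures the right one per layer. A standalone sketch with a toy model (the `linearN` names here are just stand-ins for the real model):

```python
import torch
import torch.nn as nn

# Toy stand-in for the real model (names assumed to match the 'linearN' scheme).
model = nn.Sequential()
model.add_module('linear1', nn.Linear(4, 4))
model.add_module('linear2', nn.Linear(4, 4))

mask_dict = {name: torch.randint(0, 2, tuple(m.weight.shape)).bool()
             for name, m in model.named_modules() if 'linear' in name}

# Late-binding pitfall: `lambda x: x * mask_dict[name]` would make every hook
# read the final value of `name`. A default argument freezes it per iteration:
for name, module in model.named_modules():
    if 'linear' in name:
        module.weight.register_hook(lambda g, m=mask_dict[name]: g * m)

model(torch.randn(2, 4)).sum().backward()
for name, module in model.named_modules():
    if 'linear' in name:
        print(name, 'masked grads all zero:',
              bool((module.weight.grad[~mask_dict[name]] == 0).all()))
```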

Is there any way I can turn off the Learner object's weight initialization? Or some other way to make sure the masked weights are not reinitialized?
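For what it's worth, here is a minimal standalone sketch (plain SGD rather than fastai's default optimizer, which I believe applies weight decay too) showing one way weights can still move even when the hook zeroes every gradient: weight decay folds `wd * weight` into the update regardless of the gradient, which would match the "shrinks by some percentage" behaviour:

```python
import torch

# A single masked parameter: the hook zeroes every gradient.
w = torch.nn.Parameter(torch.ones(3))
w.register_hook(lambda g: g * 0)

# Weight decay adds `wd * w` to the update even when the gradient is zero.
opt = torch.optim.SGD([w], lr=0.1, weight_decay=0.5)

(w ** 2).sum().backward()
print(w.grad)  # tensor([0., 0., 0.]) -- the hook worked
opt.step()
print(w.data)  # tensor([0.9500, 0.9500, 0.9500]) -- weights shrank anyway
```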