Smallify: Learning network size while training

Hi fellow Alumni!
Thanks to Soumith Chintala I just came across this paper:
It gives a clean and simple solution to learning network size: SwitchLayers (layers that learn a parameter beta for each neuron, based on its contribution to performance) are inserted after Dense/Conv layers (after BatchNorm, according to the paper), and the network is explicitly pruned during training, yielding a smaller (and less sparse) net at the end of the process.
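To make the idea concrete, here is a minimal sketch of such a switch layer in plain NumPy. This is my own illustration, not the paper's code: the class name, the L1 coefficient, and the pruning threshold are assumptions, and training of beta (which would happen by backprop alongside the other weights) is omitted.

```python
import numpy as np

class SwitchLayer:
    """Per-neuron learnable scale (beta), placed after a Dense/Conv
    (or BatchNorm) layer. An L1 penalty on beta pushes switches toward
    zero; neurons whose |beta| falls below a threshold are pruned.
    Illustrative sketch only -- hyperparameter values are assumptions."""

    def __init__(self, n_neurons, l1=1e-2, threshold=1e-3):
        self.beta = np.ones(n_neurons)  # one switch per neuron
        self.l1 = l1
        self.threshold = threshold

    def forward(self, x):
        # Scale each neuron's activation by its switch value.
        return x * self.beta

    def penalty(self):
        # L1 term added to the training loss; drives betas to zero.
        return self.l1 * np.abs(self.beta).sum()

    def prune_mask(self):
        # Neurons to KEEP: those whose switch is still "on".
        return np.abs(self.beta) >= self.threshold

# Toy usage: pretend training drove two of four betas to ~0.
layer = SwitchLayer(4)
layer.beta = np.array([0.9, 1e-5, 0.4, -2e-4])
keep = layer.prune_mask()                  # [True, False, True, False]
acts = np.array([[1.0, 2.0, 3.0, 4.0]])
pruned = acts[:, keep] * layer.beta[keep]  # dense layer shrinks to 2 units
```

The point of pruning during training (rather than after) is that the surviving neurons stay dense, so the final network is genuinely smaller, not just sparse.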

The GitHub page seems to be this one: produced by Guillaume Leclerc for his master's thesis (amazing).

What do you think? On paper, it really seems like a smart and simple way to tackle this problem!


I haven’t seen this paper; I will take a look. Seems related to