Smallify: Learning network size while training

Hi fellow Alumni!
Thanks to Soumith Chintala, I just came across this paper: https://arxiv.org/abs/1806.03723
It gives a clean and simple solution to learning network size: SwitchLayers (layers that learn a parameter beta for each neuron, based on that neuron's contribution to performance) are inserted after Dense/Conv layers (after BatchNorm, according to the paper), and the network is explicitly pruned during training, yielding a smaller (and less sparse) net at the end of the process.
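For anyone curious what that could look like in code, here's a minimal PyTorch sketch of my reading of the idea: a per-neuron learnable scale beta, an L1 penalty that pushes betas toward zero, and a threshold below which a neuron is considered prunable. The class name, the threshold value, and the penalty wiring are my own assumptions, not the authors' implementation (see their repo below for the real thing):

```python
import torch
import torch.nn as nn

class SwitchLayer(nn.Module):
    """Sketch of a Smallify-style switch: one learnable beta per neuron/channel.

    NOTE: this is my reading of the paper, not the authors' code; the names,
    threshold, and L1 wiring here are assumptions.
    """
    def __init__(self, num_features: int):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcast beta over the batch dim (and spatial dims for conv outputs).
        shape = [1, -1] + [1] * (x.dim() - 2)
        return x * self.beta.view(shape)

    def l1_penalty(self) -> torch.Tensor:
        # Sparsity-inducing term added to the task loss; drives unneeded
        # betas toward zero.
        return self.beta.abs().sum()

    def keep_mask(self, threshold: float = 1e-3) -> torch.Tensor:
        # Neurons whose |beta| falls below the threshold contribute almost
        # nothing and can be physically removed from the preceding layer.
        return self.beta.abs() >= threshold

# Placement as the paper suggests: Linear -> BatchNorm -> Switch -> activation.
block = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),
    SwitchLayer(256),
    nn.ReLU(),
)
```

During training you'd add something like `lambda_l1 * switch.l1_penalty()` to the loss, then periodically drop the rows/channels where `keep_mask()` is False; that explicit removal is what makes the final net small rather than merely sparse.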

The GitHub page seems to be this one: https://github.com/mitdbg/fastdeepnets, produced by Guillaume Leclerc for his master's thesis (amazing work).

What do you think? On paper, it looks like a really smart and simple way to tackle this problem!


I haven't seen this paper; I'll take a look. It seems related to the Lottery Ticket Hypothesis paper: https://arxiv.org/pdf/1803.03635.pdf