Thoughts on using Resnet-type layers for structural NN?

In Part 1, Lesson 7, we’re introduced to the ResNet idea, and I like how @jeremy explains that it’s basically boosting.
Now, boosted tree ensembles (and Random Forests) work great on structured data. So I was wondering whether anyone has tried enhancing the simple tabular learner in the fastai library with something that uses a ResNet-style layer?
Here is what I tried:

import torch.nn as nn
import torch.nn.functional as F
from typing import Optional

class ResBlock(nn.Module):
    "Identity skip connection around a linear -> ReLU -> batchnorm branch."

    def __init__(self, ni:int):
        super().__init__()
        self.lin = nn.Linear(ni, ni)
        self.bn_layer = nn.BatchNorm1d(ni)

    def forward(self, x):
        # residual branch: linear -> ReLU -> batchnorm,
        # added back onto the untouched input (the skip connection)
        return x + self.bn_layer(F.relu(self.lin(x)))

def res_drop_lin(n_in:int, n_out:int, bn:bool=True, p:float=0., actn:Optional[nn.Module]=None):
    "`n_in`->ResBlock->dropout->linear(`n_in`,`n_out`)->`actn`"
    # same signature as fastai's bn_drop_lin, but `bn` now toggles the ResBlock
    layers = [ResBlock(n_in)] if bn else []
    if p != 0: layers.append(nn.Dropout(p))
    layers.append(nn.Linear(n_in, n_out))
    if actn is not None: layers.append(actn)
    return layers

And then just replace the bn_drop_lin call in TabularModel with res_drop_lin.
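
For context, here is a rough sketch of how the hidden-layer stack could be assembled with res_drop_lin. It only mimics how I understand TabularModel to chain its blocks; the layer sizes, dropout probabilities and the helper name res_tabular_body are made up for illustration, not the actual fastai internals:

def res_tabular_body(sizes, ps, use_bn=True):
    # stack res_drop_lin blocks: ReLU between hidden layers, no activation on the output
    actns = [nn.ReLU(inplace=True)] * (len(sizes) - 2) + [None]
    layers = []
    for i, (n_in, n_out, p, actn) in enumerate(zip(sizes[:-1], sizes[1:], ps, actns)):
        layers += res_drop_lin(n_in, n_out, bn=use_bn and i != 0, p=p, actn=actn)
    return nn.Sequential(*layers)

# e.g. 60 input features (embeddings + continuous), hidden layers of 200 and 100, 2 classes
body = res_tabular_body([60, 200, 100, 2], ps=[0., 0.1, 0.1])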

Basically, what this does in contrast to the plain TabularModel is that instead of just doing a batch norm, we run the input through a linear layer, a ReLU and a batch norm, and add the result back onto the input (the residual connection).
I tried to see if it makes a difference on the fastai v1 example (the adult dataset), but I guess that with only two hidden layers there, the ResNet idea can't quite shine.
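
As a quick sanity check that the block keeps the feature dimension (so it drops into the existing layer sizes without any other changes), something like this is enough; the sizes here are arbitrary:

import torch

blk = ResBlock(10)
blk.eval()              # eval mode uses the (initial) running stats of the batch norm
x = torch.randn(4, 10)
print(blk(x).shape)     # torch.Size([4, 10]) -- same shape as the input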

What do people think about this? Is it something worth digging deeper into, or are there obvious reasons why it won't do much good?
