Tabular Transfer Learning and/or retraining with fastai

Hi Sylvain, I’ve made a lot of progress on tabular transfer learning. However, there are significant differences between the text, vision, and tabular models in terms of layers. I would like to know whether I need to transfer more than the embeddings in the module list…

In fast.ai text, the function load_pretrained() transfers several elements from the old state_dict() to the new state_dict():

  • 0.encoder.weight
  • 1.decoder.bias
  • 1.decoder.weight

We get those, for example, through:
dec_bias, enc_wgts = wgts.get('1.decoder.bias', None), wgts['0.encoder.weight']
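For context, here is a minimal sketch (plain PyTorch, with a made-up toy vocabulary) of the text-style idea I’m trying to adapt: the pretrained encoder rows are indexed by the old vocab, so each token in the new vocab gets its pretrained row when it exists, and a fallback (here the mean row) when it doesn’t. All names and data below are hypothetical, not fastai API:

```python
import torch

old_itos = ['the', 'cat', 'sat']          # pretrained vocab (hypothetical)
new_itos = ['the', 'dog', 'sat']          # target vocab (hypothetical)
old_stoi = {w: i for i, w in enumerate(old_itos)}

enc_wgts = torch.randn(len(old_itos), 4)  # stand-in for '0.encoder.weight'
mean_row = enc_wgts.mean(0)               # fallback for unseen tokens

# Build the new embedding matrix row by row, remapping by token.
new_wgts = torch.stack([
    enc_wgts[old_stoi[w]] if w in old_stoi else mean_row
    for w in new_itos
])

# 'the' and 'sat' keep their pretrained rows; 'dog' gets the mean row.
assert new_wgts.shape == (len(new_itos), 4)
assert torch.equal(new_wgts[0], enc_wgts[0])
```

The open question for tabular is whether each embeds.N.weight needs this kind of per-category remapping too, since the categories of the new dataset may not line up with the old ones.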

On the Adult dataset tabular example, here are the layers I get from state_dict(). We can see that they do not match layer names like decoder.bias in text:

embeds.0.weight
embeds.1.weight
embeds.2.weight
embeds.3.weight
embeds.4.weight
embeds.5.weight
embeds.6.weight
embeds.7.weight
embeds.8.weight
bn_cont.weight
bn_cont.bias
bn_cont.running_mean
bn_cont.running_var
bn_cont.num_batches_tracked
layers.0.weight
layers.0.bias
layers.2.weight
layers.2.bias
layers.2.running_mean
layers.2.running_var
layers.2.num_batches_tracked
layers.3.weight
layers.3.bias
layers.5.weight
layers.5.bias
layers.5.running_mean
layers.5.running_var
layers.5.num_batches_tracked
layers.6.weight
layers.6.bias

TabularModel(
(embeds): ModuleList(
(0): Embedding(10, 6)
(1): Embedding(17, 8)
(2): Embedding(17, 8)
(3): Embedding(8, 5)
(4): Embedding(16, 8)
(5): Embedding(7, 5)
(6): Embedding(6, 7)
(7): Embedding(3, 3)
(8): Embedding(43, 10)
)
(emb_drop): Dropout(p=0.0)
(bn_cont): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(layers): Sequential(
(0): Linear(in_features=65, out_features=200, bias=True)
(1): ReLU(inplace)
(2): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Linear(in_features=200, out_features=100, bias=True)
(4): ReLU(inplace)
(5): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Linear(in_features=100, out_features=2, bias=True)
)
)
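For concreteness, here is a minimal sketch of what I have in mind for the tabular side: copy only the embeds.*.weight tensors from the old state_dict into a fresh model, leaving every other layer at its new initialization. The function and the tiny stand-in model below are my own names, not fastai API:

```python
import torch
import torch.nn as nn

def transfer_embeds(old_sd, new_model):
    """Copy only the embeds.*.weight tensors whose shapes match,
    leaving every other layer at its fresh initialization."""
    new_sd = new_model.state_dict()
    for k, v in old_sd.items():
        if k.startswith('embeds.') and k in new_sd and new_sd[k].shape == v.shape:
            new_sd[k] = v.clone()
    new_model.load_state_dict(new_sd)
    return new_model

# Hypothetical stand-in for TabularModel's ModuleList of embeddings.
class TinyTab(nn.Module):
    def __init__(self):
        super().__init__()
        self.embeds = nn.ModuleList([nn.Embedding(10, 6), nn.Embedding(17, 8)])
        self.lin = nn.Linear(14, 2)

old, new = TinyTab(), TinyTab()
transfer_embeds(old.state_dict(), new)
assert torch.equal(new.embeds[0].weight, old.embeds[0].weight)
```

The shape check matters because two datasets can produce embeddings with different cardinalities for the same column, in which case a straight copy would fail.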

And the layers’ biases do not match any of the text structures at all. For example:
layers.6.bias: tensor([ 0.1803, -0.2174])
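That shape makes sense once you see where layers.6.bias comes from: it is the final Linear(in_features=100, out_features=2) head in the printout above, so it holds one bias per target class and is tied to this specific task. A tiny check in plain PyTorch:

```python
import torch.nn as nn

# Matches (6): Linear(in_features=100, out_features=2) in the printout.
head = nn.Linear(100, 2)
assert head.bias.shape == (2,)     # one bias per output class -> task-specific
```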

So again, my question is: which other layers do I need to transfer?

I’ll share my code as soon as I’ve fully rewritten the functions!
