Hey there,
I’m a proud fastai user and I really love what you are doing for the community.
I have implemented a few text models with ULMFiT in fastai 0.7.
Since fastai 1.0 seems to be much better designed, I’d like to migrate from 0.7 to 1.0.
Code-wise it’s pretty straightforward, but I have a problem migrating the weights that I trained in the past.
See below to understand the issue.
The old language model had this definition:
SequentialRNN(
  (0): RNN_Encoder(
    (encoder): Embedding(30002, 400, padding_idx=1)
    (encoder_with_dropout): EmbeddingDropout(
      (embed): Embedding(30002, 400, padding_idx=1)
    )
    (rnns): ModuleList(
      (0): WeightDrop(
        (module): LSTM(400, 1150)
      )
      (1): WeightDrop(
        (module): LSTM(1150, 1150)
      )
      (2): WeightDrop(
        (module): LSTM(1150, 400)
      )
    )
    (dropouti): LockedDropout()
    (dropouths): ModuleList(
      (0): LockedDropout()
      (1): LockedDropout()
      (2): LockedDropout()
    )
  )
  (1): LinearDecoder(
    (decoder): Linear(in_features=400, out_features=30002, bias=False)
    (dropout): LockedDropout()
  )
)
The new version looks like this:
SequentialRNN(
  (0): RNNCore(
    (encoder): Embedding(30002, 400, padding_idx=1)
    (encoder_dp): EmbeddingDropout(
      (emb): Embedding(30002, 400, padding_idx=1)
    )
    (rnns): ModuleList(
      (0): WeightDropout(
        (module): LSTM(400, 1150, batch_first=True)
      )
      (1): WeightDropout(
        (module): LSTM(1150, 1150, batch_first=True)
      )
      (2): WeightDropout(
        (module): LSTM(1150, 400, batch_first=True)
      )
    )
    (input_dp): RNNDropout()
    (hidden_dps): ModuleList(
      (0): RNNDropout()
      (1): RNNDropout()
      (2): RNNDropout()
    )
  )
  (1): LinearDecoder(
    (decoder): Linear(in_features=400, out_features=30002, bias=True)
    (output_dp): RNNDropout()
  )
)
I’ve noticed a difference in WeightDropout: it requires one more matrix, '0.rnns.0.module.weight_hh_l0', which is '0.rnns.0.weight_hh_l0_raw' after dropout has been applied. I’m also not sure whether LockedDropout and RNNDropout are the same thing.
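To make the question concrete, here is a minimal sketch of the kind of key renaming I have in mind. The module names are taken from the two printouts above. The old key name for the raw hidden-to-hidden weights ('0.rnns.{i}.module.weight_hh_l0_raw') and the helper name remap_lm_keys are assumptions on my side, and this has not been checked against a real checkpoint.

```python
from collections import OrderedDict

# Module renames read off the two printouts (0.7 prefix -> 1.0 prefix).
MODULE_RENAMES = [
    ("0.encoder_with_dropout.embed.", "0.encoder_dp.emb."),
]


def remap_lm_keys(old_state, n_layers=3):
    """Return a new dict keyed with 1.0-style names; values are reused as-is."""
    new_state = OrderedDict()
    for key, value in old_state.items():
        new_key = key
        for old_prefix, new_prefix in MODULE_RENAMES:
            if new_key.startswith(old_prefix):
                new_key = new_prefix + new_key[len(old_prefix):]
        new_state[new_key] = value

    for i in range(n_layers):
        # ASSUMPTION: 0.7 stores the raw (pre-dropout) matrix under this key.
        old_raw = f"0.rnns.{i}.module.weight_hh_l0_raw"
        if old_raw in new_state:
            raw = new_state.pop(old_raw)
            # 1.0 expects the raw matrix on the WeightDropout wrapper ...
            new_state[f"0.rnns.{i}.weight_hh_l0_raw"] = raw
            # ... plus the extra matrix on the wrapped LSTM itself.
            new_state[f"0.rnns.{i}.module.weight_hh_l0"] = raw
    return new_state
```

The new decoder is printed with bias=True while the old one has bias=False, so a '1.decoder.bias' entry would still be missing after this renaming. I assume it would have to be filled in by hand (for example with zeros) or skipped by loading with strict=False.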
My main focus here is migrating the language models, as I trained them on specific sources and it took days…
If possible, I’d also like to migrate the classifiers, but I guess that isn’t feasible.
Old model’s head:
(1): PoolingLinearClassifier(
  (layers): ModuleList(
    (0): LinearBlock(
      (lin): Linear(in_features=1200, out_features=50, bias=True)
      (drop): Dropout(p=0.48)
      (bn): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): LinearBlock(
      (lin): Linear(in_features=50, out_features=6, bias=True)
      (drop): Dropout(p=0.1)
      (bn): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
)
New model’s head:
(1): PoolingLinearClassifier(
  (layers): Sequential(
    (0): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (1): Dropout(p=0.4)
    (2): Linear(in_features=1200, out_features=50, bias=True)
    (3): ReLU(inplace)
    (4): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): Dropout(p=0.1)
    (6): Linear(in_features=50, out_features=3, bias=True)
  )
)
Here the BatchNorm and Linear layers are printed in a different order, so I’d expect the weights to be different as well.
Does any of you have a script for migrating a language model?
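One thing that makes me less sure: in both printouts the BatchNorm1d sizes (1200 and 50) equal the input sizes of the Linear layers, so the old LinearBlock may also apply bn before lin and only list them in a different order. If that assumption holds, and the number of output classes is the same in both heads, the renaming could be as simple as the sketch below. The prefix table and the helper name remap_head_keys are hypothetical.

```python
# ASSUMPTION: old LinearBlock runs bn -> drop -> lin, so each block lines up
# with a (BatchNorm1d, Dropout, Linear) triple in the new Sequential.
HEAD_PREFIXES = {
    "1.layers.0.bn.": "1.layers.0.",
    "1.layers.0.lin.": "1.layers.2.",
    "1.layers.1.bn.": "1.layers.4.",
    "1.layers.1.lin.": "1.layers.6.",
}


def remap_head_keys(old_state):
    """Rename classifier-head keys; all other keys pass through unchanged."""
    new_state = {}
    for key, value in old_state.items():
        for old_prefix, new_prefix in HEAD_PREFIXES.items():
            if key.startswith(old_prefix):
                key = new_prefix + key[len(old_prefix):]
                break
        new_state[key] = value
    return new_state
```

Dropout and ReLU have no parameters, so they would not need entries in the table.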
All best,
Mateusz