Hi there, I have a quick question. When I use AWD_LSTM in my `language_model_learner`, I save the encoder, and then I build this classifier: `classifier = text_classifier_learner(data_clas, Transformer, drop_mult=0.2)`
It doesn't work, and I get this error:
```
Missing key(s) in state_dict: "pos_enc.weight"…
Unexpected key(s) in state_dict: "encoder_dp.emb.weight"…
size mismatch for encoder.weight: copying a param with shape torch.
```
Is it because the two models have different architectures (one is an RNN, the other is based on attention layers), or is it something else?
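For context, here is a minimal PyTorch sketch (the class names `RNNEncoder` and `AttnEncoder` are made up, not the real fastai classes) that reproduces the same kind of mismatch: saving a state_dict from one architecture and loading it into another with different parameter names and shapes produces exactly this pattern of missing keys, unexpected keys, and size mismatches.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for an AWD_LSTM-style encoder: embedding + LSTM.
class RNNEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Embedding(100, 16)
        self.rnn = nn.LSTM(16, 16)

# Hypothetical stand-in for a Transformer-style encoder:
# embedding (different width) + learned positional encoding.
class AttnEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Embedding(100, 32)
        self.pos_enc = nn.Embedding(512, 32)

state = RNNEncoder().state_dict()   # "save the encoder" from the LSTM model
try:
    AttnEncoder().load_state_dict(state)  # load it into a different architecture
except RuntimeError as e:
    # The message lists missing keys (pos_enc.weight), unexpected keys
    # (the rnn.* parameters), and a size mismatch for encoder.weight.
    print(e)
```

So yes, the loading mechanism itself simply matches parameter names and shapes; any mismatch between the saved architecture and the target one fails this way.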