Hi friends, I’ve been experimenting with the fastai.text module for some time, but maybe this question of mine still sounds a bit silly to you… if a SequentialRNN resets the weights of its encoder trunk, why bother using transfer learning in the first place?
It doesn’t reset the weights; it resets the current hidden state. Every time the model starts processing a new batch/sequence, it begins from a fresh hidden state instead of carrying over the state from the previous one. The pretrained weights are untouched, and those are exactly what transfer learning gives you.
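Here is a minimal sketch of that distinction in plain PyTorch (this is not fastai’s actual class; `StatefulRNN` and its sizes are made up for illustration). `reset()` throws away the carried-over hidden state, while the learned parameters in `self.emb` and `self.rnn` are never touched:

```python
import torch
import torch.nn as nn

class StatefulRNN(nn.Module):
    """Toy stateful RNN: carries its hidden state across batches."""
    def __init__(self, vocab_sz=100, emb_sz=32, n_hid=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_sz, emb_sz)
        self.rnn = nn.GRU(emb_sz, n_hid, batch_first=True)
        self.h = None  # current hidden state, carried between forward calls

    def forward(self, x):
        if self.h is None:
            # first batch after a reset: start from a zero hidden state
            self.h = torch.zeros(1, x.size(0), self.rnn.hidden_size)
        out, self.h = self.rnn(self.emb(x), self.h)
        self.h = self.h.detach()  # truncate backprop between batches
        return out

    def reset(self):
        # forget the carried-over STATE; the WEIGHTS (self.emb, self.rnn)
        # keep everything learned during pretraining
        self.h = None
```

You can verify this yourself: `model.state_dict()` is byte-for-byte identical before and after `model.reset()`; only `model.h` changes.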
It resets the hidden states, not the weights. I learned that the hard way. After training a model for 5 hours, I realized my code wasn’t calling reset(). What happened? My predictions were all the same.
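Continuing the toy sketch above (again, an illustration, not fastai’s API): if you skip `reset()` at inference time, every prediction starts from whatever stale state the last training batch left behind, which is one way you can end up with near-identical outputs for different inputs:

```python
model = StatefulRNN()
model.eval()
with torch.no_grad():
    model.reset()  # start this sequence from a clean hidden state
    tokens = torch.randint(0, 100, (1, 10))  # dummy token ids
    preds = model(tokens)
```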