I’m using ULMFiT to fine-tune a language model and a text classifier. With gradual unfreezing on the text classifier learner, I’m noticing that accuracy starts low again each time layers are unfrozen. It’s almost as if fine-tuning is starting over.
What is the expected behavior? Is it expected that accuracy dips until the newly unfrozen layers are fine-tuned, or should accuracy always be increasing?
My code for training the text classifier (after fine-tuning the language model) looks like this:
data_clas = TextClasDataBunch.from_df(path="", train_df=df_trn, valid_df=df_val, vocab=data_lm.train_ds.vocab, min_freq=1, bs=32)
data_clas.save()
clearn = text_classifier_learner(data_clas, arch=model, drop_mult=0.5) # get the learner
clearn.load_encoder('ft_enc') # load the encoder fine-tuned on the language model
clearn.freeze() # train only the classifier head first
clearn.purge()
torch.cuda.empty_cache()
clearn.fit_one_cycle(cyc_len=400, max_lr=1e-2, moms=(0.8, 0.7))
Accuracy starts out very low after each of the unfreezing steps below:
torch.cuda.empty_cache()
clearn.freeze_to(-2) # unfreeze the last two layer groups
clearn.fit_one_cycle(20, slice(1e-4,1e-2), moms=(0.8,0.7))
torch.cuda.empty_cache()
clearn.freeze_to(-3) # unfreeze the last three layer groups
clearn.fit_one_cycle(20, slice(1e-5,5e-3), moms=(0.8,0.7))
torch.cuda.empty_cache()
clearn.unfreeze() # unfreeze the whole model
clearn.fit_one_cycle(500, slice(1e-5,1e-3), moms=(0.8,0.7))
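For context, here is my mental model of what `freeze_to` and the `slice(...)` learning rates do per layer group. This is a minimal standalone sketch, not fastai's actual source; the layer-group names and both helper functions are hypothetical stand-ins:

```python
def freeze_to(groups, n):
    """Return {group_name: trainable}, where only the layer groups from
    index n onward are trainable (negative n counts from the end).
    Sketch of Learner.freeze_to semantics as I understand them."""
    k = n if n >= 0 else len(groups) + n
    return {g: i >= k for i, g in enumerate(groups)}

def lr_slice(start, stop, n_groups):
    """Spread learning rates geometrically from start (earliest group)
    to stop (head) -- my understanding of what slice(start, stop)
    produces for discriminative fine-tuning."""
    mult = (stop / start) ** (1 / (n_groups - 1))
    return [start * mult ** i for i in range(n_groups)]

groups = ["embedding", "rnn1", "rnn2", "rnn3", "head"]  # hypothetical names

print(freeze_to(groups, -2))   # only rnn3 and head trainable
print(lr_slice(1e-4, 1e-2, len(groups)))  # per-group lrs, smallest first
```

If this matches what the library does, each `freeze_to` call hands freshly trainable groups over to the optimizer, which would explain a temporary dip while those groups adapt.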