ULMFiT train_clas stuck at around 50% accuracy on IMDB


(Victor Sun) #1

I’ve been running imdb_scripts to reproduce the ULMFiT results on the IMDB dataset (Table 7 in the paper). The LM pretraining and LM fine-tuning steps worked fine. When fine-tuning the classifier, the code worked for the full ULMFiT technique (freezing, discriminative fine-tuning, and slanted triangular learning rates), reaching around 94 percent accuracy. However, with every other technique I’ve tried (freezing only, full model tuning only, and training from scratch), I’m getting 50% validation accuracy on a binary classification problem, so something is definitely not working properly.

For the freezing-only run, accuracy started at 92 percent while only the last 2 layers were unfrozen. However, as soon as regular training started, it dropped to 50 percent. I think there may be a problem with saving the intermediate class file, but I’m unable to pinpoint it.

I also had issues running the full model and from-scratch variants. Since those do not employ freezing, they sat at 50% as soon as the regular schedule started and hovered around it. Also, for the full model, I was not sure what to use for the startat flag, because setting it to 0 would use freezing and setting it to 1 would attempt to load an uninitialized intermediate class file.

I think it’s a similar issue in all 3 cases: once training starts with the regular schedule, the model only reaches 50% accuracy and is likely not using the values loaded from the LM and the intermediate class file.
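One quick way to narrow this down is to check whether the 50% comes from the classifier predicting a single class for every review, which on a balanced binary set like IMDB gives exactly chance-level accuracy. Below is a minimal sketch of that check in plain Python. The function name `diagnose`, the 0.95 threshold, and the assumption that predictions and labels are available as lists of 0/1 integers are all mine, not part of imdb_scripts.

```python
from collections import Counter


def diagnose(preds, labels, collapse_threshold=0.95):
    """Report accuracy and whether predictions collapsed onto one class.

    preds, labels: equal-length sequences of class ids (e.g. 0 or 1).
    collapse_threshold: hypothetical cutoff; if one class makes up more
    than this share of the predictions, the run is flagged as collapsed.
    """
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and equal length")

    # Plain accuracy: share of positions where prediction matches label.
    accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)

    # How often each class was predicted.
    pred_counts = Counter(preds)
    majority_share = max(pred_counts.values()) / len(preds)

    return {
        "accuracy": accuracy,
        "pred_counts": dict(pred_counts),
        "collapsed": majority_share > collapse_threshold,
    }


if __name__ == "__main__":
    # Toy example: balanced labels, model always predicts class 1.
    labels = [0, 1] * 50
    preds = [1] * 100
    print(diagnose(preds, labels))
```

If `collapsed` comes back true, the loaded weights are probably being discarded or the learning rate is blowing them up. If the predictions are mixed but still at 50%, the labels and inputs may be misaligned instead.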

I ran with the following parameters:

Freeze only:
dir_path data/imdb; cuda_id 0; lm_id pretrain_wt103; clas_id freeze_wt103; bs 64; cl 50; backwards False; dropmult 1.0; unfreeze True; startat 0; bpe False; use_clr False; use_regular_schedule True; use_discriminative False; last False; chain_thaw False; from_scratch False; train_file_id ''

Full model only:
dir_path data/imdb; cuda_id 0; lm_id pretrain_wt103; clas_id full_wt103; bs 64; cl 50; backwards False; dropmult 1.0; unfreeze True; startat 2; bpe False; use_clr False; use_regular_schedule True; use_discriminative False; last False; chain_thaw False; from_scratch False; train_file_id ''

From scratch:
dir_path data/imdb; cuda_id 0; lm_id ''; clas_id from_scratch; bs 64; cl 50; backwards False; dropmult 1.0; unfreeze True; startat 0; bpe False; use_clr False; use_regular_schedule True; use_discriminative False; last False; chain_thaw False; from_scratch True; train_file_id ''
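To make it easier to see what actually differs between the three runs, here is a small sketch that writes the flags above as Python dicts and prints only the ones that change. The values are copied from the lists above. The dict layout and the helper name `differing_flags` are my own, and I am reading the from-scratch run's `startat0` as `startat 0`.

```python
# Flags shared by the runs, copied from the parameter lists above.
BASE = {
    "dir_path": "data/imdb", "cuda_id": 0, "lm_id": "pretrain_wt103",
    "bs": 64, "cl": 50, "backwards": False, "dropmult": 1.0,
    "unfreeze": True, "bpe": False, "use_clr": False,
    "use_regular_schedule": True, "use_discriminative": False,
    "last": False, "chain_thaw": False, "from_scratch": False,
    "train_file_id": "",
}

# Per-run overrides on top of the shared flags.
RUNS = {
    "freeze_only": {**BASE, "clas_id": "freeze_wt103", "startat": 0},
    "full_model": {**BASE, "clas_id": "full_wt103", "startat": 2},
    # Assumption: "startat0" in the post means startat 0.
    "from_scratch": {**BASE, "lm_id": "", "clas_id": "from_scratch",
                     "startat": 0, "from_scratch": True},
}


def differing_flags(runs):
    """Return {flag: {run_name: value}} for flags that are not identical."""
    all_keys = sorted({key for cfg in runs.values() for key in cfg})
    diff = {}
    for key in all_keys:
        values = {name: cfg.get(key) for name, cfg in runs.items()}
        # Compare by repr so that 0 and False are not treated as equal.
        if len({repr(v) for v in values.values()}) > 1:
            diff[key] = values
    return diff


if __name__ == "__main__":
    for flag, values in differing_flags(RUNS).items():
        print(flag, values)
```

All three failing runs share `use_clr False`, `use_regular_schedule True`, and `use_discriminative False`, so the regular schedule path is the common factor rather than any of the flags that differ.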


(Victor Sun) #2

@sebastianruder