Pretrain & Finetune MLM with fastai 5: Multi-task Learning using fastai

I created a MultiTaskLearner that:

  • Doesn’t need a single dls to handle data loading for all tasks

  • Doesn’t need the model to be hacked to output concatenated outputs for all tasks

  • Is general, of course, and not limited to training GLUE tasks

  • Prints results beautifully. :heart_eyes: :heart_eyes: :heart_eyes:

Finetune_GLUE_with_fastai.ipynb: shows how to use MultiTaskLearner, and defines MultiTaskLearner and the other classes.

Notes on the mechanism:

  1. Training cycles through the tasks.

  2. If your multi-task setup does more than just switching the model’s output layer:
    Just create your own nn.Module subclass MultiTaskModel and define switch and __len__ methods, so that after model.switch(i), model(x) uses an internal model that can solve task i. It’s essentially a bundle of models, so the implementation should be super easy.
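Item 1’s “cycle” presumably means the learner round-robins over the tasks, drawing one batch from the next task’s dataloader at each step. A minimal pure-Python sketch of that idea (the function name and restart-on-exhaustion behavior are my illustration, not the actual MultiTaskLearner internals):

```python
from itertools import cycle, islice

def round_robin_steps(task_loaders, n_steps):
    """Yield (task_index, batch) pairs, cycling through the tasks.

    An exhausted loader is restarted, so shorter tasks repeat while
    longer ones keep advancing.
    """
    iters = [iter(dl) for dl in task_loaders]
    order = cycle(range(len(task_loaders)))
    for i in islice(order, n_steps):
        try:
            batch = next(iters[i])
        except StopIteration:
            iters[i] = iter(task_loaders[i])  # restart exhausted loader
            batch = next(iters[i])
        yield i, batch

# Two toy "dataloaders" of different lengths:
steps = list(round_robin_steps([[1, 2], ['a', 'b', 'c']], 5))
# → [(0, 1), (1, 'a'), (0, 2), (1, 'b'), (0, 1)]
```

In a real training loop, each yielded task index would be paired with a `model.switch(i)` call before the forward pass.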
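The MultiTaskModel pattern from item 2 can be sketched in a few lines of PyTorch. This is a minimal illustration of the described interface (switch and __len__), not the actual implementation from the notebook:

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """A bundle of per-task sub-models; forward() delegates to one of them."""
    def __init__(self, models):
        super().__init__()
        self.models = nn.ModuleList(models)  # register all sub-models
        self.current = 0

    def switch(self, i):
        # After model.switch(i), model(x) uses the internal model for task i
        self.current = i

    def __len__(self):
        return len(self.models)

    def forward(self, x):
        return self.models[self.current](x)

# Toy usage: two "tasks" with different output sizes
model = MultiTaskModel([nn.Linear(4, 2), nn.Linear(4, 3)])
model.switch(1)
out = model(torch.zeros(1, 4))  # shape (1, 3), from the task-1 sub-model
```

Because the sub-models live in an nn.ModuleList, all their parameters are visible to the optimizer regardless of which task is currently selected.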

Notes on the posted results:

  • I know I should post a result from multi-task training all the GLUE tasks, but Colab ran out of memory, and using VS Code connected to our lab server can’t print the pretty training result sheet. Please tell me how to solve this dilemma. :sob:

  • Actually, ELECTRA didn’t multi-task train the GLUE tasks, which I learned only after I created MultiTaskLearner. :joy: :joy: :joy:

  • The result is worse than single-task learning; this may be because the task weights need adjusting, or because GLUE isn’t suited to multi-task learning.

“Pretrain MLM and finetune on GLUE with fastai”

Previous posts:

  1. MaskedLM callback and ELECTRA callback

  2. TextDataLoader - as fast or faster, but also with sliding window, caching, and a progress bar

  3. Novel HuggingFace/nlp integration: train on hf/nlp datasets and show_batch them

  4. Warm up & linearly decaying lr schedule + discriminative lr

Also, follow me on Twitter (Richard Wang) for updates on this series.

(I am going to reproduce the ELECTRA-small results in Table 8 of the paper, from scratch!!)