I created MultiTaskLearner, which:
- Doesn't need a single dls for all tasks' dataloading
- Doesn't need you to hack the model to output concatenated outputs for all tasks
- Is, of course, general and not limited to training GLUE tasks
- Prints results beautifully
Finetune_GLUE_with_fastai.ipynb: shows how to use MultiTaskLearner
mutlit_task.py: defines MultiTaskLearner and other classes
Notes on the mechanism:
- Training cycles through the tasks.
- If you don't just switch the output layer of the model for your multi-task learning, just create your own nn.Module class MultiTaskModel and define switch and __len__ methods, so that after model.switch(i), model(x) will use an internal model that can solve task i. It's kind of like a bunch of models; the implementation should be super easy (see the sketch below).
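A minimal sketch of that idea (the shared encoder / per-task heads split and all names besides switch, __len__, and model.switch(i) are illustrative assumptions, not the code in mutlit_task.py):

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    "A bag of per-task models behind one interface: switch(i) picks what model(x) runs."
    def __init__(self, encoder, heads):
        super().__init__()
        self.encoder = encoder             # shared body (illustrative; could be a transformer)
        self.heads = nn.ModuleList(heads)  # one output module per task
        self.cur = 0

    def switch(self, i):
        "Select task i; subsequent forward passes use its head."
        self.cur = i
        return self

    def __len__(self):
        "Number of tasks this model can solve."
        return len(self.heads)

    def forward(self, x):
        return self.heads[self.cur](self.encoder(x))

# Toy usage: one shared body, two task-specific heads.
model = MultiTaskModel(nn.Linear(16, 32), [nn.Linear(32, 2), nn.Linear(32, 3)])
x = torch.randn(4, 16)
model.switch(0); print(model(x).shape)  # torch.Size([4, 2]) -- task 0
model.switch(1); print(model(x).shape)  # torch.Size([4, 3]) -- task 1
```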
Notes on the pasted results:
- I know I should post a result of multi-task training on all GLUE tasks, but Colab ran out of memory, and VS Code connected to our lab server can't print the pretty training result sheet. Please tell me how to solve this dilemma.
- Actually, ELECTRA didn't multi-task train the GLUE tasks, which I learned only after I created MultiTaskLearner.
- Multi-task training is worse than single-task learning here; that may be because the task weights need adjusting, or because GLUE is not suited to multi-task learning (see the generic sketch below).
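For context on "adjusting task weights": a common generic recipe (not code from this repo; the weights and loss functions below are made-up placeholders) is to scale each task's loss before combining them, so that no single task dominates the gradients:

```python
import torch
import torch.nn as nn

# Generic weighted multi-task loss: scale each task's loss before summing.
# Weights and loss functions here are arbitrary placeholders.
task_weights = [1.0, 0.5, 2.0]
loss_funcs = [nn.CrossEntropyLoss(), nn.CrossEntropyLoss(), nn.MSELoss()]

def multi_task_loss(outputs, targets):
    "outputs/targets hold one entry per task; returns the weighted sum of per-task losses."
    return sum(w * f(o, t) for w, f, o, t in zip(task_weights, loss_funcs, outputs, targets))

# Toy check with random predictions and targets for 3 tasks.
outputs = [torch.randn(8, 2), torch.randn(8, 3), torch.randn(8, 1)]
targets = [torch.randint(0, 2, (8,)), torch.randint(0, 3, (8,)), torch.randn(8, 1)]
print(multi_task_loss(outputs, targets))
```

With a learner that cycles through one task per batch, the analogue would simply be multiplying the current task's loss by its weight.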
Previous posts in the "Pretrain MLM and finetune on GLUE with fastai" series:
- TextDataLoader: as fast as or faster, but also with a sliding window, caching, and a progress bar
- Novel Huggingface/nlp integration: train on and show_batch hf/nlp datasets
Also, follow my Twitter (Richard Wang) for updates on this series.
(I am going to reproduce the ELECTRA-small results in Table 8 of the paper, from scratch!!)