How to get RNNCore output to a callback?

Hello,

I’d like to try out Hebbian Softmax training (https://arxiv.org/abs/1803.10049) which improves LM results, especially on infrequent words.
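For context, here is a rough sketch of the core update as I understand it from the paper. It is simplified (the paper also smooths the stored activations and anneals the mixing rate more carefully), and hebbian_update, lambda_min, and the counter scheme here are my own naming and choices, not from the paper or from fastai:

```python
import torch

def hebbian_update(decoder_weight, raw_hidden, targets, counts, lambda_min=0.02):
    "Nudge each target word's softmax weights toward the activation that preceded it."
    # decoder_weight: (vocab, n_hid) softmax weight matrix, updated in place
    # raw_hidden:     (n_tokens, n_hid) RNN outputs before the decoder
    # targets:        (n_tokens,) target word ids
    # counts:         list of per-word occurrence counters, length vocab
    with torch.no_grad():
        for c in targets.unique().tolist():
            mask = targets == c
            counts[c] += int(mask.sum())
            # big steps for rare words, vanishing steps for frequent ones
            lam = max(1.0 / counts[c], lambda_min)
            h_bar = raw_hidden[mask].mean(dim=0)
            decoder_weight[c] = lam * h_bar + (1 - lam) * decoder_weight[c]
```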

The algorithm would fit naturally as an on_loss_begin callback because it requires the outputs of the RNN before the LM decoder module. Those are returned by the LM (outputs in https://github.com/fastai/fastai/blob/master/fastai/text/models.py#L150), but the RNNTrainer callback removes them (https://github.com/fastai/fastai/blob/master/fastai/callbacks/rnn.py#L17), and that callback is hardcoded to go first in the list of callbacks.

There are several non-clean ways to get those outputs to another callback, but I was curious whether people here had advice on the most natural way to do it, given the design of the library and any forthcoming changes.

Thank you.

Two options:

  1. give your callback an _order that is below the RNNTrainer's, so it runs first and still sees the full output (each callback has an _order attribute); see the sketch after this list
  2. rewrite it to suit your needs.
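A minimal sketch of option 1, assuming fastai v1's callback API; the class name RawOutputGrabber and the _order value of -100 are arbitrary choices:

```python
from fastai.basic_train import LearnerCallback

class RawOutputGrabber(LearnerCallback):
    "Run before RNNTrainer so on_loss_begin still sees the LM's full output tuple."
    _order = -100  # anything smaller than RNNTrainer's _order puts us first

    def on_loss_begin(self, last_output, **kwargs):
        # last_output is still (decoded, raw_outputs, outputs) here;
        # raw_outputs/outputs are per-layer lists, last element = final RNN layer
        self.raw_outputs, self.outputs = last_output[1], last_output[2]
        # return nothing so last_output passes through unchanged to RNNTrainer
```

Then register it with e.g. learn.callbacks.append(RawOutputGrabber(learn)) so it is picked up at fit time.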

Thanks, @sgugger. I’d missed the _order attribute and it does exactly what I need.

Do you have recipes for training LMs from scratch and/or finetuning your WT103 model? Are the hyperparameters in the imdb notebook the latest recommended ones for finetuning?

Based on other experiments, Hebbian Softmax should help the AWD-LSTM architecture when training from scratch, but I would like to verify this within the fastai library and also make it useful for finetuning.