Dear fastai community,
I recently wrote an article to show how to incorporate pre-trained Pytorch-transformers models into the fastai framework. The main issue that needed to be addressed is that Pytorch-transformers models yield a tuple containing not only the forward output we are familiar with, but also other states such as attention weights. This issue is described in more detail here.
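To illustrate the tuple problem, here is a minimal sketch (the toy module and wrapper class are my own illustrations, not the article's actual code): the model's forward returns `(logits, other_states)`, so one simple fix is a wrapper that keeps only the first element, which is the tensor fastai's loss functions expect.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pytorch-transformers model: its forward returns a
# tuple (logits, extra_states) instead of just the logits tensor.
class ToyTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 2)

    def forward(self, x):
        logits = self.linear(x)
        extra_states = x  # pretend these are e.g. attention weights
        return logits, extra_states

# Wrap the model so forward returns only the first tuple element.
class FirstOutputWrapper(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, *args, **kwargs):
        return self.model(*args, **kwargs)[0]

model = FirstOutputWrapper(ToyTransformer())
out = model(torch.randn(4, 8))
print(out.shape)  # torch.Size([4, 2])
```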
In the beginning of my article, I used a "dumb" approach: changing the fundamental training loop in the loss_batch function of the fastai.basic_train module. This solution works with no problems, but it is not the best way when fastai offers such a flexible callback system!
With a reminder from Jeremy, and help from @waydegilliam, I managed to write a callback that does the job, but with one small issue I cannot figure out: the metric, accuracy_thresh, is not changing across epochs. The training and validation losses change, but the metric stays the same.
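For readers who have not seen the callback approach, here is a sketch of the core idea (the class name is mine, and I use a plain class rather than subclassing fastai's Callback so the snippet runs standalone): in fastai v1, returning a dict from `on_loss_begin` replaces `last_output` before the loss function sees it.

```python
# Plain-Python mimic of a fastai v1 callback; in a real project this
# would subclass fastai's Callback (or LearnerCallback) instead.
class GetFirstOutputCallback:
    # fastai v1 passes the raw model output as `last_output`; returning
    # a dict lets the callback replace it with just the logits, so the
    # loss function never sees the extra tuple elements.
    def on_loss_begin(self, last_output, **kwargs):
        return {'last_output': last_output[0]}

# Simulate what fastai's training loop does with the returned dict.
cb = GetFirstOutputCallback()
raw_output = ('logits_tensor', 'attention_weights')  # stand-in tuple
state = cb.on_loss_begin(last_output=raw_output)
print(state['last_output'])  # logits_tensor
```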
I have put my results in this Google Colab for your convenience. I would appreciate it if anyone could shed some light on this issue and how to solve it.