Saliency maps for the MultiFiT model

Hi everyone,
I am working on creating saliency maps for MultiFiT classification models, with the ultimate goal of highlighting the parts of the text that are decisive for a prediction.

In order to make the saliency maps, on the one hand I need to obtain the activations of a designated layer of the model using a hook, and on the other hand I need the tokenized text and its tensor form. In this way I can relate the activations to the input text that caused those activations.
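To make the hook part concrete, here is a minimal sketch of what I mean, using a toy PyTorch model rather than MultiFiT itself. ToyClassifier, its layer names, the token list, and the ids are all made up for illustration, and using the norm of each hidden state as the per-token score is only one simple choice, not necessarily the right saliency measure.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy classifier; it only stands in for the real MultiFiT model.
class ToyClassifier(nn.Module):
    def __init__(self, vocab_size=10, emb_dim=8, hidden=16, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, ids):
        out, _ = self.rnn(self.emb(ids))
        return self.head(out[:, -1])  # classify from the last time step

model = ToyClassifier().eval()
captured = {}

def save_activations(module, inputs, output):
    # nn.LSTM returns (output, (h_n, c_n)); keep the per-token outputs.
    captured["acts"] = output[0].detach()

# Attach the hook to the layer whose activations we want to inspect.
handle = model.rnn.register_forward_hook(save_activations)

tokens = ["xxbos", "this", "film", "was", "great"]  # made-up tokenization
ids = torch.tensor([[1, 4, 5, 6, 7]])               # made-up vocabulary ids
with torch.no_grad():
    logits = model(ids)
handle.remove()  # detach the hook once the activations are captured

# One score per token: the L2 norm of that token's hidden state, scaled so the max is 1.
scores = captured["acts"][0].norm(dim=-1)
scores = scores / scores.max()
token_scores = list(zip(tokens, scores.tolist()))
for tok, s in token_scores:
    print(f"{tok:>6}  {s:.2f}")
```

The pairing in token_scores only makes sense if tokens and ids come from exactly the same preprocessing the model uses at prediction time, which is what I am missing for MultiFiT.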

The question is: how can I get the tokenized text and the corresponding input tensor that the MultiFiT model receives when making a prediction, given that those operations are done internally by MultiFiT?
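To illustrate the pair I am after, here is a toy sketch in plain Python. The whitespace tokenizer and the tiny vocabulary are hypothetical stand-ins; as far as I understand, MultiFiT uses its own subword tokenization and vocabulary, so this only shows the shape of the result I need, not how MultiFiT produces it.

```python
# Stand-in tokenizer: lowercases, splits on whitespace, and prepends a
# beginning-of-stream marker (the marker name is an assumption for this sketch).
def tokenize(text):
    return ["xxbos"] + text.lower().split()

# Stand-in vocabulary: index-to-string list and its inverse mapping.
itos = ["xxunk", "xxpad", "xxbos", "the", "movie", "was", "great"]
stoi = {tok: i for i, tok in enumerate(itos)}

def numericalize(tokens):
    # Unknown tokens fall back to the id of "xxunk".
    return [stoi.get(tok, stoi["xxunk"]) for tok in tokens]

tokens = tokenize("The movie was really great")
ids = numericalize(tokens)

# Each token lines up with exactly one id, so per-position activations
# can later be mapped back to the text.
print(list(zip(tokens, ids)))
```

With both lists in hand, position i of the activations can be attributed to tokens[i].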

Thanks

Hi Javier and welcome to our community.

Have you seen text.interpret? It should work with MultiFiT.


Hi Marcin, first of all, thanks for your reply.
I’ve had a look at text.interpret, and it looks pretty interesting.
Unfortunately, I’ve been testing it with my trained MultiFiT model and have not managed to get it to work so far.
When I try: interp = TextClassificationInterpretation.from_learner(my_learner) it gives me the following error: “IndexError: list index out of range”.
However, I am using the MultiFiT model as described in this repository (original), and it works fairly well. Any idea what could be happening? Maybe the text interpreter doesn’t work well with MultiFiT?

By default, from_learner computes predictions on the validation set so that it can use them in show_top_losses. Do you have any items in your validation set?
If you’re using an exported learner with an empty databunch and you care only about the intrinsic attention, you can try something like:

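import torch
from fastai.text import *  # fastai v1; assumed to export TextClassificationInterpretation and DatasetType
dummy = torch.Tensor([[1]])  # placeholder predictions; append .cuda() if the model is on the GPU
interp = TextClassificationInterpretation(multifit_learner, dummy, None, None, DatasetType.Test)
interp.show_intrinsic_attention("あなたの例")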