I am training a text classifier on reviews. Once the learner is trained, is there a way to extract the words that most influence the prediction of a certain class? In vision, we can see which part of the image the model is looking at to make its prediction, and I am looking for something similar in NLP. I am pretty new to NLP, so I am not sure whether this procedure already exists. I tried looking for it but did not find anything in the docs.
You could perhaps try the text module's classification interpretation. I have not played around with it myself yet.
I tried the interpretation and was able to get the intrinsic attention. However, I have to provide a review, and it gives me a score for each word in that review. I was looking for a way to get a list of all the words that score highly, without providing individual reviews.
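One possible workaround, as a rough sketch only: run the per-review scoring over a whole set of reviews and average the score each word receives. The code below is not a library API. `score_fn` is a hypothetical wrapper that you would write around the interpretation call so that it returns `(token, score)` pairs for one review. `toy_score_fn`, `top_words`, and the `min_count` and `k` parameters are likewise made up for illustration.

```python
from collections import defaultdict
from typing import Callable, Iterable, List, Tuple

# Hypothetical signature: takes one review, returns (token, score) pairs.
ScoreFn = Callable[[str], List[Tuple[str, float]]]


def top_words(reviews: Iterable[str], score_fn: ScoreFn,
              min_count: int = 2, k: int = 10) -> List[Tuple[str, float]]:
    """Average each token's score over all reviews and return the k highest."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for review in reviews:
        for token, score in score_fn(review):
            totals[token] += score
            counts[token] += 1
    # Ignore tokens seen fewer than min_count times, so a single
    # high-scoring occurrence does not dominate the ranking.
    means = {t: totals[t] / counts[t] for t in totals if counts[t] >= min_count}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)[:k]


def toy_score_fn(review: str) -> List[Tuple[str, float]]:
    """Stand-in scorer (an assumption): replace with the real per-review scores."""
    strong = {"great", "terrible"}
    return [(w, 1.0 if w in strong else 0.1) for w in review.lower().split()]


reviews = ["great movie", "terrible movie", "great plot"]
ranking = top_words(reviews, toy_score_fn, min_count=2, k=2)
print(ranking)
```

Note that this ranks words by average attention over the reviews you feed in, so it still depends on the data. To get words per class, you could restrict `reviews` to those the model assigns to that class.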