Captum model interpretability library

Wow, the field of machine learning interpretability is fascinating; I’m quite overwhelmed by all the information, to be honest.

I’ve read these articles from distill.pub:
The Building Blocks of Interpretability
Feature Visualization
Visualizing the Impact of Feature Attribution Baselines

I also went through all the tutorials on the Captum GitHub, and the library seems quite flexible and implements many of the published “Feature Attribution” algorithms (i.e., which parts of the input to a model/layer cause the biggest change in its output).
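As a rough illustration of how those attribution algorithms are exposed, here’s a minimal sketch using Integrated Gradients on a pretrained ResNet (the model choice and the random input tensor are just placeholders for the example):

```python
import torch
from torchvision import models
from captum.attr import IntegratedGradients

model = models.resnet18(pretrained=True).eval()

# placeholder input: a single normalized image tensor
x = torch.rand(1, 3, 224, 224)

ig = IntegratedGradients(model)
target = model(x).argmax(dim=1)  # attribute the predicted class

# baseline defaults to an all-zero tensor if not specified
attributions, delta = ig.attribute(x, target=target, return_convergence_delta=True)
print(attributions.shape)  # attributions have the same shape as the input
```

Most of the other attribution classes (Saliency, DeepLift, Occlusion, …) follow the same `attribute()` pattern, which is what makes the library feel so flexible.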

Then there is the other important technique mentioned in these articles, “Feature Visualization”, i.e., finding the input that maximizes/minimizes the activation of a specific layer/neuron, etc. (DeepDream is an example of this.)
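Conceptually this is just gradient ascent on the input rather than on the weights. A bare-bones sketch (layer choice, channel index, and hyperparameters are purely illustrative, and real implementations add regularization and image parameterization tricks):

```python
import torch
from torchvision import models

model = models.resnet18(pretrained=True).eval()

# capture the activations of an intermediate layer with a forward hook
activation = {}
model.layer3.register_forward_hook(lambda m, i, o: activation.update(feat=o))

# start from a random image and optimize the pixels themselves
img = torch.rand(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)

for _ in range(100):
    opt.zero_grad()
    model(img)
    # maximize the mean activation of channel 0 of the hooked layer
    loss = -activation["feat"][:, 0].mean()
    loss.backward()
    opt.step()
```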

I think the challenge here is not writing a callback that exposes these functionalities, but rather selecting sensible and robust defaults/algorithms, as there are so many choices.

I’ll continue reading up on the literature for a bit, and then I think I’ll start experimenting on Imagenette. If anyone comes across interesting blog posts/papers/libraries, please share them here.

There’s also the TensorFlow Lucid library/notebooks, which contain the code for the distill articles and more techniques for model visualization and interpretation, as well as a lot more recommended reading, which I’ll definitely check out tomorrow.
