Hi all! I’ve been trying to figure out how to analyse feature importance for tabular data… There are a few relevant bits of code for fastai v1, but I’ve struggled to adapt them to the new code (I’m still pretty new at this!).
I’ve had a dig through the documentation, and it looks like I might have to re-write this from scratch once I’ve figured it out (as you tend to iterate over your features and compute the information loss)… or is there an easier way of doing this that I’ve missed?
So, there’s a key difference in how permutation importance is done. What we do is permute the entire dataset and operate on that: shuffle each feature’s rows around and see how the results vary. “Technically” you could do this by having one row with a known test input and another that is completely noise, and run it on that. I’m not entirely sure how well that would work, but it’s one option. You could even still use the same function (I think), just with a df of two rows.
Edit: Thinking on this more, a change to that function would likely be needed, since we’d only have one labelled row. Specifically, we’d want to see the predicted answer for that row and its variance. But you can use that function as a baseline for writing your own.
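For reference, the full-dataset version described above can be sketched in a few lines. This is a generic, framework-agnostic sketch (not fastai’s implementation): `model` is assumed to expose a `predict(df)` method, and `metric` is any higher-is-better scoring function — both are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def permutation_importance(model, df, target_col, metric, n_rounds=3, seed=42):
    """Shuffle each feature column in turn and measure how much the metric drops.

    Hypothetical sketch: `model.predict(X)` and `metric(y_true, y_pred)` are
    assumed interfaces, not fastai's actual API.
    """
    y = df[target_col]
    X = df.drop(columns=[target_col])
    baseline = metric(y, model.predict(X))  # score on the unshuffled data
    rng = np.random.default_rng(seed)
    importances = {}
    for col in X.columns:
        scores = []
        for _ in range(n_rounds):
            shuffled = X.copy()
            # Permute just this one column, breaking its link to the target
            shuffled[col] = rng.permutation(shuffled[col].values)
            scores.append(metric(y, model.predict(shuffled)))
        # Importance = how much the score degrades when this column is noise
        importances[col] = baseline - float(np.mean(scores))
    return importances
```

A column the model ignores should come out near zero, while a column the model relies on should show a clear drop when shuffled.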
Otherwise there are models better suited for this, specifically ones that use attention. If you take a look at my TabNet notebook here, you can see that since attention is baked into the model, we can see individual feature importance for each row (and this is true for the Transformer-based tabular architectures as well):
Key Differences Between the Two:
Permutation importance needs labelled data so it can measure how much the results degrade