Hi, I'm having trouble normalizing feature importances.
Here is the difference in RMSE between the “shuffled” column and the “original” column:
After normalizing it from 0 to 1, I get:
fi (bottom) is the feature importance from sklearn, and
fi2 (top) is the one I computed myself. They're somewhat similar, but the numbers are a bit off.
Any pointers on what I could be doing wrong? My normalization right now is just dividing each value by the sum of all the feature importances.
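Roughly, what I'm doing is this (a sketch with a toy dataset standing in for my real one; `rf`, `X_valid`, `y_valid`, etc. are just placeholder names):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Toy data standing in for my real dataset
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
base_rmse = mean_squared_error(y_valid, rf.predict(X_valid)) ** 0.5

rng = np.random.default_rng(0)
fi2 = np.zeros(X_valid.shape[1])
for j in range(X_valid.shape[1]):
    X_shuf = X_valid.copy()
    # break column j's relationship to y by permuting it
    X_shuf[:, j] = rng.permutation(X_shuf[:, j])
    shuf_rmse = mean_squared_error(y_valid, rf.predict(X_shuf)) ** 0.5
    # RMSE increase when feature j is shuffled
    fi2[j] = shuf_rmse - base_rmse

fi2 = fi2 / fi2.sum()            # my normalization: divide by the sum
print(fi2)                       # my permutation importances, summing to 1
print(rf.feature_importances_)   # sklearn's built-in importances
```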
It depends on the feature importance implementation in scikit-learn: which metric it uses and how it is calculated. For random forests, `feature_importances_` is the impurity-based importance computed during training, not a permutation importance, so the two won't match exactly. In general, I would focus on the metric of interest (business or Kaggle) and calculate validation-set differences for each shuffled feature. There is no perfect solution here, but that is also what makes this approach flexible and powerful: it is metric- and model-agnostic.
So, as long as you are confident about what you measure, I wouldn't worry much about getting it exactly "right".
For an RF classifier, for example:
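Something like this (a sketch using a toy dataset and accuracy as the metric; swap in your own model, validation split, and metric of interest):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
base_acc = accuracy_score(y_valid, clf.predict(X_valid))

rng = np.random.default_rng(0)
importances = np.zeros(X_valid.shape[1])
for j in range(X_valid.shape[1]):
    X_shuf = X_valid.copy()
    X_shuf[:, j] = rng.permutation(X_shuf[:, j])
    # drop in validation accuracy when feature j is shuffled = its importance
    importances[j] = base_acc - accuracy_score(y_valid, clf.predict(X_shuf))

print(importances)
```

Note that scikit-learn also ships this as `sklearn.inspection.permutation_importance`, which repeats the shuffle several times and reports the mean and std per feature.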