Not really; in fact, the SHAP paper seems to put a lot of effort into optimization, since the brute-force computation is O(2^n). In my trial with 1000 sample points, your permutation importance takes 8 seconds, while the SHAP value computation takes almost 4 minutes.
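For reference, here is a minimal sketch of why permutation importance is cheap: each feature only needs a handful of shuffle-and-rescore passes, not an exponential number of coalitions. Everything here (the toy data, the stand-in `predict` function, the parameter names) is hypothetical, just to illustrate the mechanics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends strongly on feature 0, weakly on feature 1, not at all on feature 2.
X = rng.normal(size=(1000, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]

# A stand-in for any fitted model's predict function.
def predict(X):
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Importance of column j = average increase in error when column j is shuffled."""
    rng = np.random.default_rng(seed)
    base = mse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            importances[j] += mse(y, predict(Xp)) - base
        importances[j] /= n_repeats
    return importances

imp = permutation_importance(predict, X, y)
```

The total cost is only `n_features * n_repeats` extra predictions over the dataset, which is why it finishes in seconds while an exact Shapley computation has to consider exponentially many feature subsets.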

I think it is really complicated; that's why I'd rather experiment with it, as I've figured out I will never understand anything by reading the math myself…

I think you are right. Now I get what the author is saying in the article https://towardsdatascience.com/interpretable-machine-learning-with-xgboost-9ec80d148d27: the problem with permutation importance is that you cannot sum up the individual contributions. (Although I found it may not be a big problem? Is there some scenario where we really want to sum them up?)
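One scenario where summing matters is local explanation: SHAP values for a single prediction add up exactly to that prediction minus the baseline, which permutation importance can't do. Below is a brute-force Shapley sketch (the O(2^n) version the paper optimizes away); marginalizing absent features with a fixed background vector is my simplifying assumption, not the exact procedure from the SHAP paper:

```python
from itertools import combinations
import math
import numpy as np

def shapley_values(f, x, background):
    """Exact Shapley values for one prediction, enumerating all 2^n coalitions.

    Absent features are replaced by the corresponding background values
    (a simplifying assumption for this sketch).
    """
    n = len(x)
    phi = np.zeros(n)

    def value(S):
        z = background.copy()
        z[list(S)] = x[list(S)]
        return f(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Standard Shapley weight for a coalition of size k.
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

# Hypothetical linear model and point, just to show the additivity property.
f = lambda z: 3.0 * z[0] + 0.5 * z[1]
x = np.array([1.0, 2.0, 3.0])
background = np.zeros(3)
phi = shapley_values(f, x, background)
# Additivity: phi sums to f(x) - f(background).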

The reason I want to normalize them to sum to 1 is to compare the importances on the same scale. Maybe the relative importance (the order) matters more; the two methods seem to suggest different things in my trial.
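The rescaling I have in mind is just dividing by the total, which preserves the ranking while putting both methods on a 0-to-1 scale. The two score vectors below are made-up numbers for illustration:

```python
import numpy as np

def normalize(v):
    """Rescale non-negative scores so they sum to 1."""
    v = np.asarray(v, dtype=float)
    return v / v.sum()

# Hypothetical raw importances from two methods, on very different scales.
perm_imp = np.array([8.2, 1.1, 0.3])
shap_imp = np.array([0.61, 0.25, 0.05])

p = normalize(perm_imp)
s = normalize(shap_imp)
```

Since normalization is a positive scaling, `np.argsort` of the normalized and raw vectors is identical, so any disagreement in ranking between the two methods is real and not an artifact of scale.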