Simulation after prediction

jchaykow · October 22, 2018, 11:04pm

In lesson 6, I believe, Jeremy talks about using simulations to interpret the predictive models that you have built in the data analysis pipeline. What types of simulation techniques are used in this situation? That is, once you have built all of your predictive models and you want to get to the business levers and make a business decision, what types of simulation procedures might you use to put those models to work?

jchaykow · October 26, 2018, 6:25pm

Just wondering if anyone has any insight into this? I’m thinking maybe each column from the training data would have to be simulated by a domain expert in some fashion to obtain the overall environment that the data scientist wants to predict over? But how might that simulation occur? What methods would she use?

SimonWeiss · November 5, 2018, 8:27pm

You could use pretty simple business logic to start with. For example, if you have a churn-like model that predicts, for each user, the probability of that user quitting your service, these probabilities alone are probably not enough for business decisions (levers) such as “should I give this customer a 50€ coupon for her next order”. Maybe the user with the highest quitting probability is only expected to generate 5€ of revenue over the next year, so you wouldn’t want to give that user a 50€ coupon. Also, which lever should you choose: call the customer, offer a coupon, invite them to an event, etc.? All of these have different success rates at preventing churn, and each lever has a different cost.

A simple “simulation” for this business problem could be to combine the estimated quitting probability with the estimated value of that customer (expected revenue), and maybe the success rate and cost associated with each lever, then simulate each combination to find the best measures (levers) for each customer.
In the end you might find that you shouldn’t blindly focus on the customers with the highest probability of quitting (the output of your predictive model).
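
To make this concrete, here is a minimal sketch in Python of that kind of calculation. Everything in it is an assumption for illustration: the customer and lever names, the success rates, the costs, and the scoring rule itself (expected gain = quitting probability × lever success rate × expected revenue − lever cost, with “do nothing” scoring 0). It also assumes a lever only matters for customers who would otherwise quit, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str
    churn_prob: float        # output of the predictive model
    expected_revenue: float  # estimated revenue over the next year, in EUR


@dataclass(frozen=True)
class Lever:
    name: str
    success_rate: float  # assumed chance the lever prevents a churn
    cost: float          # cost of applying the lever, in EUR


def expected_gain(customer: Customer, lever: Lever) -> float:
    """Expected revenue saved by the lever, minus what the lever costs."""
    saved = customer.churn_prob * lever.success_rate * customer.expected_revenue
    return saved - lever.cost


def best_lever(customer: Customer, levers: list[Lever]) -> tuple[str, float]:
    """Try every lever and keep the best one; fall back to doing nothing."""
    best_name, best_gain = "do nothing", 0.0
    for lever in levers:
        gain = expected_gain(customer, lever)
        if gain > best_gain:
            best_name, best_gain = lever.name, gain
    return best_name, best_gain


# Made-up numbers, loosely following the 5 EUR / 50 EUR example above.
customers = [
    Customer("A", churn_prob=0.9, expected_revenue=5.0),
    Customer("B", churn_prob=0.4, expected_revenue=600.0),
]
levers = [
    Lever("coupon_50", success_rate=0.5, cost=50.0),
    Lever("phone_call", success_rate=0.2, cost=10.0),
]

if __name__ == "__main__":
    for customer in customers:
        name, gain = best_lever(customer, levers)
        print(f"{customer.name}: {name} (expected gain {gain:.2f} EUR)")
```

With these made-up numbers, customer A has the highest quitting probability but no lever pays off, while the coupon is worthwhile for the lower-risk but higher-value customer B. In practice the success rates would themselves be uncertain estimates, so you might rerun the comparison over a range of plausible values.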