Quantifying the idea of a proxy

@rachel You mentioned in Lesson 1 (at 1:04:00) that the data for a model is often a proxy for the value we are actually interested in, and that the distance between the true value of interest and the data is large. Is there some way to measure this distance and figure out, essentially, how good the data is at predicting our true value? Also, following the example of predictive policing, where arrests are a proxy for crime, are there recommendations, when building and testing a model, for identifying the various ways the model may itself be used as a proxy for things we do not want? E.g., facial characteristics used to detect criminality.

While you wait for Rachel’s reply, this might be what she meant.