Correlation is not Causality’s Consolation Prize
By Melissa Kovacs, PhD, GStat
Most of us have heard, and sometimes even trot out, the old saying, “correlation is not causality.” This is true. As a statistician, I am grateful this is a rather well-known maxim.
Sometimes these words are delivered with a bit of condescension towards our friend, correlation. However, correlation is an extremely important and valuable statistic.
Searching for correlations is a go-to tool in big data mining. Checking every possible correlation in a set of factors can surface surprising relationships and signal something that’s been overlooked.
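To make that search concrete, here is a minimal sketch in Python. It assumes the factors are stored as equal-length lists of numbers; the data, the column names, and the 0.7 cutoff are all made up for illustration. It computes the Pearson correlation for every pair of factors and keeps only the strong ones.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_correlations(columns, threshold=0.7):
    """Check every pair of factors; keep pairs with |r| >= threshold."""
    hits = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            hits.append((a, b, r))
    # Strongest relationships first
    return sorted(hits, key=lambda hit: abs(hit[2]), reverse=True)


# Hypothetical week of daily sales, Monday through Sunday (invented numbers)
sales = {
    "is_weekend": [0, 0, 0, 0, 0, 1, 1],
    "cinnamon_gum": [12, 11, 13, 12, 14, 25, 27],
    "bottled_water": [40, 38, 41, 39, 42, 40, 39],
}

for a, b, r in strongest_correlations(sales):
    print(f"{a} vs {b}: r = {r:.2f}")
```

With this invented data, only the weekend and cinnamon gum pair clears the cutoff. One caution: the more pairs you check, the more likely some will look strong purely by chance, so treat each hit as a prompt to investigate rather than a finding.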
Correlations can be the first shiny light on a useful pattern, saying, hey! – turns out that cinnamon chewing gum purchases are higher on the weekends! Huh, maybe we should explore that…
Often, impactful business decisions can occur without sophisticated statistical techniques. Meaningful business insight happens with descriptive and diagnostic statistics.
This is especially true over time – the emergence of a correlation in one’s data should be a signal to pay attention, but the emergence of a pattern of correlations over time in the data can signal a true relationship or trend.
When kale sales to women shoppers are up in a quarter, it might be a fluke. But when kale sales to women shoppers increase quarter over quarter for the last 3 years, it’s a trend. Kale yeah!
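The fluke-versus-trend distinction can be sketched in a few lines of Python. This is one simple way to check it, not a formal trend test: the function name, the sales figures, and the minimum of four periods are all assumptions chosen for illustration.

```python
def is_sustained_increase(values, min_periods=4):
    """Return True if each value is higher than the one before it.

    Requires at least `min_periods` observations, so one or two good
    quarters are never reported as a trend.
    """
    if len(values) < min_periods:
        return False
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Hypothetical quarterly kale sales over 3 years (12 quarters, invented numbers)
kale_sales = [100, 104, 109, 115, 118, 124, 131, 135, 142, 150, 153, 161]

# Hypothetical series with a single standout quarter at the end
one_good_quarter = [100, 98, 101, 97, 120]

print(is_sustained_increase(kale_sales))        # True
print(is_sustained_increase(one_good_quarter))  # False
```

A strict rule like this rejects a series after a single dip, which is often too harsh for noisy real-world data. In practice you might allow an occasional down quarter or look at the correlation between sales and time instead.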
Predictive and prescriptive statistics that result in causal conclusions require data that is both comprehensive and valid, two really high bars to meet. Such data is frequently unavailable. How often can we know and control for every reason a consumer purchased their new electric car? What’s much more likely is that good data exists for one factor that influenced that electric car purchase.
Sure, causality is the cool kid – with causality, we are really sure that one factor caused change in another factor. Such certainty is dreamy – providing surety to go on and prescribe how to make change happen.
But correlation, realistically and reliably available, can be used with great effect to understand the past and to provide meaningful business insights to drive future decisions. Correlation is cool, too!
Dr. Melissa Kovacs lectures on quantitative analytics and economics at Arizona State University. She is also the president of FirstEval, a top analytics services company.