Data Science

Simpson’s Paradox: The Reason Why Your KPIs Are Meaningless!


I have a story to tell…

I was sitting in yet another investor meeting at one of my previous companies, where we had to explain why our latest measures had decreased our average order value rather than increased it (it went down from 200€ to 180€), even though we had spent 100,000€ 😱 on developing new product features to increase the basket.

Nobody in the room understood that I was still happy about the development of the users’ basket value, even if the average dropped.

It’s because I knew something they didn’t. I knew about Simpson’s Paradox. I will tell you the outcome of this story at the end.

The simplest thing to do when you are faced with big data sets and trying to figure them out is to take an average, right? Well, it might be the simplest solution, but is it one that will give you the correct answer? 🤔

What Is Simpson’s Paradox?

Well, to start with, it’s nothing to do with Homer Simpson…

Simpson’s Paradox occurs when a trend that appears in every group of a dataset reverses or disappears once the groups are aggregated and the same calculation is made on the combined data.

Who Invented Simpson’s Paradox?

Simpson’s Paradox is named after Edward H. Simpson, a British statistician and former cryptanalyst at Bletchley Park, who described it in a paper called “The Interpretation of Interaction in Contingency Tables”, published in the Journal of the Royal Statistical Society in 1951. He was not the first to notice the effect, though: Udny Yule had already described it in 1903, using the hypothetical example of an ineffective anti-toxin. The name Simpson’s Paradox was introduced by Colin R. Blyth in 1972.

How Does It Work?

Say you’re trying to pick a restaurant to go to with your partner. Neither of you has much local knowledge, so you start looking at reviews, and find that one restaurant is recommended by a higher percentage of men and by a higher percentage of women. But then you see that the other restaurant on your shortlist has the higher rating when you look at all reviewers together. How does this work?

If numbers and data are factual, how can the same dataset be used to prove two opposing arguments? 🤔

The trouble is that a plain average ignores how many reviews each group contributed. If men and women tend to rate differently, and one restaurant was reviewed mostly by men while the other was reviewed mostly by women, that mix skews the combined averages.
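To make this concrete, here is a minimal Python sketch of the restaurant example. The restaurant names and all review counts are made up for illustration; they are not real data. Each entry holds how many reviewers recommended the restaurant and how many reviewed it in total.

```python
# Hypothetical review counts, invented for illustration:
# (number of recommendations, total number of reviews) per group.
reviews = {
    "Restaurant A": {"men": (40, 100), "women": (18, 20)},
    "Restaurant B": {"men": (6, 20), "women": (80, 100)},
}


def rate(recommended, total):
    """Share of reviewers in one group who recommend the restaurant."""
    return recommended / total


def overall_rate(groups):
    """Recommendation rate across all reviewers, ignoring the groups."""
    recommended = sum(r for r, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return recommended / total


for name, groups in reviews.items():
    per_group = ", ".join(
        f"{group}: {rate(*counts):.0%}" for group, counts in groups.items()
    )
    print(f"{name} -> {per_group}, overall: {overall_rate(groups):.1%}")
```

With these made-up numbers, Restaurant A wins among men (40% vs. 30%) and among women (90% vs. 80%), yet Restaurant B wins overall (71.7% vs. 48.3%), simply because most of B's reviews come from the group that rates more generously.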

Another form of Simpson’s Paradox that can catch you out: a correlation that points in one direction within each group can point in the opposite direction when the groups are bundled together and analyzed as one population.
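The correlation version can be sketched in a few lines of Python as well. The two groups below are tiny invented datasets, chosen only so that the effect is easy to see, and `pearson` is a small helper written for this example rather than a library function.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data: within each group, y falls as x rises.
group_1 = ([1, 2, 3], [3, 2, 1])
group_2 = ([6, 7, 8], [8, 7, 6])

# Bundle both groups together into one population.
pooled_x = group_1[0] + group_2[0]
pooled_y = group_1[1] + group_2[1]

r_group_1 = pearson(*group_1)
r_group_2 = pearson(*group_2)
r_pooled = pearson(pooled_x, pooled_y)

print(f"group 1: {r_group_1:+.2f}")
print(f"group 2: {r_group_2:+.2f}")
print(f"pooled:  {r_pooled:+.2f}")
```

Inside each group the correlation is strongly negative, but because the second group sits higher on both axes, the pooled correlation comes out positive.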

Back To My Story…

There is a huge gap between the average order value of existing customers and that of new customers (250€ for existing versus 130€ for new 😲).

Both baskets were increased by 20% by the new features, but during the period measured we acquired many more new customers than in the previous period. Because new customers have the smaller basket, their larger share pulled the overall average order value down, even though the average order value of each individual segment (new and existing) grew. So you can see that by knowing that extra bit of information, namely that many more new customers joined us in the period we were looking at than in the one before, we were able to make rational judgements based on the real-life facts 😏.
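The arithmetic behind my story can be reproduced with a short Python sketch. Two things here are assumptions of the sketch, not figures from the meeting: that 250€ and 130€ are the segment averages before the new features, and the order counts, which are hypothetical numbers picked so that the overall averages match the 200€ and 180€ mentioned at the start.

```python
def average_order_value(segments):
    """Overall AOV: total revenue divided by total number of orders."""
    revenue = sum(count * aov for count, aov in segments.values())
    order_count = sum(count for count, _ in segments.values())
    return revenue / order_count


UPLIFT = 1.20  # both segments grew by 20%

# (number of orders, average order value in €) per segment.
# Order counts are hypothetical; the 250€/130€ baseline is an assumption.
before = {"existing": (700, 250.0), "new": (500, 130.0)}
after = {
    "existing": (700, 250.0 * UPLIFT),   # same number of existing-customer orders
    "new": (3500, 130.0 * UPLIFT),       # many more new-customer orders
}

for label, segments in (("before", before), ("after", after)):
    detail = ", ".join(f"{name}: {aov:.0f}€" for name, (_, aov) in segments.items())
    print(f"{label}: {detail}, overall: {average_order_value(segments):.0f}€")
```

Under these assumptions both segments rise (250€ to 300€ and 130€ to 156€), while the overall average falls from 200€ to 180€, because the share of orders from existing customers drops from about 58% to about 17%.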

What Is The Solution?

What you need to do is consider the process that generated the data. We call this the causal model. How was the data generated, and what factors influence the results? It’s likely that there are factors that impact the results that we are not shown, and thus the data is not the whole picture. Maybe men tend to rate restaurants lower than women? Maybe more women visited the second restaurant? Data on its own is never enough. We need to ask how it was generated, what other influences are involved, and which results can be challenged.

Software can help! It is not enough to just look at the data. You need to ask questions of it. The art of data science is seeing beyond the data and using intuition and real-world knowledge to get to the heart of it. RetentionX can analyze millions of these segments and correlations through machine learning, and it can assess far more data points than humans can manually. Using this software enables you to work out what is really happening in your business and what the real drivers and levers of growth and change are, and helps you make the right decisions so your business can flourish.



Noah Xiao, “The Most Counter-intuitive Probability Problems”
Will Koehrsen, “Simpson’s Paradox: How to Prove Opposite Arguments with the Same Data”


Data Science Enthusiast, growing revenues with the power of data – Founder & CEO of RetentionX, the leading Software for Decision Intelligence.
