A well-known quote attributed to Andrew Lang reads: "The statistician uses statistics as a drunken man uses lamp posts, for support rather than illumination." A mathematician or a statistician will usually interpret the result of a statistical analysis with caution, but someone who is interested only in the result, and is less familiar with the mathematical background of the methods used, can easily jump to a wrong conclusion. The simplest example showing why prudence is needed when applying statistical results is Simpson's paradox (described by Edward H. Simpson in 1951).
Consider the following study. A new drug is being tested on a group of 800 people (400 men and 400 women) with a particular disease. The aim is to establish whether there is a link between taking the drug and recovery from the disease. In a standard scenario, half of the people (randomly selected) are given the drug and the other half are given a placebo. The results show that, of the 400 given the drug, 200 (50 %) recover from the disease; this compares favourably with just 160 of the 400 (40 %) given the placebo who recover.
So clearly we can conclude that the drug has a positive effect. Or can we? A more detailed look at the data leads to exactly the opposite conclusion. Specifically, the following table shows the results when broken down into female and male subjects.
| | Women, placebo | Women, drug | Men, placebo | Men, drug |
|---|---|---|---|---|
| Recovery rate | 30 % | 20 % | 70 % | 60 % |
Focusing first on the men, we find that 70 % of those taking the placebo recover, but only 60 % of those taking the drug recover. So, for men, the recovery rate is better without the drug. Similarly, among the women we find that 30 % of those taking the placebo recover, but only 20 % of those taking the drug recover. So, for women, the recovery rate is also better without the drug. We must therefore conclude that in every subcategory the drug is worse than the placebo.
The process of drilling down into the data this way (in this case by looking at men and women separately) is called stratification. Simpson's paradox is simply the observation that, for the same data, the stratified analysis can give the opposite result to the non-stratified one. Often, there is a causal explanation. In this case men are much more likely to recover naturally from this disease than women. Although as many subjects were given the drug as were given the placebo, and although the trial included equal numbers of men and women, the drug was not equally distributed between men and women. More men than women were given the drug. Because of the men's higher natural recovery rate, overall more people in the trial recovered when given the drug than when given the placebo.
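The reversal can be checked with a few lines of arithmetic. The sketch below is only an illustration. The text does not state the group sizes, so they are an assumption derived from its figures: with 400 men, 400 women and 400 subjects per treatment, the stated rates fit only if 300 men and 100 women received the drug (0.6 × 300 + 0.2 × 100 = 200 recoveries) and 100 men and 300 women received the placebo (0.7 × 100 + 0.3 × 300 = 160 recoveries). The names `counts` and `recovery_rate` are hypothetical.

```python
# (sex, treatment) -> (number treated, number recovered).
# Assumption: group sizes are derived from the rates and totals in the text,
# not quoted from it.
counts = {
    ("women", "placebo"): (300, 90),   # 30 % recover
    ("women", "drug"):    (100, 20),   # 20 % recover
    ("men",   "placebo"): (100, 70),   # 70 % recover
    ("men",   "drug"):    (300, 180),  # 60 % recover
}


def recovery_rate(treatment, sex=None):
    """Recovery rate for a treatment, optionally restricted to one sex."""
    treated = 0
    recovered = 0
    for (s, t), (n, r) in counts.items():
        if t == treatment and (sex is None or s == sex):
            treated += n
            recovered += r
    return recovered / treated


# Compare drug and placebo for the whole trial and for each stratum.
for sex in (None, "women", "men"):
    label = sex or "all"
    drug = recovery_rate("drug", sex)
    placebo = recovery_rate("placebo", sex)
    print(f"{label:>5}: drug {drug:.0%}, placebo {placebo:.0%}")
```

With these counts the drug looks better in the pooled data (50 % against 40 %) and worse within each sex, because the drug group is dominated by men, who recover more often regardless of treatment.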
One may ask, "Does this difficulty arise in more general cases (e.g. if we stratify the data into more subgroups)?" or "How can we avoid this kind of effect?" For answers and more details, please refer to the following articles: