Possibly random thoughts of an oddly organized DBA with a very short attention span


nonsense correlation

I was doing a little light reading on my Saturday night in my Oxford Dictionary of Statistics by Graham Upton and Ian Cook and came across this definition:

nonsense correlation: A term used to describe a situation where two variables (X and Y, say) are correlated without being causally related to one another. The usual explanation is that they are both related to a third variable, Z. Often the third variable is time. For example, if we compare the price of a detached house in Edinburgh in 1920, 1930, ... with the size of the population of India at those dates, a 'significant' positive correlation will be found, since both variables have increased markedly with time. The first comprehensive study of nonsense correlation was undertaken in 1926 by Yule, who considered the apparent connection between the fall in Church of England marriages and the concurrent increase in life expectancy. See also GOOSEBERRY BUSHES; RUM CONSUMPTION
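The "third variable is time" case is easy to reproduce. Here is a minimal sketch in Python, assuming two completely made-up series (the names `house_price` and `population`, the slopes, and the noise levels are all hypothetical, not real Edinburgh or India data). Each series is just its own upward trend plus independent noise, so any correlation between them comes from time alone.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def diffs(xs):
    """Year-over-year changes, which removes a linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]

rng = random.Random(42)          # fixed seed so the run is repeatable
years = list(range(1920, 2020))  # hypothetical yearly observations

# Synthetic data: each series has its own trend and its own noise.
# Nothing links them except that both rise with time (the Z variable).
house_price = [5.0 * (y - 1920) + rng.gauss(0, 20) for y in years]
population = [3.0 * (y - 1920) + rng.gauss(0, 15) for y in years]

r_raw = pearson(house_price, population)
r_diff = pearson(diffs(house_price), diffs(population))

print(f"correlation of raw series:      {r_raw:.2f}")
print(f"correlation of yearly changes:  {r_diff:.2f}")
```

The raw series should come out strongly correlated, while the year-over-year changes should show little or no correlation. Differencing is only one of several ways to take time out of the picture, and this sketch shows the mechanism, not anything about the real data in the dictionary's example.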


I had no idea there was an actual statistical term for this - one that I could look up in a real statistics dictionary written by a professor at the University of Essex, no less. Or that there might be funded comprehensive studies for such things.

My favorite example of the confusion between correlation and causation is the Pirate Effect: there are fewer pirates on the oceans, and global average temperatures are increasing, so global warming must be caused by the lack of pirates.

And what about the studies that suggest lack of dental health is a causal factor in heart disease? Did they consider that Z = the individual's attention to general health maintenance?
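One common way to ask the Z question is a partial correlation: remove whatever Z explains from both X and Y, then correlate what is left. A rough sketch on simulated data follows. The variables `z`, `x`, and `y` are hypothetical stand-ins (general health habits, dental health, heart health), and the simulation deliberately assumes Z drives both with no direct X-to-Y link, so it only illustrates the technique and says nothing about the actual research.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(zs)
    mz = sum(zs) / n
    my = sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]

rng = random.Random(7)  # fixed seed so the run is repeatable
n = 500                 # hypothetical sample size

# Simulated assumption: z influences both x and y; x does not influence y.
z = [rng.gauss(0, 1) for _ in range(n)]  # attention to general health
x = [zi + rng.gauss(0, 1) for zi in z]   # dental health score
y = [zi + rng.gauss(0, 1) for zi in z]   # heart health score

r_xy = pearson(x, y)
r_xy_given_z = pearson(residuals(x, z), residuals(y, z))

print(f"correlation of x and y:                {r_xy:.2f}")
print(f"partial correlation, controlling z:    {r_xy_given_z:.2f}")
```

In this setup the plain correlation should be clearly positive while the partial correlation should sit near zero. If a real study still showed a strong partial correlation after controlling for health habits, that would point toward something beyond Z, which is exactly the kind of detail the headlines tend to skip.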

And here's a professor who keeps a list of the headlines that suggest causal relationships when the research was correlative to teach his students that correlation != causation.

But what I really want to know is how do I get funding for a nonsense correlation study? I could come up with an endless supply of possibilities.

I think my first study shall evaluate the increase in the number of colors in the Oracle installation screens against the parallel increase in the number of poorly configured databases. Clearly, multiple colors combined with pictures must affect the installer's ability to make appropriate database configuration choices. Perhaps we could isolate the specific colors that cause this issue, remove them from the tool and improve the condition of databases everywhere.

Or maybe not ...

(sorry - I get cranky when I read statistics on Saturday night. Think I'll read about rum consumption next and start a self-funded study of nonsense correlation)


Robyn said...

So much for the rum consumption nonsense correlation improving my Saturday night.

The rum has to be consumed in Havana. I'd have less than 42 minutes to get there and have it still be Saturday.

Chen Shapira said...

I hate nitpicking, but since my father is a professor of dental health, I really have to mention that the link between gum disease and heart health is not merely a matter of statistical correlation (which, as you said, does not imply causation). There are common bacteria that appear to be involved in both gum disease and heart disorders, and therefore there are deeper biological reasons to suspect a connection.

Alex Gorbachev said...

Learned a new stats term today. Thanks Robyn!

Robyn said...

Hello Chen,

By all means, please nitpick on this topic. If there was a point to this post at all, it is that we should nitpick at the details of published research. (and I needed a break from what I was supposed to be doing.)

I realize there are common bacteria behind the gum disease/heart disease correlation, but the information the public receives on this study and others related to health issues comes as sound bites that don't always capture the results accurately. Every time I hear about a new research study that proves X results in Y, I start wondering about the Z variable, i.e. is there any other possible common variable that is really behind the correlation?

Using the dental health example, a responsible study would take the health habits of the sample set into consideration, along with many other factors, and present its conclusions carefully. A study funded by, say, Johnson & Johnson could be a little more inclined to ignore variable Z in order to sell more product. In fact, if you look at the link on the Listerine page:


they state that 'one thing is clear: a healthy mouth only leads to good things.' Yet in the small print, they confirm that there is an association, but 'a cause and effect relationship has not been established.' If a causal relationship has not been established, then they don't know that X 'leads' to Y, they just know that X and Y tend to exist together. And since Johnson & Johnson has been called on misleading ads in the past, they're more likely than most to dot the legal i's.

Your father would have much more knowledge on this than I, and I'd love to get more details. Has the research confirmed that the common bacteria begin in people's mouths and then move to other parts of the body? Or do they begin elsewhere and then travel to multiple locations in the bloodstream? There's been some advertising by major dental chains suggesting that seeing your dentist 2 or 3 times a year will stop heart disease and diabetes. Not sure the study actually came to that conclusion.

I should have picked on the 'Milk Your Diet' study. That's an easier target.

cheers ... Robyn

disclaimer: I am very pro dental health. I see my dentist at least twice a year and I love my SonicCare toothbrush. It will be traveling to OOW with me. I am not advocating that anyone neglect their dental care in any way :)

Chen Shapira said...

Here's an explanation of the main mechanism thought to link gum disease with heart health: http://www.perio.org/consumer/bacteria.htm

To be fair, there is an alternative explanation: a specific gene could affect the immune system in a way that makes the person more susceptible to both conditions: http://www.medicalnewstoday.com/articles/151347.php
The research is rather new, but if they are right, then the conditions are connected by a third factor and flossing and seeing your dentist will not improve your heart health.

disclaimer: I very much support not believing any scientific fact presented in advertisements.

Some dentists even resort to linking erectile dysfunction to gum disease! (http://www.medicalnewstoday.com/articles/152856.php) I've no idea if this is good research, but it sure makes for good advertisements.

(Are we off topic enough yet?)

Robyn said...

I suppose we have drifted away from Oracle a good bit, but we're still right on topic for nonsense correlation. Your series of links illustrates some good points:

1. Always look for a Z factor

2. Even studies backed by thorough research can be misrepresented

The news sources are almost as bad as the advertisers. In their rush to come up with dramatic sound bites, the study results get presented as confirmed causation, while the researchers are still talking about the need to verify a correlation.

Thanks for the contribution, Chen.

disclaimer ...

The views expressed on this page are mine and do not reflect the opinions of my employer, former employers, other associates or the opinions I may hold 3 days from now.