Proven: If you torture Big Data enough, it will confess to anything


The Texas Sharpshooter has been known to inflict that type of pain:

You will get a chortle or two from Spurious Correlations, a web page devoted to graphically persuasive relationships among pairs of sets of entirely unrelated data. For example, you can see the graph of “US spending on science, space and technology” superimposed on that of “Suicides by hanging, strangulation and suffocation.” The staggering 99.79% overlap is a classic in correlation without causation.

Likewise, “Per capita cheese consumption” and “People who died by becoming tangled in their bedsheets” have a correlation of 94.71%. And the correlation between “People who drowned after falling out of a fishing boat” and the “Marriage rate in Kentucky” is 95.24%.

Common sense tells us to treat these coincidences as jokes. But in his fascinating new book The AI Delusion, economics professor Gary Smith reminds us that computers don’t have common sense. He also notes that, as data gets larger and larger, nonsensical coincidences become more probable, not less.

Robert J. Marks, “Study Shows Eating Raisins Causes Plantar Warts” at Mind Matters

Robert J. Marks is one of the authors of Introduction to Evolutionary Informatics.
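
Smith's point that bigger data makes nonsensical coincidences more likely, not less, can be checked with a quick simulation. What follows is only a rough sketch under stated assumptions, not anything taken from the book or from the Spurious Correlations page: each "data set" is a ten-point random walk generated independently of all the others, and the names `pearson`, `random_walk`, and `strongest_pair` are made up for the illustration. The script asks how strong the best-looking correlation is when we compare 5, 20, or 200 such unrelated series.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def random_walk(rng, length):
    """Running sum of independent random steps (assumed stand-in for a yearly statistic)."""
    total, walk = 0.0, []
    for _ in range(length):
        total += rng.gauss(0, 1)
        walk.append(total)
    return walk


def strongest_pair(series):
    """Largest absolute correlation found among all pairs of series."""
    return max(abs(pearson(a, b)) for a, b in combinations(series, 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
all_series = [random_walk(rng, 10) for _ in range(200)]  # 200 unrelated "data sets"

# Compare the best match found in the first 5, the first 20, and all 200 series.
results = {k: strongest_pair(all_series[:k]) for k in (5, 20, 200)}
for k, r in results.items():
    print(f"{k:>3} unrelated series: strongest |r| = {r:.4f}")
```

Because each larger collection contains the smaller one, the strongest correlation found can only stay the same or go up as more series are added. With 200 series there are 19,900 pairs to choose from, so a striking match is available for the picking even though, by construction, no series has anything to do with any other.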

Take this one on: there’s a distinct correlation between which NFL Conference wins the Super Bowl and stock market levels. Should we look further into that one? I think you see my point.
Our initial instinct would tell us that it would be ridiculous to try to draw any relationship between the two. However, what if one conference is dominated by teams located in large cities with very large economies? All I am suggesting is that sometimes correlations that on the surface appear to be ridiculous may, when we look into them more deeply, mask some indirect relationships. Ed George
Ed George: Take this one on: there's a distinct correlation between which NFL Conference wins the Super Bowl and stock market levels. Should we look further into that one? I think you see my point. PaV
I read once that there was also a correlation between atmospheric lead levels and the incidence of violence. At first, we may conclude that this is nonsense, but when we look at the effects that lead has on the brain, a little more credence is lent to the possibility that there is a causal relationship. Correlation is a very valuable and powerful tool, but it should only be the starting point of further investigation, not flaunted as some sort of conclusive argument. For example, in the OP “Study of religion takes evidence-based turn,” News mentioned the correlation between religious lifestyles and a longer life. Playing devil's advocate, I linked to a paper that found a relationship between religious belief and higher suicide rates amongst homosexuals. Is either of these correlations causal? Probably not, but they are both important pieces of information that could lead to further examinations that might lead to benefits for everyone. Ed George

