It is a plague, not a cure, says Pomona College statistics prof Gary Smith:
Decades ago, data mining was considered a sin comparable to plagiarism. Today, the data mining plague is seemingly everywhere, cropping up in medicine, economics, management, and, now, history. Scientific historical analyses are inevitably based on data: documents, fossils, drawings, oral traditions, artifacts, and more. But now, historians are being urged to embrace the data deluge as teams systematically assemble large digital collections of historical data that can be data mined…
The promise is that an embrace of formal statistical tests can make history more scientific. The peril is the ill-founded idea that useful models can be revealed by discovering unanticipated patterns in large databases where meaningless patterns are endemic. Statisticians bearing algorithms are a poor substitute for expertise.
For example, one algorithm that was used to generate missing values in a historical database concluded that Cuzco, the capital of the Inca Empire, once had only 62 inhabitants, while its largest settlement had 17,856 inhabitants. Humans would know better.
Gary Smith, “Data mining: A plague, not a cure” at Mind Matters News
He adds, “Finding patterns in data is easy. Finding meaningful patterns that have a logical basis and can be used to make accurate predictions is elusive. We can see this from 18th-century attempts to cure scurvy through 21st-century claims about the stock market or history.”
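Smith's point can be illustrated with a small simulation. What follows is only a rough sketch, not anything taken from his article: the function names `pearson` and `mine_noise`, the sample sizes, and the seed are all assumptions chosen for illustration. It generates a target series of pure random noise, searches 1,000 equally random candidate series for the one that correlates best with the target, and then checks that "discovery" against fresh data the search never saw.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mine_noise(n_candidates=1000, n_points=20, seed=0):
    """Data-mine pure noise, then test the 'finding' on held-out noise.

    Returns (in_sample_r, out_of_sample_r) for the candidate series that
    looked best during the search. All parameters are illustrative.
    """
    rng = random.Random(seed)

    def noise(k):
        return [rng.gauss(0.0, 1.0) for _ in range(k)]

    # Every series is independent noise: there is nothing real to find.
    # The first half of each series is "mined"; the second half is held out.
    target = noise(2 * n_points)
    candidates = [noise(2 * n_points) for _ in range(n_candidates)]

    # Mining step: keep whichever candidate correlates most strongly
    # (in absolute value) with the target on the first half.
    best = max(
        candidates,
        key=lambda c: abs(pearson(c[:n_points], target[:n_points])),
    )

    in_sample_r = pearson(best[:n_points], target[:n_points])
    out_of_sample_r = pearson(best[n_points:], target[n_points:])
    return in_sample_r, out_of_sample_r


if __name__ == "__main__":
    in_r, out_r = mine_noise()
    print(f"best correlation found by mining: {in_r:+.2f}")
    print(f"same pair on held-out data:       {out_r:+.2f}")
```

With a thousand candidates and only twenty observations each, the search will almost certainly turn up a correlation that looks impressive, even though every series is noise. The held-out half is the honest check: there is no reason for the pattern to persist there, which is the gap between finding a pattern and finding one that predicts.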
See also: Coronavirus: Is data mining failing its first really big test? Computers scanning thousands of papers don’t seem to be providing answers for COVID-19. (Robert J. Marks)