Yes, recently, we learned from a highly official source that statisticians are in some kind of a panic:
While the crisis of statistics has made it to the headlines, that of mathematical modelling hasn’t. Something can be learned comparing the two, and looking at other instances of production of numbers. Sociology of quantification and post-normal science can help.
While statistical and mathematical modelling share important features, they don’t seem to share the same sense of crisis. Statisticians appear mired in an academic and mediatic debate where even the concept of significance appears challenged, while more sedate tones prevail in the various communities of mathematical modelling. This is perhaps because, unlike statistics, mathematical modelling is not a discipline. It cannot discuss possible fixes in disciplinary fora under the supervision of recognised leaders. It cannot issue authoritative statements of concern from relevant institutions such as e.g., the American Statistical Association or the columns of Nature.
Andrea Saltelli, “A short comment on statistical versus mathematical modelling” at Nature
So what’s going on? Our physics color commentator Rob Sheldon offers,
The author of this article contrasts the growing sense of panic among statisticians with the complacency of modellers.
The panic in sociology, psychology, nutrition science, and pharmacology has been growing as more than 70% of papers reporting “p-values” smaller than 0.05 are discovered to be unrepeatable.
Since the “p-value” is a statistical quantity invented by Ronald Fisher and is tied to “frequentist” statistics, the competing “Bayesian” statisticians have claimed that the method is deeply flawed. That battle is not new, having been fought since the year that Fisher introduced his p-value, but until recently it had been won by the frequentists. Today, Bayesian methods are not just widely popular but have replaced frequentist methods in many niche fields, so that the “irreproducibility” crisis is not simply pointing the finger at a few fraudulent bad apples, but at an entire educational system that promoted p-hacking.
By contrast, modellers have been growing in prestige and fame year upon year. For example, by 2018, nine Neanderthal genomes and one Denisovan genome had been sequenced.
Yet we have a news item this week, typical of recent news items, which claims that Neanderthals carry 1% of their genes from previous encounters with modern humans.
How do they figure this out? Especially since we have zero genomes from modern humans that predate Neanderthals?
Models.
But how do we know, asks Andrea Saltelli, if our models are valid? Can we run calibration tests on them with known answers? How about simple consistency checks? What about stating all our assumptions up front?
Nope, nope, and double nope. Modellers get a free pass, while statisticians get the bright lights in their eyes and the grilling from unseen questioners, with the threat of retracted papers and tenure-destroying expulsion.
Saltelli then goes on to show a rather disturbing plot. The more complicated our model becomes, the more ability it has to match our actual data. If you have only two data points, a model needs only two free parameters, and it can find a line through both those points. If you have 3 points, you can find a curve, a quadratic polynomial that will go through them. As long as you have as many free parameters as there are data points, there is always a curve that goes directly through all the points.
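The point about free parameters can be checked directly. The following is a minimal sketch, not anything taken from Saltelli's comment: it uses Lagrange interpolation to build the one polynomial of degree at most n-1 that passes through n points with distinct x-values. The function name `lagrange_interpolate` and the three sample points are made up for illustration.

```python
from fractions import Fraction


def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial of degree <= n-1 through n points.

    `points` is a list of (x, y) pairs with distinct x-values.
    Fractions keep the arithmetic exact, so the fit is perfect, not approximate.
    """
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        # Basis term: equals yi at xi and 0 at every other data point.
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


# Three invented data points: three free parameters (a quadratic) hit all of them.
points = [(0, 1), (1, 3), (2, 2)]
for xi, yi in points:
    assert lagrange_interpolate(points, xi) == yi
```

A perfect match on the points used to build the curve is therefore guaranteed by construction. It says nothing about how the curve behaves anywhere else.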
But is this increasingly complex mathematical model valid?
The way to test it is to find one additional point, and see whether the curve fitted to the other n-1 points matches this last point. And weirdly enough, when the model has too many free parameters, it gets more and more “unstable”, more and more “wiggly” as it strains to perfectly match the previous data, with less and less likelihood of matching new data. This is what Saltelli’s disturbing plot shows: the model error is minimized somewhere in the middle of the “complexity” axis.
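The hold-out test described here can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of Saltelli's plot: the six training points are invented (a straight line y = 2x with an alternating wobble of plus or minus 1), the held-out point (6, 12) is invented, and the helper names `fit_constant`, `fit_line`, `fit_interpolant` and `held_out_errors` are hypothetical. Three models of increasing complexity are fitted to the training points and then asked to predict the point they never saw.

```python
from fractions import Fraction


def fit_constant(points):
    """One free parameter: the mean of the observed y-values."""
    mean_y = Fraction(sum(y for _, y in points), len(points))
    return lambda x: mean_y


def fit_line(points):
    """Two free parameters: ordinary least-squares straight line."""
    n = len(points)
    mean_x = Fraction(sum(x for x, _ in points), n)
    mean_y = Fraction(sum(y for _, y in points), n)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def fit_interpolant(points):
    """n free parameters: the polynomial that passes through every point."""
    def predict(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return predict


def held_out_errors(train, held_out):
    """Fit each model on `train`, return its absolute error on `held_out`."""
    x_new, y_new = held_out
    models = {
        "constant": fit_constant,
        "line": fit_line,
        "interpolant": fit_interpolant,
    }
    return {name: abs(fit(train)(x_new) - y_new) for name, fit in models.items()}


# Invented data: y = 2x with an alternating +1/-1 wobble.
train = [(0, 1), (1, 1), (2, 5), (3, 5), (4, 9), (5, 9)]
held_out = (6, 12)  # invented new observation on the underlying line

errors = held_out_errors(train, held_out)
for name, err in errors.items():
    print(f"{name:12s} held-out error = {float(err):.1f}")
# constant: 7.0, line: 0.6, interpolant: 63.0
```

On this toy data the six-parameter curve matches every training point exactly and yet misses the new point by far more than either simpler model, while the one-parameter model is too crude to follow the trend. The smallest held-out error sits in the middle of the complexity range, which is the shape the paragraph above describes.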
So rather than complimenting our modellers (think global climate models) for matching past data perfectly by adding in adjustable variables (aerosols, feedback), we should be suspicious that they are actually making their predictions worse by overcomplicating them.
And it isn’t just Neanderthal genetics and global climate models. This is true for every area of science, from cosmology to particle physics to cladistics and AI. This is why Google is abandoning “Deep Mind.” The problem wasn’t fixed by throwing more complexity at it.
So rather than being complacent, modellers ought to be in the same state of panic as statisticians. Saltelli is not abandoning modelling; he just wants it to be ethical. From his concluding paragraph: “While this vision is gaining new traction [sociology of modelers working with suppliers of data and users of models] more could be done. A new ethics of quantification must be nurtured.”
Perhaps this is all part of the Paley renaissance, recognizing that the days of coddled dogmatics and their supporting cast of modellers are coming to an end.
See also: Confirmed: Deep Mind’s deepest mind is on leave. The chess champ computer system just never made money
Note: Rob Sheldon is the author of Genesis: The Long Ascent