Two recent reports in VICE (a telling label, BTW) raise some significant concerns. First, Kevin Buzzard (no, this is not the Babylon Bee [itself a sign of the times, as it gets harder and harder to tell reality from satire]), Sept 26th:
Number Theorist Fears All Published Math Is Wrong
“I think there is a non-zero chance that some of our great castles are built on sand,” he said, arguing that we must begin to rely on AI to verify proofs.
[ . . . ]
Kevin Buzzard, a number theorist and professor of pure mathematics at Imperial College London, believes that it is time to create a new area of mathematics dedicated to the computerization of proofs. The greatest proofs have become so complex that practically no human on earth can understand all of their details, let alone verify them. He fears that many proofs widely considered to be true are wrong. Help is needed.
What is a proof? A proof is a demonstration of the truth of a mathematical statement. By proving things and learning new techniques of proof, people gain an understanding of math, which then filters out into other fields.
To create a proof, begin with some definitions. For example, define a set of numbers such as the integers, all the whole numbers from minus infinity to positive infinity. Write this set as: … , -2, -1, 0, 1, 2, … Next, state a theorem, for example, that there is no largest integer. The proof then consists in the logical reasoning that shows the theorem to be true or false, in this case, true. The logical steps in the proof rely on other, prior truths, which have already been accepted and proven. For example, that the number 1 is less than 2.
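Buzzard's proposed remedy is exactly this kind of statement made machine-checkable. As a minimal sketch (in Lean 4; the theorem name is my own, and I assume the built-in `omega` arithmetic tactic is available), the "no largest integer" claim above can be written so that a proof assistant, not a human referee, verifies every step:

```lean
-- Claim: there is no largest integer, i.e. for every integer n
-- there exists a strictly larger integer m.
theorem no_largest_integer : ∀ n : Int, ∃ m : Int, n < m :=
  fun n => ⟨n + 1, by omega⟩  -- witness m = n + 1; `omega` discharges n < n + 1
```

The point is not that this toy theorem needs a computer, but that the same mechanism scales: each step must reduce to prior, machine-accepted truths, so nothing rests on an unread, unpublished prerequisite.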
New proofs by professional mathematicians tend to rely on a whole host of prior results that have already been published and understood. But Buzzard says there are many cases where the prior proofs used to build new proofs are clearly not understood. For example, there are notable papers that openly cite unpublished work. This worries Buzzard.
“I’m suddenly concerned that all of published math is wrong because mathematicians are not checking the details, and I’ve seen them wrong before,” Buzzard told Motherboard while he was attending the 10th Interactive Theorem Proving conference in Portland, Oregon, where he gave the opening talk.
“I think there is a non-zero chance that some of our great castles are built on sand,” Buzzard wrote in a slide presentation. “But I think it’s small.”
Now, let us cross-multiply with another report, from Oct 10, on how a software glitch has created a problem in Chemistry (and potentially medicine):
A Code Glitch May Have Caused Errors In More Than 100 Published Studies
The discovery is a reminder that science is collaborative and ideally self-correcting, but that nothing can be taken for granted.
[ . . . ]
Scientists in Hawaiʻi have uncovered a glitch in a piece of code that could have yielded incorrect results in over 100 published studies that cited the original paper.
The glitch caused results of a common chemistry computation to vary depending on the operating system used, causing discrepancies among Mac, Windows, and Linux systems. The researchers published the revelation and a debugged version of the script, which amounts to roughly 1,000 lines of code, on Tuesday in the journal Organic Letters.
“This simple glitch in the original script calls into question the conclusions of a significant number of papers on a wide range of topics in a way that cannot be easily resolved from published information because the operating system is rarely mentioned,” the new paper reads. “Authors who used these scripts should certainly double-check their results and any relevant conclusions using the modified scripts in the [supplementary information].”
Yuheng Luo, a graduate student at the University of Hawaiʻi at Mānoa, discovered the glitch this summer when he was verifying the results of research conducted by chemistry professor Philip Williams on cyanobacteria. The aim of the project was to “try to find compounds that are effective against cancer,” Williams said . . . . Luo’s results did not match up with the NMR values that Williams’ group had previously calculated, and according to Sun, when his students ran the code on their computers, they realized that different operating systems were producing different results. Sun then adjusted the code to fix the glitch, which had to do with how different operating systems sort files.
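To see how an operating system's file ordering can silently change a scientific result, here is a minimal sketch (hypothetical file names and functions, not the actual chemistry script): in Python, `glob.glob()` makes no ordering guarantee, so any code that pairs file contents with other data by list position inherits whatever order the filesystem happens to return.

```python
import glob
import os
import tempfile

def read_values_fragile(pattern):
    # glob.glob() returns matches in filesystem order, which can differ
    # across Mac, Windows, and Linux -- the class of bug described above.
    return [float(open(p).read()) for p in glob.glob(pattern)]

def read_values_fixed(pattern):
    # sorted() pins down one deterministic order on every OS.
    return [float(open(p).read()) for p in sorted(glob.glob(pattern))]

# Demo with two hypothetical per-conformer result files.
with tempfile.TemporaryDirectory() as d:
    for name, val in [("conf_2.txt", "2.0"), ("conf_1.txt", "1.0")]:
        with open(os.path.join(d, name), "w") as f:
            f.write(val)
    # The fixed reader always yields values in filename order: [1.0, 2.0].
    # The fragile reader may or may not, depending on the OS.
    print(read_values_fixed(os.path.join(d, "conf_*.txt")))
```

Both functions "work" on the developer's machine; only one of them means the same thing on someone else's.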
Garbage in, garbage out — or perhaps, garbage in, [presumed] gospel out.
So, no system of reasoning is better than its inputs [including controlling assumptions or assertions] and its structure.
As in: what about brains as computational substrates, too?
Food for thought as we ponder where an AI age takes us, especially when many hope to use AI systems to prove the correctness of software.
What could go wrong . . . ? END