On FSCO/I vs. Needles and Haystacks (as well as elephants in rooms)

Sometimes, the very dismissiveness of hyperskeptical objections is their undoing, as in this case from TSZ:

Pesky EleP(T|H)ant

Over at Uncommon Descent KirosFocus repeats the same old bignum arguments as always. He seems to enjoy the ‘needle in a haystack’ metaphor, but I’d like to counter by asking how does he know he’s not searching for a needle in a needle stack? . . .

What had happened is that, on June 24th, I had posted a discussion here at UD on what Functionally Specific Complex Organisation and associated Information (FSCO/I) is about, including this summary infographic:

[Infographic: csi_defn, summarising the FSCO/I concept]

Instead of addressing what this actually does, RTH of TSZ sought to strawmannise and rhetorically dismiss it by an allusion to the 2005 Dembski expression for Complex Specified Information, CSI:

χ = – log2[10^120 ·ϕS(T)·P(T|H)].

–> χ is “chi” and ϕ is “phi” (where CSI exists if χ > ~ 1)

. . . failing to understand — as did the sock-puppet Mathgrrl (not to be confused with the Calculus prof who uses that improperly appropriated handle) — that by simply moving forward to the extraction of the information and threshold terms involved, this expression reduces as follows:

To simplify and build a more “practical” mathematical model, we note that information theory researchers Shannon and Hartley showed us how to measure information by changing probability into a log measure that allows pieces of information to add up naturally:

Ip = – log p, in bits if the log base is 2. That is where the now familiar unit, the bit, comes from. We may observe, as just one of many examples of a standard result, Principles of Communication Systems, 2nd edn, Taub and Schilling (McGraw-Hill, 1986), p. 512, Sect. 13.2:

Let us consider a communication system in which the allowable messages are m1, m2, . . ., with probabilities of occurrence p1, p2, . . . . Of course p1 + p2 + . . . = 1. Let the transmitter select message mk of probability pk; let us further assume that the receiver has correctly identified the message [–> my note: i.e. the a posteriori probability, in my online discussion here, is 1]. Then we shall say, by way of definition of the term information, that the system has communicated an amount of information Ik given by

I_k ≝ log2(1/p_k)   (13.2-1)
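To make the “add up naturally” point concrete, here is a minimal illustrative sketch in Python (an added worked example, not drawn from Taub and Schilling; the probabilities are arbitrary, chosen only to show the behaviour):

```python
import math

def info_bits(p):
    """I_p = -log2(p): information in bits for an outcome of probability p."""
    return -math.log2(p)

print(info_bits(0.5))       # 1.0 bit -- one fair coin toss
print(info_bits(1 / 26))    # ~4.7 bits -- one of 26 equiprobable letters

# "Adds up naturally": for independent events the probabilities multiply,
# so the negative-log measures simply add.
p1, p2 = 0.5, 1 / 26        # illustrative probabilities
assert math.isclose(info_bits(p1 * p2), info_bits(p1) + info_bits(p2))
```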

xxi: So, since 10^120 ~ 2^398, we may “boil down” the Dembski metric using some algebra — i.e. substituting and simplifying the three terms in order — as log(p*q*r) = log(p) + log(q) + log(r) and log(1/p) = – log(p):

Chi = – log2(2^398 * D2 * p), in bits, and where also D2 = ϕS(T) and p = P(T|H)
Chi = Ip – (398 + K2), where now: Ip = – log2(p) and K2 = log2(D2)
That is, chi is a metric of bits from a zone of interest, beyond a threshold of “sufficient complexity to not plausibly be the result of chance,”  (398 + K2).  So,
(a) since (398 + K2) tends to at most 500 bits on the gamut of our solar system [our practical universe, for chemical interactions! ( . . . if you want, 1,000 bits would be a limit for the observable cosmos)] and
(b) as we can define and introduce a dummy variable for specificity, S, where
(c) S = 1 or 0 according as the observed configuration, E, is on objective analysis specific to a narrow and independently describable zone of interest, T:

Chi =  Ip*S – 500, in bits beyond a “complex enough” threshold

  • NB: If S = 0, this locks us at Chi = – 500; and, if Ip is less than 500 bits, Chi will be negative even if S = 1.
  • E.g.: a string of 501 coins tossed at random will have S = 0, but if the coins are arranged to spell out a message in English using the ASCII code [notice the independent specification of a narrow zone of possible configurations, T], Chi will — unsurprisingly — be positive. (A short worked sketch follows the bullets below.)

[Figure: the design inference explanatory filter (explan_filter)]

  • S goes to 1 when we have objective grounds — to be explained case by case — to assign that value.
  • That is, we need to justify why we think the observed cases E come from a narrow zone of interest, T, that is independently describable, not just a list of members E1, E2, E3 . . . ; in short, we must have a reasonable criterion that allows us to build or recognise cases Ei from T, without resorting to an arbitrary list.
  • A string at random is a list with one member, but if we pick it as a password, it is now a zone with one member.  (Where also, a lottery, is a sort of inverse password game where we pay for the privilege; and where the complexity has to be carefully managed to make it winnable. )
  • An obvious example of such a zone T is code symbol strings of a given length that work in a programme or communicate meaningful statements in a language based on its grammar, vocabulary etc. This paragraph is a case in point, which can be contrasted with typical random strings ( . . . 68gsdesnmyw . . . ) or repetitive ones ( . . . ftftftft . . . ); where we can also see from this example how such a functional string can enfold random and repetitive sub-strings.
  • Arguably — and of course this is hotly disputed — DNA protein and regulatory codes are another. Design theorists argue that the only observed adequate cause for such is a process of intelligently directed configuration, i.e. of  design, so we are justified in taking such a case as a reliable sign of such a cause having been at work. (Thus, the sign then counts as evidence pointing to a perhaps otherwise unknown designer having been at work.)
  • So also, to overthrow the design inference, a valid counter example would be needed, a case where blind mechanical necessity and/or blind chance produces such functionally specific, complex information. (Points xiv – xvi above outline why that will be hard indeed to come up with. There are literally billions of cases where FSCI is observed to come from design.)
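As flagged in the E.g. bullet above, here is a minimal illustrative sketch in Python of how the Chi_500 metric behaves, together with a numerical spot-check of the log reduction of the 2005 expression; the values of D2 and p in the check are arbitrary illustrations, not measured quantities:

```python
import math

def chi_500(I_p, S):
    """Chi_500 = Ip*S - 500: bits beyond the solar-system threshold."""
    return I_p * S - 500

# 501 coins tossed at random: 501 bits of carrying capacity, but S = 0,
# so the metric locks at -500 and no design inference is drawn.
print(chi_500(I_p=501, S=0))        # -500

# A 72-character English message in 7-bit ASCII: S = 1 on functional
# specificity, Ip = 72 * 7 = 504 bits, just past the 500-bit threshold.
print(chi_500(I_p=72 * 7, S=1))     # +4

# Spot-check of the log reduction, with arbitrary illustrative D2 and p:
# -log2(2^398 * D2 * p) = Ip - (398 + K2), where Ip = -log2(p), K2 = log2(D2).
D2, p = 1e60, 2.0 ** -520
lhs = -math.log2(2.0 ** 398 * D2 * p)
rhs = (-math.log2(p)) - (398 + math.log2(D2))
assert math.isclose(lhs, rhs)
```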

xxii: So, we have some reason to suggest that if something, E, is based on specific information describable in a way that does not just quote E and requires at least 500 specific bits to store the specific information, then the most reasonable explanation for the cause of E is that it was designed. The metric may be directly applied to biological cases:

Using Durston’s Fits values — functionally specific bits — from his Table 1 to quantify Ip, and accepting functionality on specific sequences as showing specificity, giving S = 1, we may apply the simplified Chi_500 metric of bits beyond the threshold (a short worked sketch follows the list):
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond
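A minimal illustrative sketch of that application; the arithmetic simply subtracts the 500-bit threshold from the Fits values quoted above:

```python
# protein: (amino acids, Fits from Durston's Table 1, as quoted above)
durston_fits = {
    "RecA":      (242,  832),
    "SecY":      (342,  688),
    "Corona S2": (445, 1285),
}

for protein, (aa, fits) in durston_fits.items():
    chi = fits * 1 - 500    # S = 1: functionality on a specific sequence
    print(f"{protein}: {aa} AA, {fits} fits -> {chi} bits beyond the threshold")
# RecA: 332, SecY: 188, Corona S2: 785 -- matching the figures listed above
```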

Where, of course, there are many well known ways to obtain the information content of an entity, which automatically addresses the “how do you evaluate P(T|H)” issue. (As has been repeatedly pointed out, only to be insistently ignored in the rhetorical intent to seize upon a dismissive talking point.)

There is no elephant in the room.

Apart from . . . the usual one design objectors generally refuse to address, selective hyperskepticism.

But also, RTH imagines there is a whole field of needles, refusing to accept that many relevant complex entities are critically dependent on having the right parts, correctly arranged, coupled and organised in order to function.

That is, there are indeed empirically and analytically well founded narrow zones of functional configs in the space of possible configs. By far and away most of the ways in which the parts of a watch may be arranged — even leaving off the ever so many more ways they can be scattered across a planet or solar system — will not work.

The reality of narrow and recognisable zones T in large spaces W beyond the blind sampling capacity — that’s yet another concern — of a solar system of 10^57 atoms or an observed cosmos of 10^80 or so atoms and 10^17 s or so duration, is patent. (And if RTH wishes to dismiss this, let him show us observed cases of life spontaneously organising itself out of reasonable components, say soup cans. Or, of watches created by shaking parts in drums, or of recognisable English text strings of at least 72 characters being created through random text generation . . . which last is a simple case that is WLOG, as the infographic points out. As, 3D functional arrangements can be reduced to code strings, per AutoCAD etc.)

Finally, when the material issue is sampling, we do not need to generate grand probability calculations.

[Image: the proverbial needle in the haystack]

For, once we are reasonably confident that we are looking at deeply isolated zones in a field of possibilities, it is simple to show that, unless a “search” is so “biased” as to be decidedly not random and decidedly not blind, the only plausible blind chance + mechanical necessity causal account would be a blind sample of scope sufficient to make it reasonably likely to catch zones T in the field W.

But, 500 – 1,000 bits of FSCO/I (a rather conservative threshold relative to what we see in just the genomes of life forms) is, as the infographic shows, far more than enough to demolish that hope. For 500 bits, one can see that giving every one of the 10^57 atoms of our solar system a tray of 500 H/T coins, tossed and inspected every 10^-14 s — a fast ionic reaction rate — would amount to sampling roughly one straw’s worth from a cubical haystack 1,000 light years across, about as thick as our galaxy’s central bulge. If such a haystack were superposed on our galactic neighbourhood and we were to take a blind, reasonably random, one-straw sized sample, it would with maximum likelihood pick up nothing but straw.
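For those who wish to check the arithmetic, here is a minimal illustrative sketch; the straw volume (~1 cm^3) and the 10^17 s duration are rough illustrative assumptions, the other figures are those used above:

```python
# Blind-sample capacity of the solar system vs the 2^500 configuration space.
atoms      = 1e57      # atoms in the solar system (figure used above)
rate_per_s = 1e14      # one inspection per 10^-14 s, a fast ionic reaction rate
seconds    = 1e17      # ~10^17 s duration (assumed, per the cosmos timescale above)

samples      = atoms * rate_per_s * seconds   # ~1e88 observations
config_space = 2.0 ** 500                     # ~3.3e150 possible 500-bit states
fraction     = samples / config_space         # ~3e-63 of the space sampled

# Compare: one straw (assumed ~1 cm^3) vs a cubical haystack 1,000 light years
# on a side.
straw_m3    = 1e-6
ly_m        = 9.46e15
haystack_m3 = (1000 * ly_m) ** 3              # ~8.5e56 m^3

print(f"fraction of config space sampled: {fraction:.1e}")                # ~3.1e-63
print(f"one straw vs 1,000 LY haystack:   {straw_m3 / haystack_m3:.1e}")  # ~1.2e-63
```

Both ratios come out at roughly one part in 10^63, which is the force of the one-straw comparison.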

As in, empirically impossible, or if you insist, all but impossible.

 

It seems that objectors to design inferences on FSCO/I have been reduced to clutching at straws. END

Comments
jerry: I will try to understand better the points you refer to. But, frankly, until now I cannot see any novelty in the few things I have read. I absolutely agree about the importance and roles of transposons and non coding sequences. I have argued many times that transposons are an important "engine (tool) of design". But, if transposons, or any other thing, are interpreted as random "engines of variation", I really can't see how they can help solve the problem of complex functional information. They can't. And I must confess that I have some difficulty coping with definitions like the following: "Here we designate as a “nuon” any stretch of nucleic acid sequence that may be identifiable by any criterion." However, if you have some passages that start to suggest how nuons or any other concept can help solve the problem of big numbers in complex functional information, please be kind and point directly to them.

gpuccio
August 23, 2014 at 02:55 PM PDT
Check this out; don't miss the insightful comments by Silver Asiatic: https://uncommondescent.com/evolution/a-third-way-of-evolution/#comment-511598

Dionisio
August 23, 2014 at 02:47 PM PDT
Jerry, more stuff for the get-around-to-reading pile . . . silly season here, three weeks to go. KF

kairosfocus
August 23, 2014 at 12:20 PM PDT
Mung: I hear you; I am using the basic "standard" metric, which allows measures for info to add naturally, using properties of logs. Recall, we are going to move from info-carrying capacity to functional-complexity metrics by imposing conditions. When it comes to the Shannon metric, that is actually average info per symbol, hence the weighted sum metric H = Σ pi log2(1/pi). And yes, entropy, in being a metric of degree of microscopic freedom [or, lack of specificity] consistent with a given macro state, is a measure of missing info that would specify the microstate. But then, that becomes a subject of controversy, as that is not the usual approach. Just as Garrison's interesting look at macroeconomics from an Austrian perspective based on Hayek's investment triangle is illuminating but controversial. I find it is a tool that gives one perspective and highlights one cluster of concerns that happen to be relevant. I don't claim it captures the whole story. But it does catch a useful aspect. KF

kairosfocus
August 23, 2014 at 12:19 PM PDT
KF and others: Here is Brosius' web page and some comments from it. http://zmbe.uni-muenster.de/institutes/iep/iepmain_de.htm
Interests:

1) Our primary focus is on non-protein coding RNAs (npcRNAs) with emphasis on those preferentially expressed in the brain, ranging from discovery to function. This includes gene deletions (the epigenetically regulated MBII-85 RNA cluster) leading to mouse models of Prader-Willi-Syndrome (a neurodevelopmental disease), as well as certain forms of epilepsy that are mediated by dysregulation of protein biosynthesis near synapses and hyperexcitability due to absence of BC1 RNA. Wherever possible, we use transgenic mouse models ("in vivo veritas") in diverse studies, such as the regulation of neuronal expression of npcRNAs, sub-cellular transport of RNAs into neuronal processes and functional compensation of gene-deleted animal models.

2) From such simple beginnings of a protocell consisting of a small number of RNA molecules, we are still able to learn lessons on how modern genomes and genes evolve(d). The conversion of RNA to DNA via retroposition remains a major factor in providing raw material for future (nondirected) evolvability. Much of this process generates superfluous DNA. Yet, occasionally new modules of existing genes can be, by chance exapted (recruited) from such previously non-functional sequences. More recently, culminating in the "sales"-effort of the ENCODE consortium, the pendulum has almost swung towards the opposite extreme. This falsely implies that almost 80% of the human genome and, in analogy, that almost every chunk of transcribed RNA is functional (see cartoon). For an excellent assessment of this recent excess see Graur et al. (2013) and for the noisy transcriptome see this paper. We also use retroposed elements to infer phylogenetic relationships in vertebrates, chiefly in mammals and birds. Also, the academic content that we present in our teaching relies heavily on evolutionary thought. Apart from a better understanding of how cells and organisms work, it provides us with valuable tools to understand the evolution of genomes and genes. For medical students, it addresses the question "why we get sick" in a fundamental way. Finally, many bioethical questions posed by our growing capabilities in medical technology, such as gene therapy, assisted reproduction etc. are rendered more tangible “in the light of evolution”. Innovative analogies and radical thinking should free students from the restrictions and chains of much of their previous scholastic education. Likewise, evolutionary thought is potentially decisive in complicated patent disputes in the areas of life and biomedical sciences including biotechnology and RNA biology.
His CV and list of publications are here: http://zmbe.uni-muenster.de/institutes/iep/Brosius_CV_AllPub.pdf The article in Vrba's book on macro evolution is here: http://www.bioone.org/doi/abs/10.1666/0094-8373%282005%29031%5B0001%3ADAEBAC%5D2.0.CO%3B2 Here is a comment I made about this a few months ago: https://uncommondescent.com/genetics/cost-of-maintenance-and-construction-of-design-neutral-theory-supports-id-andor-creation/#comment-498336

jerry
August 23, 2014 at 09:51 AM PDT
Jerry,
https://www.uni-muenster.de/forschungaz/person/13019
http://zmbe.uni-muenster.de/institutes/iep/iepmain_de.htm

Mung
August 23, 2014 at 09:41 AM PDT
jerry: I have looked on Pubmed, and there are a lot of papers with that name. Almost all the most recent deal with transposons and their role in evolution, and with RNA genes. Could you please point to some specific paper that, in your opinion, deals with the probabilistic analysis of the emergence of new genes?

gpuccio
August 23, 2014 at 09:40 AM PDT
hi kf, I'd like to suggest a slight modification to what you write regarding Shannon and Hartley showing us how to measure information. "we note that information theory researchers Shannon and Hartley showed us how to measure information [given a probability distribution] by changing probability into a log measure that allows pieces [the units] of information to add up naturally:" For purposes of measuring information Shannon's metric has only limited applicability, so it's an important distinction to make, imo. Otoh, it can be applied to any probability distribution, which accounts for its usefulness. As an aside, it turns out that this is just what we have in statistical thermodynamics, a special subset of probability distributions, and it turns out that the entropy is just Shannon's measure of information applied to this further subset of probability distributions, providing a meaning of entropy. (Information (Shannon Measure of Information (Entropy)))

regards
Mung
August 23, 2014 at 09:31 AM PDT
I suggest that Behe, Axe etc read Brosius' research. He acts like it is no big deal and he is one of the major players in evolutionary biology. If you have access to a university library that has a good electronic journal list, a lot of his papers are available as PDFs. You can also get a lot of them off his personal site. He runs a big research program at a university in Bavaria. I am on my iPad at the moment and will have to look at my computer for his website.

jerry
August 23, 2014 at 09:20 AM PDT
Jerry, Amino Acid sequence space is actually one of the strongest points showing islands of function, with protein fold domains deeply isolated and (is it?) half of the domains being very sparse, without nearby antecedents, so there is not a stepping-stones model. And, we have not got to the even more thorny issue of regulation and the like. KF

kairosfocus
August 23, 2014 at 08:47 AM PDT
KF, I know of only one serious line from the naturalistic side that disputes the big number argument. That is the work of Juergen Brosius and his colleagues. He is a prolific publisher of research claiming new proteins arise all the time through various mutation processes. He was a colleague of Stephen Gould and is from Munich. As far as he is concerned, macro evolution is a done deal. Allen MacNeill pointed to his research as a basis for his claims that the engines of variation were adequate to explain all macro evolutionary change. I believe Larry Moran invoked him too. My guess is that he too will fail to overcome the big number problem but he acts as if he has.

jerry
August 23, 2014 at 08:37 AM PDT
PS: RTH & AF et al If you don't like chirping cricket metaphors, then the OP above shows that elephant in room ones will do . . . but not in the way you hoped. And no, tagging the pivotal issue as a "bignum" argument then using a strawman tactic dismissal will not do. And in case P May is around, a log reduction as above that shows the way the 2005 metric turns into an info beyond a threshold of complexity metric is NOT a probability argument . . . a blunder his Mathgrrl sock puppet made that gave away the game he played at UD. (Please, answer to the merits of the matter, and as I don't generally hang about at TSZ, if you choose to respond there then let us know. The other objector sites are so bad that reasonable people will only go there under protest.)kairosfocus
August 23, 2014 at 06:41 AM PDT
A FTR/FYI for RTH and other denizens at TSZ.

kairosfocus
August 23, 2014 at 05:44 AM PDT