In the “when it comes” thread, an exchange has developed with GD, and I think it is helpful to headline an argument from comment 49:

__________

>>I [KF] found an elementary introduction to statistical entropy very helpful, from the Russian authors Yavorsky and Pinski, in their Physics, vol I [1974]: as we consider a simple model of diffusion,

let us think of ten white and ten black balls in two rows in a container. [Inserted image; red used for convenience, rather than white:]

There is of course but one way in which there are ten whites in the top row; the balls of any one colour being for our purposes identical. But on shuffling, there are 63,504 ways to arrange five each of black and white balls in the two rows, and 6-4 distributions may occur in two ways, each with 44,100 alternatives.
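For those who want to check the arithmetic: fixing k whites in the top row forces 10 − k whites into the bottom row, so the statistical weight of that macrostate is C(10,k) × C(10,10−k) = C(10,k)². A short Python sketch, standard library only:

```python
from math import comb

# 10 white and 10 black balls in two rows of 10: fixing k whites in the
# top row forces 10-k whites below, so the macrostate's statistical
# weight is C(10,k) * C(10,10-k) = C(10,k)**2.
weights = {k: comb(10, k) ** 2 for k in range(11)}

print(weights[10])  # all ten whites on top: 1 way
print(weights[5])   # 5-5 split: 63,504 ways
print(weights[6])   # 6-4 split: 44,100 ways (and 4-6 gives another 44,100)
```

The 5-5 macrostate alone outweighs the all-on-top state by over four orders of magnitude, which is the whole point of the diffusion illustration.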

So,

if we for the moment see the set of balls as circulating at random among the various possible arrangements, spending on average about the same time in each possible state, then the time the system spends in any given state will be proportional to the relative number of ways that state may be achieved. Immediately, we see that the system will gravitate towards the cluster of more evenly distributed states. In short, we have just seen that there is a natural trend of change at random towards the more thermodynamically probable macrostates, i.e. the ones with higher statistical weights. So

“[b]y comparing the [thermodynamic] probabilities of two states of a thermodynamic system, we can establish at once the direction of the process that is [spontaneously] feasible in the given system. It will correspond to a transition from a less probable to a more probable state.” [p. 284.] This is in effect the statistical form of the 2nd law of thermodynamics. Thus, too, the behaviour of the Clausius isolated system of A and B, with d’Q of heat moving A –> B by reason of B’s lower temperature, is readily understood:

First, -d’Q/T_a is of smaller magnitude than +d’Q/T_b, as T_b is less than T_a and both are positive values; so we see why, if we consider the observed cosmos as an isolated system — something Sears and Salinger pointed out as philosophically loaded in their textbook, the one from which I first seriously studied these matters — then a transfer of energy by reason of temperature difference [i.e. heat] will net increase entropy [here, dS]. [Note as well the added diagram panel that shows a heat engine.]
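To put illustrative numbers on that (the temperatures and heat quantity below are toy values of my own choosing, not from the text):

```python
# Toy values, purely illustrative: hot body A at 400 K, cold body B at
# 300 K, with d'Q = 1200 J of heat passing from A to B.
T_a, T_b = 400.0, 300.0   # kelvins, with T_b < T_a
dQ = 1200.0               # joules

dS_a = -dQ / T_a          # A's entropy falls by 3 J/K
dS_b = +dQ / T_b          # B's entropy rises by 4 J/K
dS_net = dS_a + dS_b      # net for the isolated pair: +1 J/K

# Whenever T_b < T_a, the gain at B outweighs the loss at A:
assert dS_net > 0
```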

Second, we bridge to the micro view if we see how importing d’Q of random molecular energy so far increases the number of ways energy can be distributed at micro-scale in B, that the resulting rise in B’s entropy swamps the fall in A’s entropy. That is,

we have just lost a lot more information about B’s micro-state than we gained about A’s. Moreover, given that FSCO/I-rich micro-arrangements [FSCO/I =

“functionally specific complex organisation and/or associated information”] are relatively rare in the set of possible arrangements, we can also see why it is hard to account for the origin of such states by spontaneous processes in the scope of the observable universe. [Insert, a favourite illustration of FSCO/I:]

[Where, just the gear teeth are already FSCO/I — observe the precise nodes-arcs mesh:]

(Of course, since it is as a rule very inconvenient to work in terms of statistical weights of macrostates [i.e. W], we instead move to entropy, through Boltzmann’s S = k ln W or Gibbs’ more complex formulation. Part of how this is done can be seen by imagining a system with W accessible ways and partitioning it into parts 1 and 2. Then W = W1*W2, as for each arrangement in 1 every accessible arrangement in 2 is possible and vice versa; but it is far more convenient to have an additive measure, i.e. we need to go to logs. The constant of proportionality, k, is the famous Boltzmann constant, in effect the universal gas constant, R, on a per-molecule basis; i.e. we divide R by the Avogadro number, N_A, to get k = R/N_A. The two approaches to entropy, by Clausius and by Boltzmann, of course correspond. In real-world systems of any significant scale, the relative statistical weights are usually so disproportionate that the classical observation that entropy naturally tends to increase is readily apparent.)
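The additivity that motivates the move to logs, and the value of k = R/N_A, can be checked directly (the two weights below are arbitrary stand-ins):

```python
import math

# S = k ln W turns the multiplicative weight W = W1 * W2 of a
# partitioned system into an additive entropy: S = S1 + S2.
k = 8.314462618 / 6.02214076e23   # k = R / N_A, in J/K
W1, W2 = 63504, 44100             # arbitrary statistical weights

S_whole = k * math.log(W1 * W2)
S_parts = k * math.log(W1) + k * math.log(W2)
assert math.isclose(S_whole, S_parts)

print(k)  # ~1.380649e-23 J/K, the Boltzmann constant
```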

Third, the diffusion model is a LINEAR space, a string structure.

This allows us to look at strings thermodynamically and statistically. Without losing force on the basic issue, let us consider the simplest case, equiprobability of position, with an alphabet of two possibilities [B vs. W balls]. Here, we see that special arrangements that may reflect strong order or organisation are vastly rarer in the set of possibilities than those near the peak of the distribution. For 1,000 balls, half B and half W, the peak will obviously lie where the balls are spread out in such a way that the next ball has 50-50 odds of being B or W: maximum uncertainty.

Now, let us follow L K Nash and Mandl, and go to a string of 1,000 coins or a string of paramagnetic elements in a weak field. (The latter demonstrates that the coin-string model is physically relevant.) [Insert illustration:]

We now have binary elements and a binomial distribution over a field of binary digits, so we know there are 1.07*10^301 possibilities from 000 . . . 0 to 111 . . . 1 inclusive. But if we cluster possibilities by the proportions that are H and T, we see that there is a sharp peak near 500:500, and that by contrast there are far fewer possibilities as we approach 1,000:0 or 0:1,000. At the extremes, as the coins are identical, there is but one way each. Likewise for alternating H, T, a special arrangement, there are just two ways: H first or T first. [Insert illustration:]
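These counts are exact integers, easily reproduced:

```python
from math import comb

N = 1000
total = 2 ** N              # ~1.07 * 10^301 configurations in all
peak = comb(N, N // 2)      # weight of the 500:500 macrostate

print(f"{total:.2e}")       # 1.07e+301
print(peak / total)         # ~0.025: the modal split alone takes ~2.5% of the space
# The extremes are maximally special: all-H and all-T have one arrangement
# each, and strict alternation HTHT... has just two (H first or T first).
```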

[Cf. WUWT discussion, courtesy Google Images, here. Note the exercise in view: if the coin flips land in the band 4k – 6k H, you win $1 mn; outside the band, you forfeit your life. Would you take the bet? Why or why not? (Hint: it is not unrelated that the fluctuations around the mean are essentially unobservable beyond +/- 200 on 5,000, where the range is from 0 to 10,000 and SQRT(10,000) = 100. We here see a more or less empirical illustration of reasonably expected scales of fluctuations for a case like the binomial distribution. If N is of order 10^22, a typical number for molecules, we are seeing sqrt N of order 10^11, which will be unobservably sharp. [Cf. discussion here.])]
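On the fluctuation scale: for n = 10,000 fair flips the exact binomial standard deviation is sqrt(n p q) = 50, so the sqrt(10,000) = 100 cited is the right order, and the 4,000 – 6,000 band of the bet sits a full 20 sigma out. A sketch using the normal approximation:

```python
import math

# Normal approximation to Binomial(n = 10_000, p = 1/2).
n, p = 10_000, 0.5
mu = n * p                          # 5,000 heads expected
sigma = math.sqrt(n * p * (1 - p))  # 50: the sqrt-N fluctuation scale

# Chance of landing OUTSIDE the 4,000..6,000 band: 20 sigma out.
z = (6000 - mu) / sigma             # 20.0
p_outside = math.erfc(z / math.sqrt(2))   # two-sided normal tail
print(sigma, z, p_outside)          # p_outside is below 1e-80
```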

We now see how order accords with compressibility of description — more or less, algorithmic compressibility. To pick out one of the “typical” values near the peak, we essentially need to quote the string in full, while for the extremes we need only give a brief description. This was Orgel’s point on info capacity being correlated with length of description string. Now, as we know, Trevors and Abel in 2004 pointed out that code-bearing strings [or aperiodic functional ones otherwise] will resist compressibility, but will be more compressible than the utterly flat random cases. This defines an island of function. [Insert, illustration:]
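That ordering (ordered strings most compressible, functional redundancy-bearing strings intermediate, flat-random strings least) can be illustrated with an off-the-shelf compressor; the repeated English sentence below is only a crude stand-in for a functional string:

```python
import random
import zlib

random.seed(1)  # reproducible

ordered = b"H" * 1000                                        # maximal order
functional = (b"the quick brown fox jumps over the lazy dog. " * 23)[:1000]
flat_random = bytes(random.getrandbits(8) for _ in range(1000))

sizes = {name: len(zlib.compress(s))
         for name, s in [("ordered", ordered),
                         ("functional", functional),
                         ("random", flat_random)]}
print(sizes)
# ordered shrinks to a handful of bytes, the redundant text to a few dozen,
# while the flat-random bytes do not compress at all.
assert sizes["ordered"] < sizes["functional"] < sizes["random"]
```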

And we see that this is because any code or functionally specific string will naturally have some redundancy in it; there will not be a 50-50 even distribution in all cases.

There is a statistically dominant cluster, utterly overwhelmingly dominant, near 500-500 in no particular pattern or organised functional message-bearing framework. [As is illustrated.]

We can now come back to the entropy view: the peak is the high-entropy, low-information case. That is, if we imagine some nano-bots that can rearrange coin patterns, then if they act at random, they will almost certainly produce the near-500:500, no-particular-order result. But now, if we instruct them with a short algorithm, they can construct all H or all T, or we can give them instructions to do HT-HT . . . etc.
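The contrast in description length is easy to make concrete: the special states come from one-line rules, while a typical near-peak string has no description much shorter than quoting it in full. A minimal sketch:

```python
# One-line "instructions" suffice for the special states:
def all_heads(n):
    return "H" * n

def alternating(n):
    return ("HT" * n)[:n]

print(all_heads(8))      # HHHHHHHH
print(alternating(9))    # HTHTHTHTH
# By contrast, steering the bots to one specific typical string means
# handing over all 1,000 symbols verbatim: the description IS the string.
```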

Or, we can feed in ASCII code or some other description language based information.

It is conceivable that the robots could generate such codes by chance, but the degree of isolation in the space of possibilities is such that effectively these are unobservable on the scale of the observed cosmos: a blind random search of the space of possibilities will be maximally unlikely to hit on the highly informational patterns.

[Insert, on islands of function in a search space:]

[ –> Notice, the illustration of the implications of cumulative, step-by-step causally connected stages driven by a random walk in a config space dominated by seas of non-function, such that hill-climbing reinforcement cannot kick in until one hits an island of function. Where, such are quite rare in the overall space. Make the search hard enough and islands of function are unobservable. For Sol-system scale, that kicks in at ~ 500 bits, and for the observable cosmos, 1,000.]

It does not matter if we were to boost the robot energy levels and speed them up to a maximum reasonable speed, that of molecular interactions. Nor does it matter if in effect each of the 10^80 atoms of the observed cosmos were given a coin string and a robot, so the strings could be flipped and read 10^12 – 10^14 times per second for 10^17 s. That is the sort of gamut we have available.
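A quick check of that gamut, on those same generous assumptions (10^80 atoms, 10^14 flip-and-read operations per second, 10^17 s):

```python
# Generous upper bound on observations by the whole observed cosmos.
trials = 10**80 * 10**14 * 10**17   # 10^111 total flip-and-read events

space_500 = 2**500                  # ~3.3e150 configs at the 500-bit threshold
space_1000 = 2**1000                # ~1.07e301 configs at the 1,000-bit threshold

print(trials / space_500)           # ~3e-40 of the 500-bit space sampled
print(trials / space_1000)          # a vanishingly smaller fraction still
```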

We can confidently infer that if we see a string of 1,000 coins in a meaningful ordered or organised pattern, they were put that way by intelligently directed work, based on information. By direct import of the statistical thermodynamic reasoning we have been using.

That is, we here see the basis for the confident and reliable inference to design on seeing FSCO/I.

Going further, we can see that codes include descriptions of functional organisation, as per AutoCAD etc., and that such can specify any 3-d organisation of components that is functional. Where also, we can readily follow the instructions using a von Neumann universal constructor facility [make it self-replicating, too, and we are done] and test for observable function. Vary the instructions at random, and we soon enough see where the limits of an island of function are, as function ceases.

Alternatively, we can start with a random string, and then allow our nanobots to assemble. If something works, we preserve and allow further incremental, random change.

That is, we have — as a thought exercise — an evolutionary informatics model.

And, we have seen how discussion on strings is without loss of generality, as strings can describe anything else of relevance, and such descriptions can be actualised as 3-d entities through a universal constructor. Which can be self-replicating, thus the test extends to evolution. (And yes, this also points to the issue of the informational description of the universal constructor and self-replication facility as the first threshold to be passed. Nor is this just a mind-game; the living cell is exactly this sort of thing, though perhaps not yet a full-bore universal constructor. [Give us a couple of hundred years to figure that out and we will likely have nanobot swarms that will be just that!])

The inference at this point is obvious: by the utter dominance of non-functional configurations, 500 – 1,000 bits of information is a generous estimate of the upper limit for blind mechanisms to find functional forms.

This then extends directly into looking at the genome and to the string length of proteins as an index of find-ability, thence the evaluation of plausibility of origin of life and body plan level macro-evo models.

Origin of life by blind chance and/or mechanical necessity is utterly implausible. Minimal genomes are credibly 100 – 1,000 k bases, corresponding to about 100 times the size of the upper threshold.

Origin of major body plans, similarly, reasonably requires some 10 – 100+ mn new bases. We are now 10 – 100 thousand times the threshold.
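Putting those multiples on a common footing (taking 2 bits of information capacity per base for the 4-letter alphabet, and the lower ends of the ranges above; the figures are order-of-magnitude estimates only):

```python
# Order-of-magnitude check against the 1,000-bit threshold,
# at 2 bits of information capacity per base (4-letter alphabet).
threshold_bits = 1_000

minimal_genome_bases = 100_000          # lower end of 100 - 1,000 k bases
ool_ratio = minimal_genome_bases * 2 / threshold_bits
print(ool_ratio)                        # 200.0: ~10^2 times the threshold

body_plan_bases = 10_000_000            # lower end of 10 - 100+ mn bases
bp_ratio = body_plan_bases * 2 / threshold_bits
print(bp_ratio)                         # 20000.0: ~10^4 times the threshold
```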

Inference: the FSCO/I in first cell based life is there by design. Likewise that in novel body plans up to our own.

And, such is rooted in the informational context of such life.>>

___________

I think this gives us a basis for further discussion. **END**