Over the past several days, there has been considerable debate at UD on thermodynamics, information, order vs disorder etc. In a clarifying note to Mung (who was in turn responding to Sal C) I have commented as follows. (My note also follows up from an earlier note that was put up early in the life of the recent exchanges here, and a much earlier ID Foundations series post on counter-flow and the thermodynamics FSCO/I link.)

I think it convenient to scoop the below out for record and reference, as across time comments in threads are much harder to find than original posts:

_____________________

>>One more time [cf. 56 above, which clips elsewhere . . . ], let me clip Shannon, 1950/1:

The entropy is a statistical parameter which measures, in a certain sense, how much information is produced on the average for each letter of a text in the language. If the language is translated into binary digits (0 or 1) in the most efficient way, the entropy is the average number of binary digits required per letter of the original language. The redundancy, on the other hand, measures the amount of constraint imposed on a text in the language due to its statistical structure, e.g., in English the high frequency of the letter E, the strong tendency of H to follow T or of V to follow Q [sic; probably V is a typo for U]. It was estimated that when statistical effects extending over not more than eight letters are considered the entropy is roughly 2.3 bits per letter, the redundancy about 50 per cent.

Going back to my longstanding, always linked note, which I have clipped several times over the past few days, here on is how we measure info and avg info per symbol:

To quantify the above definition of what is perhaps best descriptively termed information-carrying capacity, but has long been simply termed information (in the “Shannon sense” – never mind his disclaimers . . .), let us consider a source that emits symbols from a vocabulary: s1,s2, s3, . . . sn, with probabilities p1, p2, p3, . . . pn. That is, in a “typical” long string of symbols, of size M [say this web page], the average number that are some sj, J, will be such that the ratio J/M –> pj, and in the limit attains equality. We term pj the a priori — before the fact — probability of symbol sj. Then, when a receiver detects sj, the question arises as to whether this was sent. [That is, the mixing in of noise means that received messages are prone to misidentification.] If on average, sj will be detected correctly a fraction, dj of the time, the a posteriori — after the fact — probability of sj is by a similar calculation, dj. So, we now define the information content of symbol sj as, in effect how much it surprises us on average when it shows up in our receiver:

I = log [dj/pj], in bits [if the log is base 2, log2] . . . Eqn 1

This immediately means that the question of receiving information arises AFTER an apparent symbol sj has been detected and decoded. That is, the issue of information inherently implies an inference to having received an intentional signal in the face of the possibility that noise could be present. Second, logs are used in the definition of I, as they give an additive property: for, the amount of information in independent signals, si and sj, using the above definition, is such that:

I total = Ii + Ij . . . Eqn 2

For example, assume that dj for the moment is 1, i.e. we have a noiseless channel so what is transmitted is just what is received. Then, the information in sj is:

I = log [1/pj] = – log pj . . . Eqn 3

This case illustrates the additive property as well, assuming that symbols si and sj are independent. That means that the probability of receiving both messages is the product of the probability of the individual messages (pi *pj); so:

Itot = log1/(pi *pj) = [-log pi] + [-log pj] = Ii + Ij . . . Eqn 4

So if there are two symbols, say 1 and 0, and each has probability 0.5, then for each, I is – log [1/2], on a base of 2, which is 1 bit. (If the symbols were not equiprobable, the less probable binary digit-state would convey more than, and the more probable, less than, one bit of information. Moving over to English text, we can easily see that E is as a rule far more probable than X, and that Q is most often followed by U. So, X conveys more information than E, and U conveys very little, though it is useful as redundancy, which gives us a chance to catch errors and fix them: if we see “wueen” it is most likely to have been “queen.”)
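As an aside for those who want to check the arithmetic, Eqns 3 and 4 can be exercised in a few lines of Python (the probabilities below are made-up illustrative values, not drawn from any real language):

```python
import math

def info_bits(p):
    """Self-information I = -log2 p, in bits (Eqn 3, noiseless channel)."""
    return -math.log2(p)

# Two equiprobable symbols: 1 bit each, as in the text
print(info_bits(0.5))    # 1.0

# The rarer symbol carries more information
print(info_bits(0.25))   # 2.0

# Additivity for independent symbols (Eqn 4):
p_i, p_j = 0.25, 0.125
assert math.isclose(info_bits(p_i * p_j), info_bits(p_i) + info_bits(p_j))
```

Rarity driving information content is exactly the E-vs-X point above.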

Further to this,

we may average the information per symbol in the communication system as follows (giving it in terms of –H to make the additive relationships clearer):

– H = p1 log p1 + p2 log p2 + . . . + pn log pn

or, H = – SUM [pi log pi] . . . Eqn 5

H, the average information per symbol transmitted [usually, measured as: bits/symbol], is often termed the Entropy; first, historically, because it resembles one of the expressions for entropy in statistical thermodynamics. As Connor notes: “it is often referred to as the entropy of the source.” [p. 81, emphasis added.] Also, while this is a somewhat controversial view in Physics, as is briefly discussed in Appendix 1 below, there is in fact an informational interpretation of thermodynamics that shows that informational and thermodynamic entropy can be linked conceptually as well as in mere mathematical form . . .
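For concreteness, Eqn 5 can be computed directly; a minimal sketch, using illustrative probability sets:

```python
import math

def H(probs):
    """Average information per symbol, H = -SUM pi*log2 pi (Eqn 5), in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Fair binary source: H = 1 bit/symbol
print(H([0.5, 0.5]))     # 1.0

# A biased source carries less average info per symbol
print(H([0.9, 0.1]))     # ~0.469

# An equiprobable n-symbol source maximises H at log2 n
print(H([0.25] * 4))     # 2.0
```

Shannon's ~2.3 bits/letter for English (quoted above) is this H, with the skewed letter statistics pulling it well below log2 26 ≈ 4.7.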

What this last refers to is the Gibbs formulation of entropy for statistical mechanics, and its implications when the relationship between probability and information is brought to bear in light of the Macro-micro views of a body of matter. That is, when we have a body, we can characterise its state per lab-level thermodynamically significant variables, that are reflective of many possible ultramicroscopic states of constituent particles.

Thus, clipping again from my always linked discussion that uses Robertson’s Statistical Thermophysics, Ch. 1 (and do recall my strong recommendation that we all acquire and read L K Nash’s Elements of Statistical Thermodynamics as introductory reading):

Summarising Harry Robertson’s Statistical Thermophysics (Prentice-Hall International, 1993) . . . .

For, as he astutely observes on pp. vii – viii:

. . . the standard assertion that molecular chaos exists is nothing more than a poorly disguised admission of ignorance, or lack of detailed information about the dynamic state of a system . . . . If I am able to perceive order, I may be able to use it to extract work from the system, but if I am unaware of internal correlations, I cannot use them for macroscopic dynamical purposes. On this basis, I shall distinguish heat from work, and thermal energy from other forms . . .

And, in more detail (pp. 3 – 6, 7, 36; cf. Appendix 1 below for a more detailed development of thermodynamics issues and their tie-in with the inference to design; also see recent ArXiv papers by Duncan and Semura here and here):

. . . It has long been recognized that the assignment of probabilities to a set represents information, and that some probability sets represent more information than others . . . if one of the probabilities say p2 is unity and therefore the others are zero, then we know that the outcome of the experiment . . . will give [event] y2. Thus we have complete information . . . if we have no basis . . . for believing that event yi is more or less likely than any other [we] have the least possible information about the outcome of the experiment . . . . A remarkably simple and clear analysis by Shannon [1948] has provided us with a quantitative measure of the uncertainty, or missing pertinent information, inherent in a set of probabilities [NB: i.e. a probability different from 1 or 0 should be seen as, in part, an index of ignorance] . . . .

[deriving informational entropy, cf. discussions here, here, here, here and here; also Sarfati’s discussion of debates and the issue of open systems here . . . ]

H({pi}) = – C [SUM over i] pi*ln pi, [. . . “my” Eqn 6][–> This is essentially the same as Gibbs Entropy, once C is properly interpreted and the pi’s relate to the probabilities of microstates consistent with the given lab-observable macrostate of a system at a given Temp, with a volume V, under pressure P, degree of magnetisation, etc etc . . . ]

[where [SUM over i] pi = 1, and we can define also parameters alpha and beta such that: (1) pi = e^-[alpha + beta*yi]; (2) exp [alpha] = [SUM over i](exp – beta*yi) = Z [Z being in effect the partition function across microstates, the “Holy Grail” of statistical thermodynamics]. . . .

[H], called the information entropy, . . . correspond[s] to the thermodynamic entropy [i.e. s, where also it was shown by Boltzmann that s = k ln w], with C = k, the Boltzmann constant, and yi an energy level, usually ei, while [BETA] becomes 1/kT, with T the thermodynamic temperature . . .

A thermodynamic system is characterized by a microscopic structure that is not observed in detail . . . We attempt to develop a theoretical description of the macroscopic properties in terms of its underlying microscopic properties, which are not precisely known. We attempt to assign probabilities to the various microscopic states . . . based on a few . . . macroscopic observations that can be related to averages of microscopic parameters. Evidently the problem that we attempt to solve in statistical thermophysics is exactly the one just treated in terms of information theory. It should not be surprising, then, that the uncertainty of information theory becomes a thermodynamic variable when used in proper context . . . . Jaynes’s [summary rebuttal to a typical objection] is “. . . The entropy of a thermodynamic system is a measure of the degree of ignorance of a person whose sole knowledge about its microstate consists of the values of the macroscopic quantities . . . which define its thermodynamic state. This is a perfectly ‘objective’ quantity . . . it is a function of [those variables] and does not depend on anybody’s personality. There is no reason why it cannot be measured in the laboratory.” . . . . [pp. 3 – 6, 7, 36; replacing Robertson’s use of S for Informational Entropy with the more standard H.]
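The relations (1) and (2) quoted above, with yi an energy level and beta = 1/kT, can be checked numerically. The two-level system below is my own made-up illustration, not Robertson's example:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def boltzmann_probs(energies, T):
    """p_i = exp(-e_i / kT) / Z, i.e. relations (1)-(2) above with beta = 1/kT."""
    beta = 1.0 / (k_B * T)
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)              # the partition function
    return [w / Z for w in weights]

# Illustrative two-level system with energy gap ~ kT at 300 K
levels = [0.0, k_B * 300.0]
p = boltzmann_probs(levels, 300.0)
assert abs(sum(p) - 1.0) < 1e-12  # probabilities are normalised by Z

# Informational entropy H (in nats) and thermodynamic S = k_B * H
H = -sum(pi * math.log(pi) for pi in p)
S = k_B * H
print(p, S)
```

The lower level is more probable, as expected, and S comes out as k_B times the missing information about which level is occupied.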

As is discussed briefly in Appendix 1, Thaxton, Bradley and Olsen [TBO], following Brillouin et al, in the 1984 foundational work for the modern Design Theory, The Mystery of Life’s Origin [TMLO], exploit this information-entropy link, through the idea of moving from a random to a known microscopic configuration in the creation of the bio-functional polymers of life, and then — again following Brillouin — identify a quantitative information metric for the information of polymer molecules. For, in moving from a random to a functional molecule, we have in effect an objective, observable increment in information about the molecule. This leads to energy constraints, thence to a calculable concentration of such molecules in suggested, generously “plausible” primordial “soups.” In effect, so unfavourable is the resulting thermodynamic balance, that the concentrations of the individual functional molecules in such a prebiotic soup are arguably so small as to be negligibly different from zero on a planet-wide scale.

By many orders of magnitude, we don’t get to even one molecule each of the required polymers per planet, much less bringing them together in the required proximity for them to work together as the molecular machinery of life. The linked chapter gives the details. More modern analyses [e.g. Trevors and Abel, here and here], however, tend to speak directly in terms of information and probabilities rather than the more arcane world of classical and statistical thermodynamics . . .

Now, of course, as Wiki summarises, the classic formulation of the Gibbs entropy is:

The macroscopic state of the system is defined by a distribution on the microstates that are accessible to a system in the course of its thermal fluctuations. So the entropy is defined over two different levels of description of the given system. The entropy is given by the Gibbs entropy formula, named after J. Willard Gibbs. For a classical system (i.e., a collection of classical particles) with a discrete set of microstates, if E_i is the energy of microstate i, and p_i is the probability that it occurs during the system’s fluctuations, then the entropy of the system is:

S = -k_B * [sum_i] p_i * ln p_i

This definition remains valid even when the system is far away from equilibrium. Other definitions assume that the system is in thermal equilibrium, either as an isolated system, or as a system in exchange with its surroundings. The set of microstates on which the sum is to be done is called a statistical ensemble. Each statistical ensemble (micro-canonical, canonical, grand-canonical, etc.) describes a different configuration of the system’s exchanges with the outside, from an isolated system to a system that can exchange one more quantity with a reservoir, like energy, volume or molecules. In every ensemble, the equilibrium configuration of the system is dictated by the maximization of the entropy of the union of the system and its reservoir, according to the second law of thermodynamics (see the statistical mechanics article).

Neglecting correlations between the different possible states (or, more generally, neglecting statistical dependencies between states) will lead to an overestimate of the entropy[1]. These correlations occur in systems of interacting particles, that is, in all systems more complex than an ideal gas.

This S is almost universally called simply the entropy. It can also be called the statistical entropy or the thermodynamic entropy without changing the meaning.

Note the above expression of the statistical entropy is a discretized version of Shannon entropy. The von Neumann entropy formula is an extension of the Gibbs entropy formula to the quantum mechanical case. It has been shown that the Gibbs entropy is numerically equal to the experimental entropy[2]: dS = delta_Q/T . . .

Looks to me that this is one time Wiki has it just about dead right. Let’s deduce a relationship that shows physical meaning in info terms, where (- log p_i) is an info metric, I_i, here for microstate i, and noting that a sum over i of p_i * log p_i is in effect a frequency/probability weighted average or the expected value of the log p_i expression, and also moving away from natural logs (ln) to generic logs:

S_Gibbs = -k_B * [sum_i] p_i * log p_i

But, I_i = – log p_i

So, S_Gibbs = k_B * [sum_i] p_i * I_i

i.e. S_Gibbs is a constant times the average information required to specify the particular microstate of the system, given its macrostate, the MmIG (macro-micro info gap).
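That identity, S_Gibbs = k_B times the probability-weighted average of I_i, is easy to verify numerically; the four-microstate distribution below is a made-up illustration:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def gibbs_entropy(probs):
    """S = -k_B * SUM p_i ln p_i (the Gibbs formula above)."""
    return -k_B * sum(p * math.log(p) for p in probs if p > 0)

def avg_missing_info(probs):
    """Average info (nats) to specify the microstate: SUM p_i * I_i, with I_i = -ln p_i."""
    return sum(p * (-math.log(p)) for p in probs if p > 0)

# Made-up four-microstate distribution consistent with some macrostate
p = [0.4, 0.3, 0.2, 0.1]
assert math.isclose(gibbs_entropy(p), k_B * avg_missing_info(p))

# A fully specified microstate (one p_i = 1) has zero entropy: no missing info
print(gibbs_entropy([1.0]))   # 0.0
print(gibbs_entropy(p))
```

The two functions agree term by term: the MmIG reading of Gibbs entropy is just the expected value of the microstate information.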

Or, as Wiki also says elsewhere:

At an everyday practical level the links between information entropy and thermodynamic entropy are not close. Physicists and chemists are apt to be more interested in changes in entropy as a system spontaneously evolves away from its initial conditions, in accordance with the second law of thermodynamics, rather than an unchanging probability distribution. And, as the numerical smallness of Boltzmann’s constant kB indicates, the changes in S / kB for even minute amounts of substances in chemical and physical processes represent amounts of entropy which are so large as to be right off the scale compared to anything seen in data compression or signal processing.

But, at a multidisciplinary level, connections can be made between thermodynamic and informational entropy, although it took many years in the development of the theories of statistical mechanics and information theory to make the relationship fully apparent. In fact, in the view of Jaynes (1957), thermodynamics should be seen as an application of Shannon’s information theory:

the thermodynamic entropy is interpreted as being an estimate of the amount of further Shannon information needed to define the detailed microscopic state of the system, that remains uncommunicated by a description solely in terms of the macroscopic variables of classical thermodynamics. For example, adding heat to a system increases its thermodynamic entropy because it increases the number of possible microscopic states that it could be in, thus making any complete state description longer. (See article: maximum entropy thermodynamics. [Also, another article remarks: >> in the words of G. N. Lewis writing about chemical entropy in 1930, “Gain in entropy always means loss of information, and nothing more” . . . in the discrete case using base two logarithms, the reduced Gibbs entropy is equal to the minimum number of yes/no questions that need to be answered in order to fully specify the microstate, given that we know the macrostate. >>]) Maxwell’s demon can (hypothetically) reduce the thermodynamic entropy of a system by using information about the states of individual molecules; but, as Landauer (from 1961) and co-workers have shown, to function the demon himself must increase thermodynamic entropy in the process, by at least the amount of Shannon information he proposes to first acquire and store; and so the total entropy does not decrease (which resolves the paradox).

So, immediately, the use of “entropy” in the Shannon context to denote not H but N*H (where N is the number of symbols, thus the number of step by step states involved in emitting them) is an error of loose reference.

Similarly, by exploiting parallels in formulation and insights into the macro-micro distinction in thermodynamics, we can develop a reasonable and empirically supportable physical account of how Shannon information is a component of the Gibbs entropy narrative. Where also Gibbs subsumes the Boltzmann formulation and onward links to the lab-measurable quantity. (Nash has a useful, relatively lucid — none of this topic is straightforward — discussion on that.)

Going beyond, once the bridge is there between information and entropy, it is there. It is not going away, regardless of how inconvenient it may be to some schools of thought.

We can easily see that, for example, information is expressed in the configuration of a string, Z, of elements z1 -z2 . . . zN in accordance with a given protocol of assignment rules and interpretation & action rules etc.

This is without loss of generality, as AutoCAD etc. show us: using the nodes and arcs representation and a list of structured strings that records it, essentially any object can be described in terms of a suitably configured string or collection of strings.

So now, we can see that string Z (with each zi possibly taking b discrete states) may represent an island of function that expresses functionally specific complex organisation and associated information. Because of specificity to achieve and keep function, leading to a demand for matching, co-ordinated values of zi along the string, that string has relatively few of the b^N possibilities for N elements with b possible states being permissible. We are at isolated islands of specific function, i.e. cases E from a zone of function T in a space of possibilities W.

(BTW, once b^N exceeds the config space of 500 bits on the gamut of our solar system, or 1,000 bits on the gamut of our observable cosmos, that brings to bear all the needle in the haystack, monkeys at keyboards analysis that has been repeatedly brought forth to show why FSCO/I is a useful sign of IDOW — intelligently directed organising work — as empirically credible cause.)
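To see the scale of such config spaces, a quick sketch; the monomer counts are illustrative, chosen to land on the 500 and 1,000 bit thresholds just mentioned:

```python
import math

def config_space_bits(b, N):
    """Bits needed to specify one of b^N configurations: N * log2 b."""
    return N * math.log2(b)

# A 4-state string (e.g. DNA bases) of 250 elements hits the 500-bit threshold
print(config_space_bits(4, 250))    # 500.0

# A 20-state string (e.g. protein residues) of 232 elements passes 1,000 bits
print(config_space_bits(20, 232))   # ~1002.7

# 2^500 written out has 151 decimal digits, i.e. W is of order 10^150
print(len(str(2**500)))             # 151
```

This is just counting; the needle-in-haystack argument then turns on comparing W against available search resources.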

We see then that we have a complex string to deal with, with sharp restrictions on possible configs, that are evident from observable function, relative to the general possibility of W = b^N possibilities. *Z is in a highly informational, tightly constrained state that comes from a special zone specifiable on macro-level observable function (without actually observing Z directly). That constraint on degrees of freedom contingent on functional, complex organisation, is tantamount to saying that a highly informational state is a low entropy one, in the Gibbs sense.*

Going back to the expression, *comparatively speaking there is not a lot of MISSING micro-level info to be specified*, i.e. **simply by knowing the fact of complex specified information-rich function, we know that we are in a highly restricted special Zone T in W**. This immediately applies to R/DNA and proteins, which of course use string structures. It also applies to the complex 3-D arrangement of components in the cell, which are organised in ways that foster function.

And of course it applies to the 747 in a flyable condition.

Such easily explains why a tornado passing through a junkyard in Seattle will not credibly assemble a 747 from parts it hits, and it explains why the raw energy and forces of a tornado hitting a formerly flyable 747 and tearing it apart would render its resulting condition much less specified per function, with predictable loss of function.

We will also see that this analysis assumes the functional possibilities of a mass of Al, but is focussed on the issue of functional config and gives it specific thermodynamics and information theory context. (Where also, algebraic modelling is a valid mathematical analysis.)

I trust this proves helpful. >>

I also post scripted:

>> The most probable or equilibrium cluster of microstates consistent with a given macrostate, is the cluster that has the least information about it, and the most freedom of variation of mass and energy distribution at micro level. This high entropy state-cluster is strongly correlated with high levels of disorder, for reasons connected to the functionality constraints just above. And in fact — never mind those who are objecting and pretending that this is not so — it is widely known in physics that entropy is a metric of disorder, some would say it quantifies it and gives it structured physical expression in light of energy and randomness or information gap considerations. >>

_____________________

So, I think it reasonable to associate higher and higher entropy states of a given body or collection of material objects with increasingly disordered configurations, and that there is a bridge between Shannon Info and the H-metric of avg info per symbol, and the Gibbs entropy formulation, which is more fundamental than that which is used in classical formulations of thermodynamics. In turn, this is connected to the issues of how FSCO/I is an index not merely of order but organisation, and how such an information-rich state is one in which there is comparatively low uncertainty about the zone of possible configs, i.e. it is a low entropy state relative to the sea of non-functional possible configs of the same components.

As I have said, this area is quite technical, and I strongly recommend the L K Nash book, Elements of Statistical Thermodynamics, as a well put together introduction. In fact, its treatment of the Boltzmann distribution alone (with the associated drawing out of quantities, functions and relationships in physical context) is well worth the price. I don’t know if the “classical books” publisher, Dover, could be induced to do an ebook edition. (U/D Sep 10: Thanks to KD, I suggest Fitzpatrick’s freely downloadable course notes, Thermodynamics and Statistical Mechanics, here. Nash’s presentation is still the more intuitively clear.)

Harry S Robertson’s Statistical Thermophysics is a good modern presentation on the informational school of thought on thermodynamics, but can hardly be said to be an introductory level treatment for the first time reader on this topic, so work through Nash first to get your grounding.

I trust the above will help us all clarify our discussion on this important though admittedly difficult topic. **END**

F/N: The above is a summary response on the question of the bridge between information (thus configuration) and entropy in the thermodynamic sense, in response to attempts to sever the bridge. Note the onward debate points I respond to in my comment here by linking the above. KF

F/N 2: Enthalpy — and sorry, the same symbol is being used again with a different context of meaning,

H = U + PV

Where it can be shown that

dH (S, p) = TdS + Vdp [+ chem potential terms etc not relevant here]

So, if dp = 0:

dH = T*dS

Now, consider a 1 kg block of ice at 0 deg C. Let it melt to form water at the same temp, under the same usual pressure conds etc.

Latent heat of fusion is 334 * 10^3 J/kg

Where has this gone, as dH is also d’Q, increment in heat content under that sort of condition?

T being constant (and other issues being irrelevant for our purposes), dH has effectively gone into increment in entropy to melt the ice.

That is into the loss of order in the crystalline form, as it melts. Since ice is at 273.15 K, we have just added:

Delta_S = 334 * 10^3 J / 273.15 K = 1.22 * 10^3 J/K of entropy

This is an example of the [in this case calculable] connexion between entropy increase and disorder also reduced specificity on the microstate of a system; here specifically linked to configuration.
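For those who wish to check it, the melt calculation reduces to three lines:

```python
L_fusion = 334e3   # latent heat of fusion of water ice, J/kg
T_melt = 273.15    # melting point at standard pressure, K
m = 1.0            # mass of the ice block, kg

# dS = d'Q / T at constant temperature (the dH = T*dS case above with dp = 0)
delta_S = m * L_fusion / T_melt
print(delta_S)     # ~1.22e3 J/K
```

All the added heat goes into entropy of the melt, none into raising the temperature.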

KF

But what *is* entropy in the thermodynamic sense? Can we even answer that using classical thermodynamics?

(1) What is entropy?

http://www.ariehbennaim.com/de.....-Pref-.pdf

Mung:

That is the problem,

classical thermodynamics does not lift the hood/bonnet on what is going on; it is about macro-variables: “I wuk, but why?” Hence, stat thermo-D, thence Boltzmann and onward the Gibbs-Shannon issue. (Nash is really good, about as close to “simple” as this stuff gets. Then, you can begin to make sense of the rest.)

On that bridge, entropy measures the MmIG: the macro-micro info gap to specify actual microstate given macrostate.

That gap/freedom means that we cannot exploit detailed knowledge of microstates to get work out of say thermal energy.

That is what Maxwell’s demon is about; by knowing actual particle behaviour he can create a hot and cold compartment, undoing diffusion, and allowing free work:

|| .*.*.* || –> Demon gates –> || * * * (hot) | . . . (cold) ||

The demon spots hot (fast) molecules going left and cold (slow) ones going right, and so effects a separation.

Couple a heat engine to the two sides and, voila, we seem to have a free-energy machine!

Of course the trick is that there is a cost to know the hot/cold and to gate the transfers, so we are back to paying for what we get, plus paying friction etc overheads.
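A toy simulation makes the demon's trick, and its hidden information cost, concrete; the speed distribution and the gating rule below are made-up illustrative choices:

```python
import random

random.seed(1)

# Toy gas: particle "speeds" drawn from a single distribution (arbitrary units)
speeds = [random.expovariate(1.0) for _ in range(10_000)]
mean_all = sum(speeds) / len(speeds)

# The demon gates fast particles to the hot side, slow ones to the cold side
hot = [v for v in speeds if v > mean_all]
cold = [v for v in speeds if v <= mean_all]

mean_hot = sum(hot) / len(hot)
mean_cold = sum(cold) / len(cold)
print(mean_hot > mean_all > mean_cold)   # True: a temperature difference "for free"

# ...except that answering "is v > mean_all?" for each particle is exactly the
# per-particle information acquisition that Landauer's principle makes us pay for.
```

The separation works only because the demon spends (at least) one yes/no question per particle, which is the MmIG being closed at a cost.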

Hence the significance of MmIG.

KF

Entropy Is Not Disorder

Mung:

Entropy is sufficiently frequently associated with the breakdown of order that it has aptly been described as time’s arrow. That is, the spontaneous direction of change of accessible microstates is towards predominant groups that are maximally unconstrained.

And, Gell-Mann is giving much the same answer as has been repeatedly posed.

The Gibbs metric can be seen as measuring the set of yes/no questions [= bits] required on average to specify the particular microstate, given a macrostate. AutoCAD stands in mute testimony to the way that such a structured string of Y/N questions can describe the config of a system.
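The yes/no-question picture can be made concrete with a binary-search sketch: each question halves the candidate set of microstates, so W equiprobable states need about log2 W questions. (The code below is an illustration of the counting, not a claim about any particular physical system.)

```python
import math

def questions_needed(W):
    """Minimum yes/no questions to pin down one of W equiprobable microstates."""
    return math.ceil(math.log2(W))

print(questions_needed(8))        # 3: halve the candidate set three times
print(questions_needed(2**100))   # 100

# Binary search as the question-asking procedure:
def locate(state, W):
    """Count the yes/no questions needed to find `state` among W candidates."""
    lo, hi, asked = 0, W - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        asked += 1                # "is the state in the upper half?"
        if state > mid:
            lo = mid + 1
        else:
            hi = mid
    return asked

print(locate(5, 8))   # 3 questions, matching questions_needed(8)
```

Each question is one bit, which is why the Gibbs metric in base-2 form reads directly as a question count.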

And there is no reason for us to think that scaling up the system to visible size changes the fundamental issue of config spaces of possible arrangements of components, and how rare in the space (and usually, how un-obvious . . . as in a criterion of intellectual property these days) is a functionally specific complex state.

That is why a tornado hitting an airport in Seattle would be liable to destroy a 747 sitting there, and would not be at all likely to — by passing on to hitting the Boeing plant — assemble another from components lying around.

That difference is a key matter that the analysis behind the second law teaches us, and we should not allow ourselves to lose sight of it, whatever odd cases there are that show that entropy may not always be directly tied to disordering.

BTW, your linked case is fudging. HEAT and temperature are directly connected to disorder, as in temp is a measure of avg random energy per degree of freedom of particles [atoms, molecules, etc] at micro-level.

In that context, the point of the 2nd law is that, per Carnot’s rule, high degrees of molecular level random motion cannot wholly be converted into shaft work, i.e. orderly motion of an engine. Some waste heat has to be exhausted to a cold reservoir. By contrast, friction tells us that orderly motion can be wholly converted into heat, which being random motion, can be properly described as disorderly.

It is sad to see the description looking the obvious answer in the face, and then backing away: YES, we may not be able to turn this into a calculation in detail just now, but we can indeed address differing levels of a system. So we could in principle carry out an analysis of what is going on at differing levels; noting that we are in practical cases counting up COMPONENTS and CHANGES in entropy.

He goes on to a declaration of school allegiance:

This is of course the very opposite of the significance of the Gibbs metric as has been highlighted: the MmIG. With the gap in place, we have no basis for playing Maxwell’s demon and exploiting micro-level behaviour to extract work as in 6 above:

And failure to see the diverse components involved leads to the sort of misdirected assertions made about such cases.

For instance, the desiccated corpse was once a living breathing human being, and the loss of constraint on states due to the self-maintaining activities of living cells is what resulted in the decay of death. The component atoms have reverted to less and less constrained states and eventually will go to dust. “Ashes to ashes, dust to dust.”

Something is deeply wrong, and the disagreements and anomalies in the way many speak of entropy are pointing to the core of the problem: a failure of coherence in analysis and absence of a generally accepted, elegant solution. KF

kf,

I think you’ve taken him out of context. I don’t think he’s declaring allegiance to any school.

I think his third sentence, which you quoted, should be read in the light of the two previous sentences. Replace the word “observation” with “our perception of order or disorder.” I think that’s all he was saying.

Even then I’m not sure what your objection is.

Do you disagree with him that:

Pressure is an independent thermodynamic property of the system that does not depend on our observation.

Temperature is an independent thermodynamic property of the system that does not depend on our observation.

I think that all he is saying is that in order to say that a system is more or less ordered, more or less disordered, requires someone to observe the state of the system and make a judgement call.

There is no equation that tells you how ordered a system is.

Mung: The problem is that the entropy of the system per Gibbs and via the Shannon bridge is a metric of avg missing info to specify microstate given macrostate. Which is about degrees of freedom, which can be a metric of disorder. That is an objective quantity but it is also implicitly about subjects, as in information and procedure. KF