
Functional information defined


What is function? What is functional information? Can it be measured?

Let’s try to clarify those points a little.

Function is often a controversial concept. It is one of those things that everybody apparently understands, but nobody dares to define. So it happens that, as soon as you try to use the concept in some reasoning, your kind interlocutor immediately stops you at the start with the smart request: “Yes, but what is function? How can you define it?”

So, I will try to define it.

A premise. As we are not debating philosophy but empirical science, we need to stick to what can be observed. So, in defining function, we must limit ourselves to what can be observed: objects and events, in a word, facts.

That’s what I will do.

But as usual I will include, in my list of observables, conscious beings, and in particular humans. And all the observable processes which take place in their consciousness, including the subjective experiences of understanding and purpose. Those things cannot be defined other than as specific experiences which happen in a conscious being, and which we all understand because we observe them in ourselves.

That said, I will try to begin introducing two slightly different, but connected, concepts:

a) A function (for an object)

b) A functionality (in a material object)

I define a function for an object as follows:

a) If a conscious observer connects some observed object to some possible desired result which can be obtained using the object in a context, then we say that the conscious observer conceives of a function for that object.

b) If an object can objectively be used by a conscious observer to obtain some specific desired result in a certain context, according to the conceived function, then we say that the object has objective functionality, relative to the specific conceived function.

The purpose of this distinction should be clear, but I will state it explicitly just the same: a function is a conception of a conscious being; it does not exist in the material world outside of us, but it does exist in our subjective experience. Objective functionalities, instead, are properties of material objects. But we need a conscious observer to connect an objective functionality to a consciously defined function.

Let’s consider an example.

Stones

I am a conscious observer. At the beach, I see various stones. In my consciousness, I represent the desire to use a stone as a chopping tool to obtain a specific result (to chop some kind of food). And I choose one particular stone which seems to be good for that.

So we have:

a) The function: chopping food as desired. This is a conscious representation in the observer, connecting a specific stone to the desired result. The function is not in the stone, but in the observer’s consciousness.

b) The functionality in the chosen stone: that stone can be used to obtain the desired result.

So, what makes that stone “good” to obtain the result? Its properties.

First of all, being a stone. Then, being within some range of size, shape and hardness. Not every stone will do. If it is too big, or too small, or the wrong shape, and so on, it cannot be used for my purpose.

But many of them will be good.

So, let’s imagine that we have 10^6 stones on that beach, that we try to use each of them to chop some definite food, and that we classify each stone with a binary result (good / not good), defining objectively how much and how well the food must be chopped for the result to count as “good”. Then we count the good stones.

I call the total number of stones the Search space.

I call the total number of good stones the Target space.

I call –log2 of the ratio (Target space / Search space) the Functionally Specified Information (FSI) for that function, in the system of all the stones I can find on that beach. It is expressed in bits, because we take –log2 of the ratio.

So, for example, if 10^4 stones on the beach are good, the FSI for that function in that system is –log2 of 10^-2, that is, 6.64386 bits.

What does that mean? It means that one stone out of 100 is good, in the sense we have defined, and that if we randomly choose one stone on that beach, the probability of finding a good stone is 0.01 (2^-6.64386).
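
As a quick arithmetic check, here is a minimal Python sketch of that calculation, using the illustrative counts from the example above:

import math

search_space = 10**6   # all the stones on the beach
target_space = 10**4   # stones that objectively work for the defined chopping function (illustrative)

ratio = target_space / search_space   # probability of picking a good stone in one random draw
fsi_bits = -math.log2(ratio)          # Functionally Specified Information for this function

print(ratio)      # 0.01
print(fsi_bits)   # about 6.64 bits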

I hope that is clear.

So, the general definitions:

c) Specification. Given a well defined set of objects (the search space), we call a “specification”, in relation to that set, any explicit objective rule that can divide the set into two non-overlapping subsets: the “specified” subset (the target space) and the “non-specified” subset. IOWs, a specification is any well defined rule which generates a binary partition in a well defined set of objects.

d) Functional Specification. This is a special form of specification (in the sense defined above), where the specifying rule is of the following type: “The specified subset in this well defined set of objects includes all the objects in the set which can implement the following, well defined function…”. IOWs, a functional specification is any well defined rule which generates a binary partition in a well defined set of objects by defining a function as in a) and verifying whether the functionality, defined as in b), is present in each object of the set.

It should be clear that functional specification is a definite subset of specification. Other properties, different from function, can in principle be used  to specify. But for our purposes we will stick to functional specification, as defined here.

e) The ratio Target space/Search space expresses the probability of getting an object from the target space in one random search attempt, in a system where each object has the same probability of being found by a random search (that is, a system with a uniform probability of finding those objects).

f) The Functionally Specified Information (FSI) in bits is simply –log2 of that ratio. Please note that I imply no specific meaning of the word “information” here. We could call it by any other name. What I mean is exactly what I have defined, and nothing more.
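
To make definitions c) through f) concrete, here is a small Python sketch (the function and variable names are mine, purely illustrative): it takes a well defined set of objects and an objective yes/no functionality test, builds the binary partition, and returns the FSI in bits.

import math

def fsi_bits(objects, is_functional):
    # is_functional is the specification: an explicit, objective rule that
    # returns True if an object implements the defined function (target space)
    # and False otherwise (the non-specified subset).
    objects = list(objects)
    search_space = len(objects)
    target_space = sum(1 for obj in objects if is_functional(obj))
    if target_space == 0:
        return float("inf")   # no object in the set implements the function
    return -math.log2(target_space / search_space)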

One last step. FSI is a continuous numerical value, different for each function and system.  But it is possible to categorize  the concept in order to have a binary variable (yes/no) for each function in a system.

So, we define a threshold (for some specific system of objects). Let’s say 30 bits. We compute the value of FSI for many different functions which can be conceived for the objects in that system. We say that those functions whose FSI is above the threshold we have chosen (for example, more than 30 bits) are complex. I will not discuss here how the threshold is chosen, because that is part of the application of these concepts to the design inference, which will be the subject of another post.

g) Functionally Specified Complex Information (FSCI) is therefore a binary property defined for a function in a system by a threshold. A function, in a specific system, can be “complex” (having FSI above the threshold). In that case, we say that the function implies FSCI in that system, and if an object observed in that system implements that function we say that the object exhibits FSCI.
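
A minimal sketch of this categorization step, reusing the fsi_bits helper sketched above (the 30-bit figure is only the example threshold mentioned in the text, not a recommended value):

def is_complex(objects, is_functional, threshold_bits=30.0):
    # Binary judgment: does this function, in this system, exceed the chosen threshold (FSCI)?
    return fsi_bits(objects, is_functional) >= threshold_bits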

h) Finally, if the function for which we use our objects is linked to a digital sequence which can be read in the object, we simply speak of digital FSCI: dFSCI.

So, FSI is a subset of SI, and dFSI is a subset of FSI. Each of these can be expressed in categorical form (complex/non complex).
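
For the digital case, the search space is the set of all possible sequences of a given length over a given alphabet, and the target space is the number of sequences that implement the function. Here is a hedged sketch with made-up numbers, just to show the arithmetic:

import math

sequence_length = 150        # e.g. a protein domain of 150 amino acids (illustrative)
alphabet_size = 20           # the 20 amino acids
search_space = alphabet_size ** sequence_length

functional_fraction = 1e-40  # hypothetical fraction of sequences that implement the function
# dFSI depends only on the ratio, so it equals -log2 of the functional fraction
dfsi_bits = -math.log2(functional_fraction)

print(dfsi_bits)   # about 132.9 bits, which would count as dFSCI under a 30-bit threshold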

Some final notes:

1) In this post, I have said nothing about design. I will discuss in a future post how these concepts can be used for a design inference, and why dFSCI is the most useful concept to infer design for biological information.

2) As you can see, I have strictly avoided discussing what information is or is not. I have used the word for a specific definition, with no general implications at all.


3) Different functionalities for different functions can be defined for the same object or set of objects. Each function will have different values of FSI. For example, a tablet computer can certainly be used as a paperweight. It can also be used to make complex computations. So, the same object has different functionalities. Obviously, the FSI will be very different for the two functions: very low for the paperweight function (any object in that range of dimensions and weight will do), and very high for the computational function (it’s not so easy to find a material object that can work as a computer).


4) Although I have used a conscious observer to define function, there is no subjectivity in the procedures. The conscious observer can define any possible function he likes. He is absolutely free. But he has to define the function objectively, and how to measure the functionality, so that everyone can objectively verify the measurement. So, there is no subjectivity in the measurements, but each measurement refers to a specific function, objectively defined by a subject.

Comments
I'm amazed that Darwinists still argue over the probabilities of a novel protein forming by chance.
Stephen Meyer Critiques Richard Dawkins's "Mount Improbable" Illustration - video
https://www.youtube.com/watch?v=7rgainpMXa8

Estimating the prevalence of protein sequences adopting functional enzyme folds - Doug Axe:
Excerpt: The prevalence of low-level function in four such experiments indicates that roughly one in 10^64 signature-consistent sequences forms a working domain. Combined with the estimated prevalence of plausible hydropathic patterns (for any fold) and of relevant folds for particular functions, this implies the overall prevalence of sequences performing a specific function by any domain-sized fold may be as low as 1 in 10^77, adding to the body of evidence that functional folds require highly extraordinary sequences.
http://www.toriah.org/articles/axe-2004.pdf

The Case Against a Darwinian Origin of Protein Folds - Douglas Axe - 2010
Excerpt (Pg. 11): "Based on analysis of the genomes of 447 bacterial species, the projected number of different domain structures per species averages 991. Comparing this to the number of pathways by which metabolic processes are carried out, which is around 263 for E. coli, provides a rough figure of three or four new domain folds being needed, on average, for every new metabolic pathway. In order to accomplish this successfully, an evolutionary search would need to be capable of locating sequences that amount to anything from one in 10^159 to one in 10^308 possibilities, something the neo-Darwinian model falls short of by a very wide margin."
http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2010.1

The Case Against a Darwinian Origin of Protein Folds - Douglas Axe, Jay Richards - audio
http://intelligentdesign.podomatic.com/player/web/2010-05-03T11_09_03-07_00

Biologist Douglas Axe on evolution's ability to produce new functions - video
https://www.youtube.com/watch?v=8ZiLsXO-dYo
"It turns out once you get above the number six [changes in amino acids] -- and even at lower numbers actually -- but once you get above the number six you can pretty decisively rule out an evolutionary transition because it would take far more time than there is on planet Earth and larger populations than there are on planet Earth." - Douglas Axe

The Evolutionary Accessibility of New Enzyme Functions: A Case Study from the Biotin Pathway - Ann K. Gauger and Douglas D. Axe - April 2011
Excerpt: We infer from the mutants examined that successful functional conversion would in this case require seven or more nucleotide substitutions. But evolutionary innovations requiring that many changes would be extraordinarily rare, becoming probable only on timescales much longer than the age of life on earth.
http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2011.1/BIO-C.2011.1
bornagain77
May 10, 2014, 04:12 AM PDT
Therefore, it is clear that not only was the superfamily already present in LUCA, but it was already diversified into at least these two different proteins, already in LUCA.
It depends on how you define LUCA. If you take only vertical descent of organisms into account, LUCA = the last common ancestor of Bacteria and Archaea. If you look at the lineages of genes instead and do not ignore horizontal transfer, LUCA will be older (it will be the hypothetical organism in which the lineages of genes finally coalesce). Modelling the evolution of prokaryotes as a neat family tree is unrealistic -- the actual genealogies form a "tangled bush", and there will be frequent mismatches between family trees of genes and organisms (not to mention complications introduced by endosymbiosis). You are partly confusing the two notions of shared ancestry yourself. Note that you say that related genes had diverged "already in LUCA". That means that LUCA had an ancestor in which both genes had a single source. I want to make the distinction clear to avoid quibbling about terminology.

Piotr
May 10, 2014, 02:02 AM PDT
wd400:
Yes. I don’t think there is any empirical evidence either way. Requiring such is to play a silly game, and pretending you can conclude anything after assuming away precursors to LUCA would be too.
I require nothing and I pretend nothing. We have few facts, and as you say they are not "empirical evidence" of anything, because they are not enough. My point is that my hypothesis (LUCA is also FUCA) explains the little we know better, and it is simpler and more straightforward, given those few facts. As new facts become known (and they will), we can see what happens. Consider my hypothesis as a prediction.

gpuccio
May 10, 2014, 12:20 AM PDT
Piotr:
Hey, so you still mean the double negation? I’m beginning to suspect a Freudian slip: you realise you are wrong and you’re unconsciously trying to contradict yourself. Be careful, or I’ll take you at your word.
Maybe. But in Italian the phrase works that way! We are probably less precise about double negations; Italians are compliant people. :)
OK, what about transition from one protein to another one, with a different function but within the same superfamily? You realise, don’t you, that (super)families are called (super)families because they are defined on the basis of homologies (in other words, shared ancestry, not shared functionalities, which may well be convergent).
Well, as I have shown in my previous post, different proteins of the same superfamily are in some cases already present in LUCA. In many other cases, obviously, they appear later. What about it?

First of all, even if the function changes, very often the structure remains very similar. For example, shifts in function can happen with small variation at the active site. Consider the example of nylonase. It is derived from penicillinase by a couple (if I remember well) of substitutions at the active site. Both proteins are esterases, and they share the same structure and sequence. But the substrate is a different kind of ester (nylon versus penicillin), and the higher-level function is therefore very different. But the local biochemical function is quite similar. Such transitions can be Darwinian, because they are simple enough.

In other cases even the shift inside families is too complex to be easily explained in Darwinian terms. Axe has debated this aspect, and he suggests about 5 AA transitions as a threshold, if I remember well. I can agree with that concept. So, each functional transition must be judged individually, according to its functional complexity. That's exactly what dFSCI is meant to do.

Well, I suppose that the discussion about how the designer could act on the material plane will have to wait until later.

gpuccio
May 10, 2014, 12:15 AM PDT
Piotr:
No. The ancestors of their building blocks were present in LUCA.
No. You are wrong again. Most of the superfamilies are single-domain. Just look again at the beta subunit of the ATP synthase. The superfamily is (in the NCBI classification): "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." For clarity, the name for the same superfamily in SCOP is "P-loop containing nucleoside triphosphate hydrolases". IOWs, it is a superfamily which groups many different domains with similar structure. That's why we have 6000-7000 independent groupings at sequence level, and only 2000 superfamilies.

What I have BLASTed is a specific protein in the superfamily, the beta subunit of ATP synthase, of which that particular domain (and therefore that superfamily) is the main component, the central domain (I have also specified which AAs belong to the domain: approximately AAs 74-345). So, I have BLASTed the same protein as it is found in archaea and bacteria, not one protein against other members of the same superfamily, and I have found a very high homology. That means simply that the protein, that specific protein (and therefore the superfamily in which its main domain is classified), was already present in LUCA, practically the same as it is in archaea, in bacteria, and even in humans today.

On the contrary, the homology between members of the same superfamily in the same species can be much lower. For example, if we BLAST the beta subunit of ATP synthase against RecA (the protein which gives the name to the superfamily in the NCBI classification) we find only three short alignments, none of them significant:

1) Identities: 7/18 (39%), Positives: 10/18 (55%), Expect: 1.5
2) Identities: 6/19 (32%), Positives: 11/19 (57%), Expect: 9.1
3) Identities: 4/8 (50%), Positives: 8/8 (100%), Expect: 10.0

What does it mean? It means that those two proteins, even if grouped in the same superfamily for similarities of structure and function, are completely different at the sequence level. Indeed, in the SCOP classification they belong to the same superfamily, P-loop containing nucleoside triphosphate hydrolases, and even to the same family, RecA protein-like (ATPase-domain). But they are completely different at sequence level. The grouping in superfamilies and families is based on structural and large functional similarities, and it can include proteins completely different at sequence level. Like in this case.

What about RecA? Is it conserved between archaea and bacteria? Was it already in LUCA, as an individual protein? The answer is yes. Here is the BLAST of RecA in E. coli and in one archaeon (compare it with the BLAST I gave for the ATP synthase subunit beta in post #187):

Protein RecA. E. coli: length 353 AAs. Euryarchaeote SCGC AAA261-G15: length 346 AAs.
Identities: 203/322 (63%), Positives: 269/322 (83%), Expect: 2e-151

Therefore, the two proteins are almost identical in bacteria and archaea. And this particular protein is single-domain: the whole protein is made of the domain P-loop containing nucleoside triphosphate hydrolases, which in the beta subunit of ATP synthase represents only the central (and main) domain.

To sum up:
a) The superfamily P-loop containing nucleoside triphosphate hydrolases is present in bacteria and archaea with at least two (indeed, with many more) different proteins which are classified in it: the beta subunit of ATP synthase (central domain) and RecA.
b) Each of those two proteins is almost the same at sequence level in archaea and bacteria; therefore it was present in LUCA essentially as it is now.
c) The two proteins are completely different at sequence level in the same organisms (for example, in E. coli).
d) Therefore, it is clear that not only was the superfamily already present in LUCA, but it was already diversified into at least these two different proteins, already in LUCA.

Which is exactly the opposite of what you were saying. Excuse me if I have been a little fastidious on this subject, but it is a fundamental point, and you were clearly understanding it in the wrong way.

gpuccio
May 10, 2014, 12:02 AM PDT
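
The BLAST figures quoted in the comment above (e.g. "Identities: 203/322 (63%)") are percent-identity counts over an alignment. As a toy illustration only (this is not the actual BLAST algorithm; real BLAST also handles gaps, scoring matrices and E-values), here is a minimal Python sketch of how a percent-identity figure is computed for a pair of already aligned sequences:

def percent_identity(aligned_a, aligned_b):
    # Both strings are assumed to be the same length (already aligned), with '-' marking gaps.
    assert len(aligned_a) == len(aligned_b)
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    aligned_positions = sum(1 for a, b in zip(aligned_a, aligned_b) if a != "-" or b != "-")
    return 100.0 * matches / aligned_positions

print(percent_identity("MKTAYIAK-QR", "MKTSYIAKLQR"))   # about 81.8% over 11 aligned positions
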
But note that in post 212 there's no mention of the origin of all those elaborate mechanisms. Just looking at the currently existing state of affairs is sufficient to excite some of us and keep us busy (scratching our heads), trying to understand it well. The OOL discussion, with all the references to statistics and probabilities, is too difficult, and sometimes very abstract. It's easier to stick to basic simplicity. Have fun!

Dionisio
May 9, 2014, 08:16 PM PDT
Want to see abundant examples of complex specified purpose-oriented functional information processing mechanisms in action? Just look at the elaborate mechanisms that produce and operate the spindle apparatus, which plays such an important role in the intrinsic asymmetric divisions during embryonic development. Also look at the sophisticated choreographies and orchestrations that produce and operate the genotype-phenotype association. Pay attention to the details that lead to relatively consistent proportions of the different cell types. Observe carefully the timing for the start of the different developmental phases. Enjoy the increasing availability of data coming out of the research labs, which shed more light on all those wonderful systems. That's why so many engineers and computer scientists are fascinated by all this. These are exciting days to look at science.

Dionisio
May 9, 2014, 07:47 PM PDT
Resources for finding globular domains with determined tertiary structures include SCOP (fold level), ASTRAL (domains from SCOP), SUPERFAMILY (hidden Markov models of SCOP domains), and CATH (architecture level). However, these resources are confined to the structural knowledge base PDB. Therefore, only a subset of the total fold and structure space is described; the remaining domains are to a certain extent described in Pfam and SMART. - Modular Protein Domains
Mung
May 9, 2014, 06:55 PM PDT
“How it happens” is a ridiculous question only to people who have no clue how to answer it and can’t think of a better evasive tactic.
It is a ridiculous question. The only people who ask it are ones trying to evade the obvious.

jerry
May 9, 2014, 06:29 PM PDT
In my email today: What is Verification by Multiplicity? Learn about the statistical technique that NASA scientists used to winnow out so many planets from the bulk data.

Mung
May 9, 2014, 05:32 PM PDT
jerry:
The next silly objection will probably be the theodicy argument.
"How it happens" is a ridiculous question only to people who have no clue how to answer it and can't think of a better evasive tactic.Piotr
May 9, 2014, 04:38 PM PDT
If you look at Table 1, you will see that 1984 domains (more than half) were present in LUCA, that means about half of the superfamilies or families.
No. The ancestors of their building blocks were present in LUCA. What you are saying is like claiming that the variation seen in today's mammals and, in parallel, in today's birds, must have been present in their common ancestor 300+ million years ago. A penguin's flippers are homologous to a sparrow's wings, a dolphin's flippers are homologous to a bat's wings, and avian forelimbs in general are homologous to mammalian forelimbs. Therefore, the common ancestor must have had both wings and flippers (homologous to each other). Or perhaps, since there are ca. 6,000 species of mammals and 10,000 species of birds, the common ancestor was a collective of a few thousand rather different species. More tomorrow. Good night!

Piotr
May 9, 2014, 04:29 PM PDT
What method does he use to inject his functional information into the genomes of germline cells of selected individuals?
Oh, the "how did the designer do it" argument! From a few years ago, since this comes up all the time: https://uncommondescent.com/intelligent-design/complex-specified-information-you-be-the-judge/#comment-305339

Someone actually wants the laboratory techniques used 3.8 billion years ago. You talk about bizarre. I say a thousand as hyperbole and Mark in all seriousness says there is probably only a dozen. Mark wants the actual technique used a few billion years ago. Mark, I got word from the designer a few weeks ago and he said the original lab and blueprints were subducted under what was to become the African plate 3.4 billion years ago, but by then they were mostly rubble anyway. The original cells were relatively simple but still very complex. Subsequent plants/labs went the same way, and unfortunately all holograph videos of it are now in hyperspace and haven't been looked at for at least 3 million years. So to answer one of your questions, no further work has been done for quite a while and the designer expects future work to be done by the latest design itself. The designer travels via hyperspace between his home and our area of the universe when it is necessary. The designer said the techniques used were much more sophisticated than anything dreamed of by the current synthetic biologist crowd, but in a couple million years they may get up to speed and understand how it was actually done. The designer said it is actually a lot more difficult than people think, especially since this was a new technique and he had to invent the DNA/RNA/protein process from scratch, but amazingly they had the right chemical properties. His comment was "Thank God for that," or else he doesn't think he would have been able to do it. It took him about 200,000 of our years just experimenting with amino acid combinations to get usable proteins. He said it will be easier for current scientists since they will have a template to work off.
As far as it being boring:
And he’s been at it for almost four billion years (the first two billion must have been pretty boring!)
Actually, with 100 billion planets in this universe and an untold number of universes for which the designer is responsible, it gets pretty busy and never boring. There is a competition going on between designers over who can create the most interesting universe. With an infinite number of designers at it, the competition is pretty intense. The next silly objection will probably be the theodicy argument.

jerry
May 9, 2014, 04:29 PM PDT
Well, there is even less empirical evidence that LUCA had predecessors. Can you deny that?

Yes. I don't think there is any empirical evidence either way. Requiring such is to play a silly game, and pretending you can conclude anything after assuming away precursors to LUCA would be too.

wd400
May 9, 2014, 04:24 PM PDT
“It’s not a case that we have not even a single example (IOWs no example at all) of the transition from one protein superfamily to another one, from one structure and function to another completely different structure and function.” And I still mean it.
Hey, so you still mean the double negation? I'm beginning to suspect a Freudian slip: you realise you are wrong and you're unconsciously trying to contradict yourself. Be careful, or I'll take you at your word. :)

OK, what about transition from one protein to another one, with a different function but within the same superfamily? You realise, don't you, that (super)families are called (super)families because they are defined on the basis of homologies (in other words, shared ancestry, not shared functionalities, which may well be convergent).

Piotr
May 9, 2014, 04:10 PM PDT
wd400:
None of that is empirical evidence that the LUCA had no predecessors. No scientist thinks LUCA arose at once, so to simply assume away precursors to LUCA is hardly dealing fairly with scientific claims.
Well, there is even less empirical evidence that LUCA had predecessors. Can you deny that? I am well aware that most scientists (I would not be sure of the "no scientist") think that. You may believe that what scientists believe is the standard for what is good science. I think differently. You may have noticed that I, like many others in ID, believe that the scientific academy has been very biased about a couple of things in the last decades.

gpuccio
May 9, 2014, 04:07 PM PDT
Piotr:
He not only “designs” but also implements his “design” physically. What method does he use to inject his functional information into the genomes of germline cells of selected individuals? Note that he must be doing it all the time, to all species, not just to cause an occasional evolutionary breakthrough, but every time a dubiously functional orphan gene emerges in a bloody fruit fly. And he’s been at it for almost four billion years (the first two billion must have been pretty boring!). I won’t even try to enquire why he should be playing such a complex game, but I’m interested in the physical aspect of this “input of functional information”. In all cases known to me, the transfer of information absolutely requires a physical signal. Can we detect it somehow? Or is it pure spoonbending psychokinetic magic after all?
Very good question. I have debated that aspect many times in the past, but in this thread this is the first time it has been mentioned. OK, I am ready to deal with that again. But it's late. Let's leave it for tomorrow. Have a good night! :)

gpuccio
May 9, 2014, 04:03 PM PDT
Piotr:
Please show me your calculations regarding the time necessary for cellularity to evolve. 300 million years is not enough? Why? Some bacteria may produce 100 generations a day in favourable conditions. Protobiotic replication cycles are likely to have been still shorter. How many trillions or quadrillions of cycles would satisfy you?
At this point, you should already know what I think. Without design, cellularity cannot ever evolve. Period. But my point was: even if we believed it can, 300 million years is a joke. You are entitled to believe differently.

gpuccio
May 9, 2014, 04:00 PM PDT
Piotr at #195:
Nope. The article argues that a large number of domains making up complex proteins can be traced back to common ancestors in LUCA. Domains are not protein superfamilies.
It's the same thing. They focus on the domain level of classification, and analyze 3464 domains, which corresponds to the 2000 superfamilies. The number of single domains os higher, because many superfamilies are multi-domain (class e in SCOP). If you look at Table 1, you will see that 1984 domains (more than half) were present in LUCA, that means about half of the superfamilies or families. That should be very clear form the part I have quoted, and I quote it again here for your convenience: “Table 1 lists the predicted number of domains and domain combinations originated in the major lineages of the tree of life. 1984 domains (at the family level) are predicted to be in the root of the tree (with the ratio Rhgt = 12), accounting for more than half of the total domains (3464 families in SCOP 1.73).” What is your problem?
What a pity you read good articles so selectively. How about this?
Nothing. It is true that the paper also analyzes the emergence of new domain combinations. And it is true that domain combinations tend to increase in their emergence in the course of natural history. But I focused on the emergence of simple domains, which also continue to emerge throughout natural history, although at a decreasing rate. Table 1 shows both the emergence of single domains (column 2) and of combinations (column 3).

Why do I focus on the single domains? It's simple. As I am discussing the computation of digital functional complexity, I am interested more in sequences, and single domains represent new sequences. Combinations are certainly interesting and important, but more difficult to analyze. We should consider the space of possible combinations, and of functional ones. It can certainly be done, but at this stage I prefer to stick to individual domains. It's more than enough for my purpose.

I don't think that I read articles selectively. I read and quote what is pertinent and important for my argument. The data about combinations are absolutely compatible with my argument, and I don't see how they can favor yours, but they were not pertinent to the point I was making about sequence functionality. In recombinations, existing sequences are reused in new forms. There is functional novelty, but it is not so much in the sequences as in how they are rearranged and combined.

gpuccio
May 9, 2014, 03:57 PM PDT
We are again at silly arguments. The designer designs. It is no magic, small scale or big scale. It is the input of functional information into objects.
He not only "designs" but also implements his "design" physically. What method does he use to inject his functional information into the genomes of germline cells of selected individuals? Note that he must be doing it all the time, to all species, not just to cause an occasional evolutionary breakthrough, but every time a dubiously functional orphan gene emerges in a bloody fruit fly. And he's been at it for almost four billion years (the first two billion must have been pretty boring!). I won't even try to enquire why he should be playing such a complex game, but I'm interested in the physical aspect of this "input of functional information". In all cases known to me the transfer of information absolutely requires a physical signal. Can we detect it somehow? Or it pure spoonbending psychokinetic magic after all?Piotr
May 9, 2014, 03:52 PM PDT
None of that is empirical evidence that the LUCA had no predecessors. No scientist thinks LUCA arose at once, so to simply assume away precursors to LUCA is hardly dealing fairly with scientific claims.

wd400
May 9, 2014, 03:43 PM PDT
Piotr at #186:
For simplicity, let’s restrict our discussion to a gene whose paralogue is initially identical to the original and remains “functional” (that is, gets transcribed and translated into a functional product).
OK.
I’d like to concentrate on those cases when duplication is practically neutral.
OK.
??? — I do not understand this paragraph as it stands. Did you mean “a gene which is functional”? And what exactly is the second sentence trying to convey? Why “negative selection”? Under the standard meaning of the term, it isn’t something that affects “really useful” genes. I can only guess that you are indirectly (and confusingly) referring to “Ohno’s dilemma”: if the retention of both copies is beneficial, it restricts their ability to diverge. If that’s what you mean, I agree.
My typo. It was obviously "a gene which is functional". And negative selection is "something that affects 'really useful' genes", when they mutate and lose their function. From Wikipedia: "In natural selection, negative selection or purifying selection is the selective removal of alleles that are deleterious. This can result in stabilizing selection through the purging of deleterious variations that arise." Negative selection is the same as purifying selection. It's the reason why functional sequences cannot change much and are conserved, while non-functional sequences can vary more. So yes, we agree on that.
Nope. You have smuggled in a false tacit assumption (after Behe?): gene = protein = function. It’s actually gene -> protein(s) and protein -> {function_1, function_2, … function_n}. The consequences of a mutation may damage one of the functions while leaving others unaffected. This opens many interesting possibilities.
This deserves more discussion. It is true that a protein can have many functions. There are many reasons for that. One of them is that very complex proteins often have different domains, and each domain has different biochemical functions. Another reason is that the basic biochemical function (the "local" function) can have different roles in a higher-level context. However, there is no doubt that proteins are first of all biochemical machines. The individual domains have a specific biochemical activity, which depends on the sequence and the structure. If that biochemical activity is lost or damaged, any higher function is lost or damaged. Moreover, many proteins, like enzymes, have a rather straightforward biochemical function, which can be easily identified. I can't see which "interesting possibilities" you see, or imagine, from that. If a duplicated protein gene is neutral, as you assumed, then its sequence can change freely. It is not different from a non-coding, non-functional gene. Even if it retains other functions, for example in different domains, how can that help you? The functional parts will be conserved; the non-functional parts will change inexorably and randomly.
This is a myth.
No. It is a fact. Let's look again at the SCOP classification. There is a tool, called ASTRAL, which allows us to ask for the grouping of all the domains in the database according to a "lower than n%" identity, or "greater than n" E value. I have done that as follows:

Subsets with lower than 10% identity: 6400
Subsets with greater than 0.05 E value: 7181

What does that mean? It means that, if we group the protein records in the database so that each group has less than 10% identity with all other groups, IOWs an identity which can be expected by chance (E value higher than 0.05), then we get 6000-7000 separate groupings (that corresponds more or less to the number of families, which is 4496). Those are isolated islands of function in the sequence space. It is a fact. Even adding structural and functional similarities to group into larger subsets, we still get the 2000 superfamilies. I usually use the superfamily level of classification in my reasoning because of my old bad habit of being too generous with the enemy! :)
When interpreted literally, it seems to mean that we have examples of such transitions. Which is of course just fine as far as I'm concerned, but I suspect that the sentence has mutated via double negation into the opposite of what you had intended to say.
You suspect correctly. A slip of my English, I suppose. What I meant was: "It's not a case that we have not even a single example (IOWs no example at all) of the transition from one protein superfamily to another one, from one structure and function to another completely different structure and function." And I still mean it. The moral? Never mess with a linguist! :)

gpuccio
May 9, 2014, 03:33 PM PDT
Gpuccio:
Remember that for a long time the planet was not compatible with any form of life. But I don’t want to quarrel for a few hundred million years! The “first chemical replicators” would not have had the time “to evolve into prokaryotic life more or less as we know it” even if our planet were tens of thousand years old.
Please show me your calculations regarding the time necessary for cellularity to evolve. 300 million years is not enough? Why? Some bacteria may produce 100 generations a day in favourable conditions. Protobiotic replication cycles are likely to have been still shorter. How many trillions or quadrillions of cycles would satisfy you?

Piotr
May 9, 2014, 03:25 PM PDT
The 900-1000 superfamilies which were already in LUCA were already in LUCA. That means that the proteins which are part of those superfamilies share high homology, and similar structure and function, not only among themselves within species, but also between archaea and bacteria. That's how we know that they were already present in LUCA.
Nope. The article argues that a large number of domains making up complex proteins can be traced back to common ancestors in LUCA. Domains are not protein superfamilies. A protein found in Bacteria may be homologous to a protein found in Archaea. Even better, the same bacterial protein may have numerous paralogues in the same bacterial genome, a vast number of homologues in other Bacteria and Archaea, etc. More often it will be a conserved domain rather than a complete complex protein. If we can rule out horizontal transfer, we conclude that we are dealing with a family reducible to a single common ancestor present in LUCA. Over the next millions and billions of years LUCA's proteins (or at least their domains, combined and recombined in the course of evolution) have diverged into today's superfamilies. What a pity you read good articles so selectively. How about this?
This combined evolution of domains, and combinations thereof, suggests that once protein domains have been generated and inherited in genomes, biological organisms tend to create new proteins and functions through duplication and recombination of existing domains, rather than create new domains de novo, in accordance with the general trend of genome evolution by means of duplication and recombination.
Piotr
May 9, 2014, 03:12 PM PDT
wd400 at #191: Yes. We know that LUCA was a prokaryote. The simplest autonomous living beings ever observed are prokaryotes. There is no evidence at all that autonomous living beings simpler than prokaryotes even exist. LUCA appeared within a window of time, measured from the origin of our planet and from when it became compatible with life, that, while not known with certainty, is certainly rather "narrow" in terms of the natural history of our planet. The information gap between non-living matter and prokaryotes is certainly much bigger than all other information gaps in the history of life, including the generation of eukaryotes and the Cambrian explosion. There is no credible theory, apart from design, for that rather sudden emergence of such a huge quantity of functional information. There is no evidence at all that some FUCA different from LUCA, and simpler, ever existed. Therefore, it is perfectly legitimate, and reasonable, to hypothesize that what we call LUCA, a prokaryote predating the archaea-bacteria divergence, was the first form of life on our planet (FUCA).

gpuccio
May 9, 2014, 02:57 PM PDT
Piotr:
You seem to think that for a few billion years the "designer" has not really been creating anything out of thin air. He's only been using small-scale magic to add some intelligent organisation to elements already present, like when he turns small ORFs into genes in fruit flies. He must have a soft spot for fruit flies; he's still churning out orphan genes by the hundred for the 1500 Drosophila species, at such a rate that many of them are still segregating in D. melanogaster, for example. Perhaps the designer is the Lord of the Flies?
We are again at silly arguments. The designer designs. It is no magic, small scale or big scale. It is the input of functional information into objects. And why do you hate fruit flies so much? Each living being is marvelous.
But when we get to the root of life, you have to assume the magical creation of a complete LUCA out of thin air. The designer even had to create “superfamilies of genes” looking as if they consisted of homologues (but they can’t be true homologues if LUCA was Generation Zero; the designer simply made them look related).
Why magic again? And the thin air? Maybe the designer used his own dust, to quote a famous joke. And yes, he had to engineer many superfamilies of genes at the beginning, because they were necessary to start life. The final statement is really pointless. We obviously don't know in what times and what steps the first prokaryotes were engineered. That is true both for a design theory and for a non-design theory. However, when you design software, you can well implement basic objects and then derive similar functional objects from them. That's what happens routinely in human Object Oriented Programming. And it is possible that the first living beings came into existence only when the global plan was implemented. We don't know. Maybe one day we will.

gpuccio
May 9, 2014, 02:47 PM PDT
Piotr:
Some of those fossil traces of life (if their interpretation is correct) may well be older than LUCA.
How do you know that they are not of LUCA, that they are different from it?
I wouldn’t say that the 3.5 Ga estimate of the age of LUCA is impossible (though I’ve seen more modest estimates, of the order of 2.9 Ga, in recent literature). It still gives us a few hundred million years for the first chemical replicators to evolve into prokaryotic life more or less as we know it. It’s an interval roughly equal to that between the end of the Carboniferous and now.
Remember that for a long time the planet was not compatible with any form of life. But I don't want to quarrel for a few hundred million years! The "first chemical replicators" would not have had the time "to evolve into prokaryotic life more or less as we know it" even if our planet were tens of thousand years old. :)

gpuccio
May 9, 2014, 02:36 PM PDT
“The simple point is that the facts we observe suggest that LUCA is also FUCA.”

Can you show any empirical evidence to support this conclusion? It seems a bizarre thing to claim.

wd400
May 9, 2014, 02:34 PM PDT
Piotr: You say:
LUCA is nothing of the kind. It’s just the point in the past where all the genealogies of modern gene families finally coalesce. It was no more the first organism than “mtEve” was the first female. It wasn’t the first prokaryotic cellular organism either, or the only form of life on young Earth. If we ignore horizontal transfer, LUCA is technically the most recent common ancestor of Bacteria and Archaea, but since HT was surely widespread among early prokaryotes, the actual point of coalescence must be a bit deeper.
You make a series of statements that have no empirical support. They may be true or false. Nobody knows, not even you. I list them here for you:

a) "It was no more the first organism than 'mtEve' was the first female."
Comment: Can you show any empirical evidence for other organisms before LUCA and different from it?

b) "It wasn't the first prokaryotic cellular organism either,"
Comment: Can you show any empirical evidence for other prokaryotic cellular organisms before LUCA and different from it?

c) "or the only form of life on young Earth."
Comment: Can you show any empirical evidence for other forms of life on young Earth?

OK, I am waiting. My initial statement was, in comparison, much humbler: "The simple point is that the facts we observe suggest that LUCA is also FUCA. Ideology, and only ideology, suggests otherwise."

gpuccio
May 9, 2014, 02:30 PM PDT
Piotr: You quote: "However, even though there may be conservation of very general functional or molecular mechanisms (see Todd et al. 2001), some ancestral superfamily domains show such high functional diversification that it is difficult to define a concrete function for them. For example, the ATP-loop superfamily has representatives in 230 different orthologue clusters divided into 19 different COG functional subcategories. Although the ATP-loop domain is mainly represented in metabolic pathways, this domain is also involved in disparate functional roles."

And so? That just means that some superfamilies include a lot of different families and proteins, with diversification of function. And so? How does that help your reasoning? The superfamily is a very high order grouping, second only to the fold level. It is normal that some superfamilies include diverse families and proteins, while others are more homogeneous. And so?

gpuccio
May 9, 2014, 02:22 PM PDT
