Uncommon Descent Serving The Intelligent Design Community

The highly engineered transition to vertebrates: an example of functional information analysis

In the recent thread “That’s gotta hurt” Bill Cole states:

I think over the next few years 3 other origins (my note: together with OOL), will start to be recognized as equally hard to explain:

  • The origin of eukaryotic cell: difficult to explain the origin of the spliceosome, the nuclear pore complex and chromosome structure.
  • The origin of multicellular life: difficult to explain the origin of the ability to build complex body plans.
  • The origin of man: difficult to explain the origin of language and complex thought.

That thought is perfectly correct. There are, in natural history, a few fundamental transitions which scream design more than anything else. I want to be clear: I stick to my often expressed opinion that each single new complex protein is enough to infer design. But it is equally true that some crucial points in the development of life on earth certainly stand out as major engineering events. So, let’s sum up a few of them:

  1. OOL
  2. The prokaryote – eukaryote transition (IOWs, eukaryogenesis)
  3. The origin of metazoa (multicellular life)
  4. The diversification of the basic phyla and body plans (IOWs, the Cambrian explosion)

Well, to those 4 examples, I would like to add the diversification of all major clades and subphyla.

Of course, another fundamental transition is the one to homo sapiens, but I will not deal with it here: I fully agree with Bill Cole that it is an amazing event under all points of view, but it is also true that it presents some very specific problems, which make it a little bit different from all the other transitions we have considered above.

I will state now in advance the point that I am trying to make here: each of the transitions described requires tons and tons of new, original, highly specific functional information. Therefore, each of those transitions commands an extremely strong inference to design. I will deal in particular with the transition to the subphylum of vertebrates, for a series of reasons: being vertebrates, we are naturally specially interested in that transition; there are a lot of fully sequenced genomes and proteomes of vertebrate species; and a lot is known about vertebrate biology. IOWs, we have a lot of data that can help us in our reasoning. So, I will try to fix a few basic points which will be the foundation of our analysis:

  • a) The basic phylum is Chordates, which are characterized by the presence of a notochord. Chordates include three different clades: Craniata, Tunicata, Cephalochordata.
  • b) Vertebrates are a subphylum of the phylum Chordates, and in particular of the clade Craniata. They represent the vast majority of Chordates, with  about 64,000 species described. As the name suggests, they are characterized by the presence of a vertebral column, either cartilaginous or bony, which replaces the notochord.
  • c) The phylum Chordata, like other phyla, can be traced back at least to the Cambrian explosion (540 million years ago).
  • d) Chordates which are not vertebrates are quite rare today. They include:
    • 1) Craniata: the only craniates which are not vertebrates are in the class Myxini (hagfish), whose classification however remains somewhat controversial. All other craniates are vertebrates.
    • 2) Tunicata (or urochordata): about 3000 species, the best known and studied is Ciona intestinalis.
    • 3) Cephalochordata: about 30 species of Lancelets.
  • e) The phyla most closely related to Chordates are Hemichordates (like the Acorn worm) and Echinoderms (Starfish, Sea urchins, Sea cucumbers).
  • f) Vertebrates can be divided into the following two groups:
    • 1) Fishes: 3 Classes:
      • 1a) Jawless  (lampreys)
      • 1b)  Cartilaginous (sharks, rays, chimaeras)
      • 1c) Bony fish
    • 2) Tetrapods: all the rest (frogs, snakes, birds, mammals)

For the following analysis, I will consider vertebrates versus everything which preceded them (all metazoa, including “pre-chordates” (Hemichordates and Echinoderms) and “early chordates” (Tunicata and Cephalochordata)). So, everything which is new in vertebrates had to appear in the window between early chordates and the first vertebrates: cartilaginous fish and bony fish (I will not refer to lampreys, because the data are rather scarce). So, let’s try to define the temporal window, as far as that is possible:

  • Chordates are already present at the Cambrian explosion, 540 my ago.
  • Jawless fish appeared slightly later (about 530 my ago), but they are mostly extinct.
  • The split between cartilaginous fish and bony fish can be traced to about 450 my ago.

Therefore, with all the caution that is required, we can say that the information which can be found in both cartilaginous fish and bony fish, but not in non vertebrates (including early chordates), must have been generated in a window of less than 100 my, say between 540 my ago and 450 my ago. Now, my point is very simple: we can safely state that in that window of less than 100 million years a lot of new complex functional information was generated. Really a lot. To begin our reasoning, we can say that vertebrates are characterized by the remarkable development of two major relational systems:

  1. The adaptive immune system, which appears for the first time exactly in vertebrates.
  2. The nervous system, which is obviously well represented in all metazoa, but certainly reaches new important adaptations in vertebrates.

Much can be said about the adaptive immune system, and that will probably be the object of a future OP. For the moment, however, I will discuss some aspects linked to the development of the nervous system. The only point that is important here is that the nervous system of vertebrates undergoes many important modifications, especially a process of encephalization. My interest is mainly in the developmental controls that are involved in the realization of the new body plans and structures linked to those processes. Of course, we don’t understand how those regulations are achieved. But today we know much about some molecules, especially regulatory proteins, which have an important role in the embryonal development of the vertebrate nervous system, and in particular in the development and migration of neurons, which is obviously the foundation for the achievement of the final structure and function of the nervous system. So, I will link here a recent paper which deals with some important knowledge about the process of neuron migration. I invite all those interested to read it carefully: “Sticky situations: recent advances in control of cell adhesion during neuronal migration”, by David J. Solecki. Here is the abstract:

The migration of neurons along glial fibers from a germinal zone (GZ) to their final laminar positions is essential for morphogenesis of the developing brain; aberrations in this process are linked to profound neurodevelopmental and cognitive disorders. During this critical morphogenic movement, neurons must navigate complex migration paths, propelling their cell bodies through the dense cellular environment of the developing nervous system to their final destinations. It is not understood how neurons can successfully migrate along their glial guides through the myriad processes and cell bodies of neighboring neurons. Although much progress has been made in understanding the substrates (1-4), guidance mechanisms (5-7), cytoskeletal elements (8-10), and post-translational modifications (11-13) required for neuronal migration, we have yet to elucidate how neurons regulate their cellular interactions and adhesive specificity to follow the appropriate migratory pathways. Here I will examine recent developments in our understanding of the mechanisms controlling neuronal cell adhesion and how these mechanisms interact with crucial neurodevelopmental events, such as GZ exit, migration pathway selection, multipolar-to-radial transition, and final lamination.

In brief, the author reviews what is known about the process of neuronal cell adhesion and migration. Starting from that paper and some other material, I have chosen a group of six regulatory proteins which seem to have an important role in the above process. They are rather long and complex proteins, particularly good for an information analysis. Here is the list. I give first the name of the protein, and then the length and accession number in Uniprot for the human protein:

  • Astrotactin 1,     1302 AAs,     O14525
  • Astrotactin 2,    1339 AAs,     O75129
  • BRNP1 (BMP/retinoic acid-inducible neural-specific protein 1),     761 AAs,     O60477
  • Cadherin 2 (CADH2),      906 AAs,    P19022
  • Integrin alpha-V,    1048 AAs,      P06756
  • Neural cell adhesion molecule 1 (NCAM1),   858 AAs,  P13591

This is a  very interesting bunch of molecules:

  • Astrotactin 1 and 2 are two partially related perforin-like proteins. ASTN-1 is a membrane protein which is directly responsible for the formation of neuron–glial fibre contacts. ASTN2 is not a neuron-glial adhesion molecule, but it functions in cerebellar granule neuron (CGN)-glial junction formation by forming a complex with ASTN1 to regulate ASTN1 cell surface recruitment. More about these very interesting proteins can be found in the following paper:

Structure of astrotactin-2: a conserved vertebrate-specific and perforin-like membrane protein involved in neuronal development by Tao Ni, Karl Harlos, and Robert Gilbert

  • BRNP1 is another protein which functions in neural cell migration and guidance
  • Cadherin 2, or N-cadherin, is active in many neuronal functions and in other tissues, and seems to have a crucial role in glial-guided migration of neurons
  • Integrin alpha-V, or Vitronectin receptor, is one of the 18 alpha subunits of integrins in mammals. Integrins are transmembrane receptors that are the bridges for cell-cell and cell-extracellular matrix (ECM) interactions.
  • NCAM1 is a cell adhesion molecule involved in neuron-neuron adhesion, neurite fasciculation, outgrowth of neurites

Now, why have I chosen these six proteins, and what do they have in common? They have two important things in common:

  • They are all big regulatory proteins, and they are all involved in a similar regulatory network which controls endocytosis, cell adhesion and cell migration in neurons, and therefore is in part responsible for the correct development of the vertebrate nervous system
  • All those six proteins present a very big information jump between pre-vertebrate organisms and the first vertebrates

The evolutionary history of those six proteins is summarized in the following graph, produced as usual by computing the best homology bit score with the human protein in different groups of organisms.

[Figure: Neuron_migration]

Very briefly, all six human molecules have low homology with pre-vertebrates, while they already show a very high homology in cartilaginous fishes. The most striking example is probably Astrotactin 2, which presents the biggest jump, from cephalochordata (329 bits) to cartilaginous fishes (1860 bits), for a total jump of 1531 bits! The range of individual jumps in the group is 745 – 1531 bits, with a mean jump of 1046 bits per molecule and a total jump of 6275 bits for all six molecules. The jump has always been computed as the difference between the best bit score in cartilaginous fishes and the best bit score in all pre-vertebrate metazoa. We can also observe that the first three proteins have really low homology with everything up to tunicates, but show a definite increase in Cephalochordata, which precedes the big jump in cartilaginous fishes, while the other three molecules have a rather constant behaviour in all pre-vertebrate metazoa, with a few hundred bits of homology, before “jumping” up in sharks. One could ask: is that a common behaviour of all proteins? The answer is no. Look at the following graph, which shows the same evolutionary history for two other proteins, both of them very big regulatory proteins, both of them implied in the same processes as the previous six.

[Figure: Neuron_migration2]

Here, the behaviour is completely different. While there is a slight increase of homology in time, with a few smaller “jumps”, there is nothing comparable to the thousand bit jumps in the first six molecules. IOWs, these two molecules already show a very high level of homology to the human form in pre-vertebrates, and change only relatively little in vertebrates. We can say, therefore, that most of the functional information in these two proteins was already present before the transition to vertebrates.
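
The jump computation described above for the six proteins can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `information_jump` and the dictionary layout are hypothetical, and the only per-protein scores used are the two values quoted in the text for Astrotactin 2 (the cephalochordate score is taken to be the best pre-vertebrate hit, as the text implies). The other per-group scores are not given in the post and are not reproduced here.

```python
def information_jump(best_bits_by_group, pre_vertebrate_groups, target_group):
    """Difference between the best bit score in the target group and the
    best bit score found in any of the pre-vertebrate groups."""
    baseline = max(best_bits_by_group[g] for g in pre_vertebrate_groups)
    return best_bits_by_group[target_group] - baseline


# Astrotactin 2: only the two scores quoted in the text (hypothetical layout).
astn2_scores = {"cephalochordata": 329, "cartilaginous_fishes": 1860}
astn2_jump = information_jump(
    astn2_scores, ["cephalochordata"], "cartilaginous_fishes"
)
print(astn2_jump)  # 1531

# Summary figures quoted in the text: total jump and number of proteins.
total_jump_bits = 6275
n_proteins = 6
mean_jump_bits = round(total_jump_bits / n_proteins)
print(mean_jump_bits)  # 1046
```

With a full table of best bit scores per group, the same function could be applied to each protein and the results summed.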

So, to sum up:

  • a) The six proteins analyzed here all exhibit a huge informational jump between pre-vertebrates and vertebrates. The total functional informational novelty for just this small group of proteins is more than 6000 bits, with a mean of more than 1000 bits per protein.
  • b) These proteins are probably crucial agents in a much more complex regulation network implied in neuron adhesion, endocytosis, migration, and in the end in the vast developmental process which makes individual neurons migrate to their specific individual locations in the vertebrate body plan.
  • c) The above process is certainly much more complex than the six proteins we have considered, and implies other proteins and obviously many non coding elements. Our six proteins, therefore, can be considered as a tiny sample of the general complexity of the process, and of the informational novelty implied in the process itself.
  • d) Moreover, the process regulating neuron migration is certainly strictly integrated, with so many agents working in a coordinated way. Therefore, there is obviously a strong element of irreducible complexity implied in the whole informational novelty of the vertebrate process, an element that we can only barely envisage, because we still understand too little.
  • e) The neuron regulation process, of course, is only a part of the informational novelty implied in vertebrates, a small sample of a much more complex reality. For example, there is a lot of similar novelty implied in the workings of the immune system, of the cytokine signaling system, and so on.
  • f) The jump described here is really a jump: there is no trace of intermediate forms which can explain that jump in all existing pre-vertebrates. Of course, neo darwinists can always dream of lost intermediates in extinct species. This is a free world.
  • g) Are these 6000+ bits of functional information really functional? Yes, they are. Why? Because they have been conserved for more than 400 million years. Remember, the transition we have considered happens between the first chordates and cartilaginous fish, and it can be traced to that range of time. And those 6000+ bits are bits of homology between cartilaginous fish and humans.
  • h) How much is 6000 bits of functional information? It is really a lot! Remember, Dembski’s Universal Probability Bound, taking into consideration the whole reasonable probabilistic resource of our whole universe from the Big Bang to now, is just 500 bits. 6000 bits correspond to a search space of 2^6000, IOWs about 10^1806, a number so big that we cannot even begin to visualize it. It’s good to remind ourselves, from time to time, that we are dealing with exponential values.
  • i) How great is the probability that 6000 bits of functional information can be generated in a time window of less than 100 million years, by some unguided process of RV + NS in six objects connected in an irreducibly complex system, even if RV were really helped by some NS in intermediates of which there is no trace? The answer is simple: practically non existent.
  • j) Therefore, the tiny sample of six proteins that we have considered here, which is only a small part of a much bigger scenario, points with extreme strength to a definite design inference:

The transition to vertebrates was a highly engineered process. The necessary functional information was added by design.
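
As a rough check of the exponent arithmetic in the summary, here is a minimal sketch that converts a number of bits into the corresponding power of ten. It assumes only that n bits correspond to 2^n states; the helper name `bits_to_log10` is hypothetical.

```python
import math


def bits_to_log10(bits):
    """Return x such that 2**bits equals 10**x."""
    return bits * math.log10(2)


upb_exponent = round(bits_to_log10(500), 1)
jump_exponent = round(bits_to_log10(6000), 1)
print(upb_exponent)   # 150.5
print(jump_exponent)  # 1806.2

# Exact cross-check with Python integers: number of decimal digits of 2**6000.
digits_2_6000 = len(str(2 ** 6000))
print(digits_2_6000)  # 1807
```

So 500 bits is about 10^150 and 6000 bits is about 10^1806 in decimal terms.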

 

Comments
bill: I just found this statement you made at TSZ:
The papers that I have read give two proposals on de novo genes. 1. Gene duplication 2. NC RNA. In one paper there is a vague proposal on once a gene is duplicated and finds new function, how it then gets transcribed. The real challenge is finding new function through almost infinite mathematical space. First a mechanism of change needs to be identified. Then a mathematical model that can repeatably get you from functional space A to functional space B which is the new gene as a result of gene duplication. The additional challenge here is natural selection does not help until function is found and successfully transcribed. While Lenski’s experiment showed through gene duplication and transcription of a duplicated gene how a new feature could evolve (the ability to consume citrate in an aerobic condition) how did the original 480AA enzyme sequence that breaks down the citrate molecule evolve?
You said it perfectly! I wholly agree with you. :) I must say that the way our interlocutors arrogantly exhibit their few examples of microevolution, involving one or two simple events (including Lenski's), as though they were splendid answers to the information problem frankly nears intellectual dishonesty. And about Lenski and the meaning of the Cit+ phenotype, it can be useful to read again this simple summary from Wikipedia:
Other researchers have experimented on evolving aerobic citrate-utilizing E. coli. Dustin Van Hofwegen et al., working in the lab of Scott Minnich, were able to isolate 46 independent citrate-utilizing mutants of E. coli in just 12 to 100 generations using highly prolonged selection under starvation, during which the bacteria would sample more mutations more rapidly.[43] In their research, the genomic DNA sequencing revealed an amplification of the citT and dctA loci and rearrangements of DNA that were the same class of mutations identified in the experiment by Richard Lenski and his team. They concluded that the rarity of the citrate-utilizing mutant in Lenski's research was likely a result of the selective experimental conditions used by his team rather than being a unique evolutionary speciation event.[43] John Roth and Sophie Maisnier-Patin reviewed the approaches in both the Lenski team's delayed mutations and the Van Hofwegen team's rapid mutations on E. coli. They argue that both teams experienced the same sequence of potentiation, actualization, and refinement leading up to similar Cit+ variants.[44] According to them, the period of less than a day during which citrate usage would be under selection, followed by 100-fold dilution, and a period of growth on glucose that would not select for citrate use, ultimately lowered the probability of E. coli being able to accumulate early adaptive mutations from one period of selection to the next.[44] On the other hand, Van Hofwegen's team allowed for a continuous selection period of 7 days, which yielded a more rapid development of citrate-using E. coli. Roth and Maisnier-Patin suggest that the serial dilution of E. coli and short period of selection for citrate-use under the conditions of the LTEE perpetually impeded each generation of E. coli from reaching the next stages of aerobic citrate utilization.[44] In response, Blount and Lenski acknowledge that the problem is not with the experiments or the data, but with the interpretations made by Van Hofwegen et al. and Maisnier-Patin and Roth.[45] Lenski notes that the rapid evolution of Cit+ was not necessarily unexpected since his team was also able to produce multiple Cit+ mutants in a few weeks during the replay experiments they reported in the 2008 paper in which his team first described the evolution of aerobic citrate use in the LTEE.[46] Furthermore, Lenski criticizes Van Hofwegen et al.'s description of the initial evolution of Cit+ as a "speciation event" by pointing out that the LTEE was not designed to isolate citrate-using mutants or to deal with speciation since in their 2008 paper they said "that becoming Cit+ was only a first step on the road to possible speciation", and thus did not propose that the Cit+ mutants were a different species, but that speciation might be an eventual consequence of the trait's evolution.[46] Lenski acknowledges that scientists, including him and his team, often use short hand and jargon when discussing speciation, instead of writing more carefully and precisely on the matter, and this can cause issues.[46] However, he notes that speciation is generally considered by evolutionary biologists to be a process, and not an event.[46] He also criticizes Van Hofwegen et al. and Roth and Maisnier-Patin for positing "false dichotomies" regarding the complex concept of historical contingency. He argues that historical contingency means that history matters, and that their 2008 paper presented data that showed that the evolution of Cit+ in the LTEE was contingent upon mutations that had accumulated earlier.
He concludes that "...historical contingency was invoked and demonstrated in a specific context, namely that of the emergence of Cit+ in the LTEE—it does not mean that the emergence of Cit+ is historically contingent in other experimental contexts, nor for that matter that other changes in the LTEE are historically contingent—in fact, some other evolved changes in the LTEE have been highly predictable and not (or at least not obviously) contingent on prior mutations in the populations."[46]
gpuccio
July 30, 2016, 01:15 AM PDT
Patrick at TSZ keeps inviting me to comment at their blog, instead of continuing with the interblog discussion. While I thank him for the kind interest, I must remind him that I post here, because here is my place, here are the people who care about ID, which is the theory I love, and so here I must express my ideas. Just answering some of the comments from TSZ is big work for me, and as you can see from my little misunderstanding with bill cole at #202, it is really done in a rush most of the time. But, at least, the discussion happens here, where those who come to UD can see it, if they like. Posting at TSZ, which I have done in the past, would take too much of my time and subtract it from my activity here. So, Patrick, just accept my choice. After all, you are not obliged to be interested in my thoughts, or if you are, you can always read them here (I suppose you are not banned from reading) and answer at TSZ, and if what you say is interesting, and if I have the time, I will answer here. It's cumbersome, but it's the best I can do.
gpuccio
July 30, 2016, 12:46 AM PDT
bill: I just saw this at TSZ, from Rumracket: "The first number is 1 in every 10^11 randomly generated protein sequences 80 amino acids in length, will have [The Function In Question]. It’s based on this paper: Functional proteins from a random-sequence library by Keefe & Szostak." So, I see they are still using that paper for their propaganda. They are shameless! "The second number is a creationist number, IIRC spuriously calculated by Douglas Axe back in some work he did in 2004." So, the completely biased paper by Keefe & Szostak, which shamelessly uses intelligent selection to get some folding about a completely non selectable function, is great, while Axe's work is "spurious". Why don't we add the 10^70 starting sequences computed to be necessary to get the wildtype sequence in the rugged landscape paper? The ways neo darwinists try to convince us that their fantasies about a function filled protein space have any objective support range from pitiful to dishonest. By the way, have you noticed that the total functional information content of the beta chain of ATP synthase as conserved from E. coli to humans (about 600 bits), while huge and amazing, is still much lower than the mean functional information jump of the six proteins in my OP? Ah, but certainly neo darwinists do not want to deal with those numbers! Coming from a crypto-creationist, they must certainly be "spurious", at best! :) So, let them stick to their lies about Keefe & Szostak's paper.
gpuccio
July 30, 2016, 12:35 AM PDT
bill: Thank you for clarifying. I apologize for misunderstanding the roles. Indeed, I had suspected that colewd could be you, but frankly I did not have the time to follow the whole discussion there well. I suppose that my points at @202 remain valid as an answer to Alan Fox. :)
gpuccio
July 29, 2016, 10:59 PM PDT
Gpuccio The question came from Alan Fox to Colewd. Since it referenced you I thought I would bring it to you since I am colewd :-) I will edit colewd and put Alan in and respond. Thanks
bill cole
July 29, 2016, 02:46 PM PDT
bill: I am not sure that I understand colewd's point well, and right now I do not have the time to look at all his posts at TSZ (which seem to be many!). I talk about superfamilies of related functional proteins because that's what we observe. It's not me who classifies proteins in superfamilies. It's SCOP and other important databases. What I don't understand is what he means by "putative proteins that have not been already exploited by some organism". For example, we have ATP synthase, and the alpha and beta chains, as I have debated many times, are very specific and extremely conserved, from prokaryotes to humans. So, it is perfectly reasonable to consider those two sequences as islands of function. Maybe colewd thinks that there are a lot of completely different sequences, with a near constant distribution across sequence space (whatever that means), which could easily work in ATP synthase in the place of the alpha and beta chain. But why should that be true? Has he any reason to believe such a weird thing, beyond simple imagination? The movements in sequence space are realized, as should be obvious, at sequence level. If two sequences have no detectable homology, what is the probability of getting from one to the other by simple random variation? Practically non existent. So, the only sequence space where you could find another beta chain which works, and is not related to our well known beta chain, is a sequence space which is practically filled by different sequences which would all work as beta chains of ATP synthase. Such a space exists, I am afraid, only in the fervid imagination of those who cannot admit the simple truth taught by observed facts: protein superfamilies are isolated islands of function in the sequence space, and there are 2000+ of them in the known proteome. If I am misunderstanding colewd's point, I am ready to listen to his clarifications. Just let me know.
gpuccio
July 29, 2016, 02:35 PM PDT
Gpuccio Here is a question to me regarding you from Alan Fox at TSZ. I wanted your thoughts before I responded.
@ colewd. This is the crux of arguments over functionality. Gpuccio talks about superfamilies of related functional proteins as if the distribution of functional proteins varies between proteins that are known and putative proteins that have not been already exploited by some organism. Why should the density vary across sequence space? Why should not the occurrence of functional proteins turn out to be near constant across the whole space of all possible protein sequences?
bill cole
July 29, 2016, 12:43 PM PDT
Positive selection: PPO ¦ PP POP ¦ P P PPO ¦ PP POO ¦ P POO ¦ P Negative selection: NOOON ¦ OOO NNOON ¦ OO NONON ¦ O O NOONN ¦ OO NOOON ¦ OOO :)
Dionisio
July 29, 2016, 12:02 PM PDT
gpuccio @197 Anyway, the indicated 'mutations' did not affect the functional information in the associated statements. :) But definitely they were not designed. :)
Dionisio
July 29, 2016, 11:33 AM PDT
gpuccio @193
One more reason not to try any design inference here.
Here you've demonstrated honesty by observing the established rules.
Dionisio
July 29, 2016, 11:21 AM PDT
Dionisio: Corrected! :)
gpuccio
July 29, 2016, 11:06 AM PDT
@193 "Now, the point is: these cases exit, but they are definitely rare." "If we can reasonably argue that he changes are functional [...]" "So, what can we say of Dave Carlson’t example? If we have no data bout conservation, [...]"
Dionisio
July 29, 2016, 11:01 AM PDT
gpuccio @193 Glad you've explained the concept of "positive selection", which I had not understood before. Maybe I've got it now? :) Thank you.
Dionisio
July 29, 2016, 10:41 AM PDT
@189 gpuccio quoted one of his politely-dissenting tsz interlocutors saying:
I assume this because you were the only person I was really talking to last time I was at UD.
Well, wrong assumption (again). They make wrong assumptions too often. :) Even in the cases where gpuccio has been the explicitly addressed person, in public discussion threads other persons may read those comments too and could tip off the administrators of the blog, or even request the banning. gpuccio is well known at UD for his patient and polite treatment of his dissenting interlocutors. He even publicly misses debating with those interlocutors, probably because they make him write more to the benefit of the anonymous visitors (a.k.a. onlookers or lurkers) who may draw their own conclusions based on what they read here. I lack that kind of patience, hence I consider some interlocutors unnecessary distractions and welcome their absence. Good riddance. :) Biology-related discussions demand serious concentration. Participants should be willing to understand all presented positions well. Otherwise they should go back to their natural habitat in the beautiful Norwegian fjords. :)
Dionisio
July 29, 2016, 08:43 AM PDT
Now, let's go to serious things. Dave Carlson, at TSZ, gives this very interesting answer to dazz:
dazz: Question for the experts. Is there any know case in which a relatively or even highly conserved protein(s) at some point in some lineage have undergone a faster evolution? IOW, some protein has stopped being subject to significant selective pressure? Yes, there are many such examples. In fact, one popular method for identifying sequences under positive selection is to look for lineages in a phylogeny that have undergone relatively rapid amino acid substitution in comparison with lineages in which the sequence is more highly conserved is (more specifically, looking for elevated rates of non-synonymous substitutions compared to synonymous substitutions). Here is an example of such a case from my own research (as yet unpublished, though I’m working on it!). This is small snippet of an alignment of tropomyosin sequences found in six spider species from 3 genera (two species per genera). There are very few amino acid substitutions in any of the species over most of the length of the sequence, but here toward the middle there is a relatively high concentration of non-synonymous substitutions in two of the species from the same genera. This is a pretty strong signal of positive selection acting in the ancestor of these two species. There many more published examples of this kind of analysis.
He also gives the alignment. Of course, dazz is ready to equivocate: "So in this case, a loss of conservation led to speciation, or played a roll in it. I guess that would imply a loss of functional restraint and therefore, a loss of information according to gpuccio?" It's strange how some self-proclaimed darwinists understand so little of their own theory. So, let's discuss a little this misunderstood point: positive selection. Now, positive selection is a strange paradox: it should be the strong mechanism in all neo darwinian explanations, the core of the theory. It is, indeed, the basic foundation of the neo darwinian theory: that RV which generates a reproductive advantage is positively selected, expanded and fixed. But the paradox is that positive selection is really, really rare. Let's understand better. There is a kind of positive selection that we can observe directly. It is well documented, and is found in those well known cases of "microevolution": simple antibiotic resistance, S hemoglobin in malaria zones, the rugged landscape experiment with phages, and so on. Well, they are not so many, after all. But they exist, they are well documented, and they are well understood. What do they have in common? They are all about small transitions, a few bits, one, two aminoacids, maybe sometimes three or four. In no case are those transitions, while giving some reproductive advantage in the appropriate environment, steps to some more complex transition. They remain, absolutely, micro-transitions. But there is also an indirect way to "detect" positive selection. The idea is: if some sequences change more than they should by neutral variation, or if some sequences are conserved for long times, and suddenly they change a lot, then we can infer positive selection: those changes are positively retained because they give some advantage.
The simplest way is to compare the non-synonymous mutation rate with the synonymous mutation rate (which is a measure of neutral variation): the Ka/Ks ratio. If the ratio is much lower than 1, then the sequence is conserved, and is subject to negative selection. That's the case with almost all protein genes, with different degrees of low Ka/Ks ratio. On the other hand, if the ratio is higher than 1, positive selection is assumed. That's exactly the case with Dave Carlson's example. He does not give us an explicit Ka/Ks analysis, but he observes "elevated rates of non-synonymous substitutions compared to synonymous substitutions", which is the same idea.

Now, the point is: these cases exist, but they are definitely rare. An important example, which applies a similar concept, is the detection of HARs in humans: segments of the human genome that are conserved throughout vertebrate evolution but are strikingly different in humans.

So, what does the example presented by Dave Carlson mean? Is it an example of a possible design inference? Well, it's not so simple. First of all, we have to try to infer whether the changes documented here (in spiders) can reasonably be considered functional. There are many ways to do that. One would be, as I have done for my proteins, to check whether the changed sequences have been conserved for a long enough time. Another way (more difficult) would be to understand the function of those changed sequences. IOWs, we should try to exclude the possibility that the changes are simply random neutral variation (maybe the result of some random insertion).

If we can reasonably argue that the changes are functional (and I am absolutely open to that interpretation), the question remains: should they, in the light of ID theory, be considered as random variation subject to positive selection, or can we infer a design intervention? According to ID theory, the answer is simple: we have to measure the functional complexity of the transition.
Only a very high level of functional complexity in the transition allows a design inference. How high? Well, 500 bits is Dembski's UPB (universal probability bound). That is a safe threshold, and it corresponds to about 115 necessary AAs. I have suggested a more realistic threshold of 150 bits for the biological world (about 35 necessary AAs). Empirical data (Behe, Axe) suggest that the real threshold could be nearer to 5 - 10 necessary AAs (22 - 43 bits). I will stick here to my 150 bits as a safe threshold.

So, what can we say of Dave Carlson's example? If we have no data about conservation, or about function, we can only compute the total potential information content of the transition: 20^n possible sequences for n AAs, which is about 4.32 bits per AA. That is the search space, not the functional information. However, the functional information in a sequence can never be higher than the search space, so the search space is an upper limit for it.

Now, I don't know how many AAs are involved in the positive selection. If I look at his image, I would say only a few, maybe more than 10, but not many more. But his image could be only part of the whole process. However, with fewer than about 35 AAs involved in the transition, I would not consider a design inference. That's not to say that the transition was not designed (as our enemies often argue, correctly, it is impossible to exclude design), but certainly there is no real reason to invoke design for such a simple transition, which could still be explained, with some concessions, by non design mechanisms.

Moreover, in the absence of functional data, we can only say that the search space is an upper limit for functional information, but the true value could be much lower. One more reason not to try any design inference here. I hope that is clear. This is an important point which is often misunderstood (not only by dazz).
gpuccio
July 29, 2016, 08:27 AM PDT
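The arithmetic in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original analysis: the function names are invented here, the bit values assume log2(20) (about 4.32 bits) for each fully constrained amino acid position, and the Ka/Ks rule is the plain "below 1 or above 1" reading used in the comment, whereas real studies rely on statistical tests rather than a bare cutoff.

```python
import math

# Assumption: one fully specified amino acid position out of 20 carries log2(20) bits.
BITS_PER_AA = math.log2(20)


def search_space_bits(n_residues: int) -> float:
    """Upper limit, in bits, for the information in n residues (search space 20^n)."""
    return n_residues * BITS_PER_AA


def residues_for_bits(bits: float) -> float:
    """Number of fully constrained residues that corresponds to a threshold in bits."""
    return bits / BITS_PER_AA


def classify_ka_ks(ka: float, ks: float) -> str:
    """Naive reading of the Ka/Ks ratio (hypothetical helper, no statistics)."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio > 1:
        return "positive selection assumed"
    if ratio < 1:
        return "negative selection (conserved)"
    return "consistent with neutral variation"


if __name__ == "__main__":
    for threshold in (500, 150):
        print(f"{threshold} bits ~ {residues_for_bits(threshold):.1f} AAs")
    for n in (5, 10, 35):
        print(f"{n} AAs -> at most {search_space_bits(n):.1f} bits")
    print(classify_ka_ks(0.02, 0.40))
```

Because the search space is only an upper limit, `search_space_bits` can overestimate the functional information but never underestimate it, which is why a small count of changed residues cannot reach the 150 bit threshold.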
gpuccio @189
However, as a convinced neo darwinist, you should have more faith in coincidences. :)
Yes, good point. How else can they support all the nonsense they write? :)
Dionisio
July 29, 2016, 08:06 AM PDT
gpuccio @188 Hard to tell why that "politely dissenting" interlocutor does not seem to understand your clear explanations. Could it be that their motives are...? Oh, no, let's not talk about that now. :)
Dionisio
July 29, 2016, 07:56 AM PDT
PeterJ @180 [follow-up to comments @181-184] Page 6.
Discussion [second paragraph]
Although the paucity in LUCA of genes for aminoacid and nucleoside biosynthesis could, in principle, be attributable to post-LUCA LGT, we note that there is no viable alternative to the view that LUCA, regardless of how envisaged, ultimately arose from components that were synthesized abiotically via spontaneous, exergonic syntheses somewhere during the history of early Earth.
[beginning of right column]
Prior to the origin of genes, proteins and the code, LUCA's origin was hence dependent on spontaneous organic syntheses, which are thermodynamically favourable under the high H2 activities of submarine hydrothermal vents, and which still occur today in some geochemical environments.
Where's the beef? Can they reproduce all that in a lab? Where is the detailed description of the spatiotemporal mechanisms that produce all that?
Dionisio
July 29, 2016, 07:42 AM PDT
Alicia Cartelli (at TSZ): "Hi Pucci! I’ve missed you as well! Unfortunately I can’t comment on your post because (I’m assuming) you had me blocked. I assume this because you were the only person I was really talking to last time I was at UD. I even tried to make a new account while you were gone for a bit, but it appears you guys know my IP."

Alicia, glad to find you in your usual form! You can believe me or not, but I never "had you" (or anyone else) "blocked".

Alan Fox: "I suspect Barry Arrington keeps a tight hold on the reins of power. It is simple to block IPs from the admin dashboard. I doubt gpuccio had a hand in it."

Well, it seems that Alan Fox believes me. :)

Alicia Cartelli: "I guess, but my silent ban coinciding with gpucci’s immediate absence after talking to him for a while was a bit too much of a coincidence for me. Oh well."

And you apparently don't. Not too much, at least. OK, I can live with that, I suppose. However, as a convinced neo darwinist, you should have more faith in coincidences. :)
gpuccio
July 29, 2016, 07:40 AM PDT
dazz (at TSZ):
I understand that conservation strongly suggests negative selective pressure and therefore, functional constraint. I also understand that neutral evolution is an indication of lack of function. What I don’t think is meaningful is your metric for “functional information”: Is seems odd to me to quantify “functional information” by comparing proteins that retain the same function. For all I know most of the “information” that makes a shark different to a bony fish, or a mouse, is in gene regulation anyway.
Again, you seem not to understand. In my OP I am comparing proteins in shark (and bony fish) to all pre-vertebrates. That's where the jump takes place. IOWs, those 6000+ bits of information which appear in sharks are nowhere to be seen before. So, the reasoning has two steps:

1) There is a sudden jump in the first vertebrates
2) Those 6000+ bits are conserved up to humans

The two things together make my argument. My argument is about information which appears rather suddenly and which must be functional because, after it appears, it is conserved. Is that so difficult to understand?

Of course those six proteins in vertebrates must have some function different from the function of their weak homologues in pre-vertebrates. And that function must be specific for vertebrates, all vertebrates. A reasonable hypothesis is that neuronal migration is regulated differently in vertebrates, and that the new functional information in those six proteins, which are involved in neuronal migration, is related to the vertebrate body plan.

So, why do you speak of "proteins that retain the same function"? My point is that the jump is due to a different function, and that the conservation after the jump is due to the maintenance of the new function. Is it clear?
gpuccio
July 29, 2016, 07:35 AM PDT
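The two-step argument in the comment above (a jump at the first vertebrates, conserved afterwards) comes down to a subtraction of bitscores. Here is a minimal sketch, assuming that every bitscore is a BLAST bitscore measured against the human protein, so that conservation up to humans is already built into the number. The function name, the protein labels and all the values are hypothetical placeholders, not data from the OP.

```python
def information_jump(pre_vertebrate_bits, first_vertebrate_bits):
    """Bits of human-conserved homology that first appear at the vertebrate transition.

    pre_vertebrate_bits: bitscores of the human protein against pre-vertebrate hits.
    first_vertebrate_bits: best bitscore of the human protein in the first vertebrates
    (for example, cartilaginous fish).
    """
    # Only the best pre-vertebrate hit matters: anything it already explains
    # is not counted as new information.
    best_before = max(pre_vertebrate_bits, default=0.0)
    return first_vertebrate_bits - best_before


# Placeholder bitscores (NOT real data): each entry holds
# (hits in pre-vertebrates, best hit in the first vertebrates).
example_proteins = {
    "protein_A": ([210.0, 340.0], 1400.0),
    "protein_B": ([95.0], 980.0),
}

jumps = {
    name: information_jump(pre, first)
    for name, (pre, first) in example_proteins.items()
}
total_jump = sum(jumps.values())

if __name__ == "__main__":
    for name, bits in jumps.items():
        print(f"{name}: jump of {bits:.0f} bits")
    print(f"total: {total_jump:.0f} bits")
```

Summing per-protein jumps, as done for `total_jump`, treats the proteins as independent contributions; that is a simplifying assumption of this sketch.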
gpuccio @185
1) Please, look at Figure 1 in the paper, and just tell if that reminds you of anything (isolated islands of function?) :)
Exactly! They just call them 'disconnected regions' but your term 'IIOF' seems more accurate. [Top of Page 5]
Figure 1: Reversible causal graphs demonstrate that 'you can't always get there from here' as the state space is composed of many disconnected regions
Dionisio
July 29, 2016, 06:51 AM PDT
GP, IIRC, I got that from you first. KF
kairosfocus
July 29, 2016, 05:18 AM PDT
Dionisio and others: I have no time now, but I have just noticed this new thread: https://uncommondescent.com/origin-of-life/davies-and-walker-life-not-reducible-to-known-physical-principles/ and the very interesting paper linked there. Just two brief comments:

1) Please, look at Figure 1 in the paper, and just tell if that reminds you of anything (isolated islands of function?) :)

2) Beautiful quote from Einstein in the paper: "One can best feel in dealing with living things how primitive physics still is."

And a question: is invoking "new laws of physics, hitherto unknown . . ." (Schroedinger) still "methodological naturalism"? Materialism? Physicalism? Or what else? Well, I would say: it really depends on what those "new laws" look like! :)

And what if the "hard problem of life" were strictly connected to the "hard problem of consciousness"? And what if the "new laws of physics" were strictly connected to consciousness and design?
gpuccio
July 29, 2016, 01:00 AM PDT
PeterJ @180 [follow-up to comments @181-183] Here’s a summary of the paper contents:
Page 1:
- Tracing proteins to LUCA by removing transdomain LGTs.
- LUCA's microbial ecology reconstructed from genomes.
Page 2:
- Figure 1: Phylogeny for LUCA's genes
- LUCA's genes point to acetogenic and methanogenic roots
- Hydrothermal vents, methyl groups, and nucleoside modifications
Page 3:
- Figure 2: Taxonomic distribution of LUCA's genes grouped by functional categories
Page 4:
- Figure 3: LUCA reconstructed from genome data
- Spelling out caveats and allowing for some LGT
Page 5:
- Figure 4: Methyl groups in conserved modified nucleosides and in anaerobic autotroph metabolism
Page 6:
- Discussion
- Methods
Page 7:
- References
Page 8:
- References (continuation from page 7)
Dionisio
July 28, 2016, 06:36 PM PDT
PeterJ @180 [follow-up to comments @181-182] Here's a summary of the paper contents:
Page 1:
- Tracing proteins to LUCA by removing transdomain LGTs.
- LUCA's microbial ecology reconstructed from genomes.
Page 2:
- Figure 1: Phylogeny for LUCA's genes
- LUCA's genes point to acetogenic and methanogenic roots
- Hydrothermal vents, methyl groups, and nucleoside modifications
Page 3:
- and so on...
Can you open it?
Dionisio
July 28, 2016, 08:51 AM PDT
PeterJ @180 Here's a link to an ePDF copy of the paper you referenced: http://www.nature.com/articles/nmicrobiol2016116.epdf?referrer_access_token=rFGrUrql0RCI1NwOak1vd9RgN0jAjWel9jnR3ZoTv0MUGiEHcCbkW0uWqU-Z8_VoVX7xGnFSz9mbM_GrJrqWbVaUTMLiv2V8vcdy5s1Z_kNWk2DNZvYfRpderyRUdgGcjW2N7e--kOQ_tjoMdnjpB7nD_v8eNTnS9mSz_D3DD0Q4ChLMUiMh-kJ47PYRGlG_6VguR3sWm0737PS7dYGIfq8Ice44eZOZO9NAhdVv_6XNmvpVTAq0UFGlCQ-yK0k8&tracking_referrer=www.foxnews.com
Dionisio
July 28, 2016, 08:11 AM PDT
PeterJ @180 Thank you for indirectly, through a pop-sci article, referring to this paper:
Weiss, M. C., F. L. Sousa, N. Mrnjavac, S. Neukirchen, M. Roettger, S. Nelson-Sathi and W. F. Martin, 2016, The physiology and habitat of the last universal common ancestor. Nature Microbiology. vol. 1, Article number: 16116 (2016) doi:10.1038/nmicrobiol.2016.116 http://www.nature.com/articles/nmicrobiol2016116
Did you read this paper completely? Did you understand it?
Dionisio
July 28, 2016, 07:40 AM PDT
Fox News has an article which supposedly reveals what scientists have discovered regarding the origin of life. Perhaps it has already been viewed here, but if not, it's worth looking at. http://www.foxnews.com/science/2016/07/27/study-this-is-where-first-life-on-earth-began.html
PeterJ
July 28, 2016, 05:11 AM PDT
dazz (at TSZ): "Question for the experts. Is there any know case in which a relatively or even highly conserved protein(s) at some point in some lineage have undergone a faster evolution? IOW, some protein has stopped being subject to significant selective pressure?"

Your question is not very clear: are you interested in "faster evolution" in the sense of faster degradation (loss of homology because of neutral variation), or rather in the sense of faster change because of beneficial mutations and positive selection? Rumraket seems to interpret your question in the first sense, while Dave Carlson answers the second. Both their answers are perfectly good. Not so your considerations. To Rumraket, you comment:
Thanks, makes complete sense. Seems to me that in order to provide some meaningful metric of conservation in vertebrates that might help measure functional constrain in vertebrates for those proteins, one would need to align the proteins for ALL current vertebrates and determine what’s conserved across the board. Does that make any sense? I still wouldn’t call that a measure of “functional information”
Wrong. See my answer to John Harshman at #178. And I wonder if you have really understood Rumraket's point. I paste it here for your convenience:

"Not that I’m an expert, but off the top of my head, genes involved in eye development in blind cave fish. ... Another thing could be genes involved in tooth formation in toothless birds. ... In general, most pseudogenes would more or less qualify. "

Just to make it clearer, he is giving examples of genes which are rapidly degraded because there is no longer any need for their function in that species, or simply because they are no longer functional at all (pseudogenes). That is simply a confirmation of how powerful random variation is when there are no functional constraints, and it absolutely supports my point that conservation through long periods of time is a very good measure of functional information. Is it clear?

Now, as soon as I find time, I will comment on Dave Carlson's very interesting post about positive selection, which opens new important questions, and then I will go back to your comments about that.
gpuccio
July 28, 2016, 01:25 AM PDT
John Harshman (at TSZ):
dazz: "Seems to me that in order to provide some meaningful metric of conservation in vertebrates that might help measure functional constrain in vertebrates for those proteins, one would need to align the proteins for ALL current vertebrates and determine what’s conserved across the board. Does that make any sense?" I think that at the very least you would have to ignore pseudogenes. I presume the question would be how much a protein could vary and still retain the same function. Proteins that have lost all function (really, genes that don’t produce proteins any more) should not count, nor should proteins that have significantly altered function. And now you have to ask what “significantly” means. Even proteins like cytochrome c, which does the same thing in all eukaryotes, must adapt to their cellular environments: operating temperatures, pH, changes in associated proteins, etc. It’s not a simple question.
I absolutely agree with you! My reasoning in the OP makes the following very reasonable assumptions:

1) If we find 1000+ bits of homology between shark and humans, we can really infer that that sequence was more or less present in a common ancestor. I think you understand why better than others. 1000+ bits of homology have practically zero probability of being a random observation. There is no way to explain them other than by conservation through evolutionary history. Alicia Cartelli, in the past, tried "convergent evolution", but after my rather fierce answer she seems to have abandoned that theory. I think you can understand that if we start explaining homologies in proteins by convergent evolution when it is convenient, the whole castle of evolutionary theory becomes senseless.

2) What can we say of other species in the same class of organisms that have lower homologies? For example, blasting ASTN2 vs bony fish, we get a wide range of hits, ranging from 1215 to 2009 (and there can certainly be other minor hits which are not shown in the main blast results). Now, as you say, it's not a simple question. As you say, pseudogenes should not be considered, and the lowest hits, if present, are probably trivial. Other hits are with ASTN1. The hits with molecules labeled as ASTN1 show a rather consistent range, 1215 - 1271, which is consistent with the partial homology between the two molecules in all species. However, even hits with ASTN2 molecules vary, from 1294 to 2009. That could be explained as some loss of information in some species, which is possible, or simply as special adaptations. As you say, "Even proteins like cytochrome c, which does the same thing in all eukaryotes, must adapt to their cellular environments". Whatever the cause, it has no relevance for the basic fact that the sequence with maximum homology in sharks must have been passed from the precursor of cartilaginous and bony fish to humans. And of course, as there is no reason to think that it was not exposed to neutral variation during those 400+ million years, it must have been conserved by negative selection.

3) What can we say of the variation which occurs after? Of course, there are further increases in the homology to humans in snakes, birds, and mammals. I have not interpreted them in my OP because, as you say, it's not a simple problem. I think that much of those further increases is functional, and is due to new, or simply different, functional requirements in the new classes of organisms. I have given the values in mouse, and the value in humans (the bitscore for identity), just as general information. But all my reasoning is about the information jump between pre-vertebrates and the first vertebrates.
gpuccio
July 28, 2016, 01:11 AM PDT