Uncommon Descent Serving The Intelligent Design Community

Interesting proteins: DNA-binding proteins SATB1 and SATB2


With this OP, I am starting a series (I hope) of articles whose purpose is to present interesting proteins which can be of specific relevance to ID theory, for their functional context and evolutionary history.

DNA-binding protein SATB1

SATB1 (accession number Q01826) is a very intriguing molecule. Let’s start with some information we can find at Uniprot, a fundamental protein database, about what is known of its function (in the human form):

Crucial silencing factor contributing to the initiation of X inactivation mediated by Xist RNA that occurs during embryogenesis and in lymphoma

And:

Transcriptional repressor controlling nuclear and viral gene expression in a phosphorylated and acetylated status-dependent manner, by binding to matrix attachment regions (MARs) of DNA and inducing a local chromatin-loop remodeling. Acts as a docking site for several chromatin remodeling enzymes

IOWs, it is an important regulatory protein involved in many different, and not necessarily well understood, processes, which binds to DNA and is involved in chromatin remodeling.

It is also involved in hematopoiesis (especially in T cell development), and has important roles in the biology of some tumors:

Modulates genes that are essential in the maturation of the immune T-cell CD8SP from thymocytes. Required for the switching of fetal globin species, and beta- and gamma-globin genes regulation during erythroid differentiation. Plays a role in chromatin organization and nuclear architecture during apoptosis.

Reprograms chromatin organization and the transcription profiles of breast tumors to promote growth and metastasis.

Keywords for molecular function: Chromatin regulator, DNA-binding, Repressor

Now, some information about the protein itself. I will refer, again, to the human form of the protein:

Length: 763 AAs. It’s a rather big protein, like many important regulatory molecules.

Its subcellular location is in the nucleus.

It is a multi-domain protein, with at least 5 detectable domains and many DNA binding sites.

Evolutionary history of SATB1

Now, let’s see some features of the evolutionary history of this protein in the course of metazoa evolution.

I will use here the same tools that I have developed and presented in my previous OP:

The amazing level of engineering in the transition to the vertebrate proteome: a global analysis

So, I invite all those who are interested in the technical details to refer to that OP.

Here is a graph of the levels of homology to the human protein detectable in other metazoan groups, expressed as mean bitscore per aminoacid site:

 

Fig. 1: Evolutionary history of SATB1 by human-conserved functional information

 

The green line represents the evolutionary history of our protein, while the red dotted line is the reference mean line for the groups considered, as already presented in my previous post quoted above (Fig. 2).

As everyone can see, this specific protein has a very sudden gain in human-conserved information with the transition from pre-vertebrates to vertebrates. So, it represents a very good example of the information jump that I have tried to quantify globally in my previous post.

Here, the jump is of almost 1.5 bits per aminoacid site. What does that mean?

Let’s remember that the protein is 763 AA long. Therefore, an increase of information of 1.5 bits per aminoacid corresponds to more than 1000 bits of information. To be precise, the jump from the best pre-vertebrate hit to the best hit in cartilaginous fish is:

1049 bits
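As a quick check, the arithmetic behind this figure can be sketched in a few lines of Python. This is only a minimal sketch: the function name is illustrative, and the two bitscores are the ones reported in this post for the best non vertebrate hit (154 bits) and for the best hit in cartilaginous fish (1203 bits).

```python
# Minimal sketch: information jump for human SATB1 (values taken from this post).
PROTEIN_LENGTH = 763          # human SATB1 length in aminoacids
BEST_PRE_VERTEBRATE = 154     # bits, best non vertebrate hit (spider)
BEST_CARTILAGINOUS = 1203     # bits, best cartilaginous fish hit (whale shark)


def information_jump(bits_before, bits_after, length):
    """Return (total jump in bits, jump in bits per aminoacid site)."""
    jump = bits_after - bits_before
    return jump, jump / length


jump_bits, jump_per_site = information_jump(
    BEST_PRE_VERTEBRATE, BEST_CARTILAGINOUS, PROTEIN_LENGTH
)
print(f"jump: {jump_bits} bits, {jump_per_site:.2f} bits per site")
```

With these two best hits the per-site value comes out at about 1.37 bits, in the same range as the jump of almost 1.5 bits read from the graph.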

But let’s see more in detail how the jump happens.

I will show here in detail some results of protein blasts. All of them have been obtained using the Blastp software at the NCBI site:

https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome

with default settings.

Here is the result of blasting the human protein against all known protein sequences except for vertebrate sequences:

Fig. 2: Results of blasting human SATB1 against all non vertebrate protein sequences
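For those who prefer to script a search like the one in Fig. 2, the request could in principle be prepared for the NCBI BLAST URL interface. The sketch below only builds the request URL and does not send anything. The database ("nr") and the Entrez string used to exclude vertebrates are assumptions of this sketch, not the exact settings of the web form used for the figure.

```python
# Sketch only: builds (but does not send) a blastp request URL.
from urllib.parse import urlencode

BLAST_ENDPOINT = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_request(accession, exclude_taxon):
    """Return a URL that would submit `accession` to blastp, excluding one taxon."""
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastp",
        "DATABASE": "nr",      # assumed database
        "QUERY": accession,
        # Assumed Entrez syntax for "everything except this taxon"
        "ENTREZ_QUERY": f"all[filter] NOT {exclude_taxon}[Organism]",
    }
    return f"{BLAST_ENDPOINT}?{urlencode(params)}"


url = build_blastp_request("Q01826", "Vertebrata")
print(url)
```

Keeping the taxon as a parameter makes it easy to reuse the same helper for other exclusions, but the results shown in this post come from the web interface with default settings.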

 

As can be seen, we find only low homologies in non vertebrates, and they are essentially restricted to a small part of the molecule, which corresponds to the first two domains in the protein, or just to the first domain. The image shows clearly that all the rest of the sequence has no detectable significant homologies in non vertebrates (except for a couple of very low homologies for the third domain).

The best hit in non vertebrates is 154 bits with Parasteatoda tepidariorum, a spider. Here it is:

Fig. 3: The best hit in non vertebrates (with a spider)

The upper line (Query) is the human sequence. The bottom line (Sbjct) is the aligned sequence of the spider. In the middle line, letters are identities, “+” characters are similarities (substitutions which are frequently observed in proteins, and are probably quasi-neutral), and empty spaces are less frequent substitutions, those that are more likely to affect protein structure and function if they happen at a functionally important aminoacid site.

The alignment here is absolutely restricted to AAs 71 – 245 (the first two domains), and involves only 177 AAs. Of these, only 78 (44%) are identities and 111 (62%) are positives (identities + similarities). So, in the whole protein we have only 78 identities out of 763 (10.2%).
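The percentages above can be reproduced with a short sketch. The helper name is illustrative, and truncating to whole percentages is an assumption chosen because it reproduces the 44% and 62% figures quoted for this alignment.

```python
# Minimal sketch: reproduce the percentages quoted for the spider alignment.
def percent(part, whole):
    """Whole-number percentage, truncated (assumed convention)."""
    return part * 100 // whole


ALIGNED = 177       # aligned positions in the best non vertebrate hit
IDENTITIES = 78
POSITIVES = 111
FULL_LENGTH = 763   # human SATB1

identity_pct = percent(IDENTITIES, ALIGNED)      # identities within the alignment
positive_pct = percent(POSITIVES, ALIGNED)       # positives within the alignment
whole_protein_pct = round(IDENTITIES / FULL_LENGTH * 100, 1)  # over all 763 AAs

print(identity_pct, positive_pct, whole_protein_pct)
```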

The spider protein is labeled as “uncharacterized protein”, and that is the case in most of the other non vertebrate hits.

All the other non vertebrate hits, with a couple of exceptions, are well below 100 bits, most of them between 70 and 86 bits.

IOWs, the protein as we know it in vertebrates essentially does not exist in non vertebrates.

Even non vertebrate deuterostomia, which should be the nearest precursors of the first vertebrates, have extremely low homology bitscores with the human protein:

Saccoglossus kowalevskii (hemichordates):  87 bits

Branchiostoma floridae (cephalochordate): 67 bits

The information jump in vertebrates

Now, what happens with the first vertebrates?

The oldest split in vertebrates is the one between cartilaginous fish and bony fish (from which the human lineage derives). Therefore, homologies that are conserved between cartilaginous fish and humans must reasonably have been already present in the Last Common Ancestor of Vertebrates, before the split between cartilaginous fish and bony fish, and have been conserved for about 420 million years.

So, let’s see the best hit between the human protein and cartilaginous fish. It is with Rhincodon typus (whale shark). Here it is:

 

Fig. 4: The best hit of human SATB1 in cartilaginous fish (with the whale shark)

 

Here, the alignment involves practically the whole molecule (756 AAs), and we have 1203 bits of homology, 603 identities (79%), 659 positives (86%).

IOWs, the two molecules are almost identical. And the homology is extremely high not only in the domain parts, but also in the rest of the protein sequence.

Now, the evolutionary time between pre-vertebrates and the first split in vertebrates is certainly rather small, a few million years, or at most 20 – 30 million years. Not a big chronological window at all, in evolutionary terms.

However, in that window, this protein appears almost complete. 603 aminoacids are already those that will remain up to the human form of the protein, and only 78 of them were detectable in the best hit before vertebrate appearance.

1049 bits of new, original functional information. In such a short evolutionary window.

Functionality

Why functional? Because those 603 aminoacids have remained the same through more than 400 million years of evolution. They have evaded neutral or quasi neutral variation, which would certainly have completely transformed the sequence over such a long evolutionary time, if those aminoacid sites were not under extreme functional constraint and purifying (negative) selection.

Now, I say that this fact cannot in any way be explained by any neo-darwinian model. Absolutely not.

Moreover, there is absolutely no evidence in the available proteome of any intermediate form, of any gradual development of the functional sequence that will be conserved up to humans (except, of course, for the 50 – 78 AAs which are already detectable in the first two domains in many pre-vertebrates).

By the way, Callorhinchus milii, the Elephant shark, has almost identical values of homology:

1184 bits, 599 identities, 654 positives

But, how important is this protein?

In the ExAC database, a database of variations in the human genome, missense mutations are 110 out of 260.3 expected, with a z score of 4.56, an extremely high measure of functional constraint.

The recent medical literature has a lot of articles about the important role of SATB1 at least in two big fields:

  • T cell development
  • Tumor development (many different kinds of tumors)

If we want to sum up in a few words what is known, we could say that SATB1 is considered a master regulator, essentially a complex transcription repressor, involved mainly (but not only) in the development of the immune system, in particular T cells. A dysregulation of this protein is linked to many aspects of tumor invasiveness (especially metastases). The protein seems to act, among other possibilities, as a global organizer of chromatin states.

Here is a very brief recent bibliography:

Essential Roles of SATB1 in Specifying T Lymphocyte Subsets

SATB1 overexpression correlates with gastrointestinal neoplasms invasion and metastasis: a meta-analysis for Chinese population

SATB1-mediated Functional Packaging of Chromatin into Loops

DNA-binding protein SATB2

But there is more. There is another protein which is very similar to SATB1. It is called DNA-binding protein SATB2 (accession number Q9UPW6).

Its length is very similar to SATB1: 733 AAs.

Uniprot describes its function as follows:

Binds to DNA, at nuclear matrix- or scaffold-associated regions. Thought to recognize the sugar-phosphate structure of double-stranded DNA. Transcription factor controlling nuclear gene expression, by binding to matrix attachment regions (MARs) of DNA and inducing a local chromatin-loop remodeling. Acts as a docking site for several chromatin remodeling enzymes

Which is very similar to SATB1. But now come the differences. While SATB1 is implicated mainly in T cell development and tumor development, SATB2 is:

Required for the initiation of the upper-layer neurons (UL1) specific genetic program and for the inactivation of deep-layer neurons (DL) and UL2 specific genes, probably by modulating BCL11B expression. Repressor of Ctip2 and regulatory determinant of corticocortical connections in the developing cerebral cortex. May play an important role in palate formation. Acts as a molecular node in a transcriptional network regulating skeletal development and osteoblast differentiation

So, similar proteins with rather different specificities. While SATB1 is mainly connected to adaptive immunity (T cell development), SATB2 seems to be more linked to neuronal development. Like SATB1, it is involved in cancer development, although usually in different types of cancer.

Here is a brief recent bibliography about SATB2:

Mutual regulation between Satb2 and Fezf2 promotes subcerebral projection neuron identity in the developing cerebral cortex

SATB1 and SATB2 play opposing roles in c-Myc expression and progression of colorectal cancer

However, how similar is SATB2 to SATB1 in terms of sequence homology?

Here is a direct blast of the two human molecules:

 

Fig. 5: Blast of human SATB1 vs human SATB2:

 

OK, they are very similar, but…  only 460 identities, 550 positives, 854 bits. IOWs, these two human proteins are similar, but not so similar as the two sequences of SATB1 in the shark and in humans.

Now, here is the evolutionary history of SATB2:

 

Fig. 6: Evolutionary history of SATB2 by human-conserved functional information

 

As everyone can see, it is almost identical to the evolutionary history of SATB1. To see it even better, Fig. 7 shows the two evolutionary histories together (the green line is SATB1, the brown line is SATB2):

 

Fig. 7: Evolutionary history of SATB1 and SATB2 by human-conserved functional information

 

In particular, pre-vertebrate history and the jump in cartilaginous fish are practically identical. And yet these are two different molecules, as we have seen, with different specificities and about one third of difference in sequence.

Now, let’s blast human SATB2 against cartilaginous fish. Again the best hit is with the whale shark:

 

Fig. 8: The best hit of human SATB2 in cartilaginous fish (with the whale shark)

 

And the numbers are very similar, incredibly similar I would say, to those we found for SATB1:

1197 bits, 592 identities, 662 positives.

But what if we blast SATB1 of the whale shark against SATB2 of the whale shark?

Here are the results:

 

Fig. 9: Blast of whale shark SATB1 vs whale shark SATB2:

Now, please, compare the numbers we got here with those from the similar blast between the two proteins in humans:

SATB1 human vs SATB2 human:  460 identities, 550 positives, 854 bits

SATB1 shark vs SATB2 shark:      468 identities, 556 positives, 856 bits

Almost exactly the same numbers! Wow!

What does that mean?

It means that this system of two similar proteins with different function arises in vertebrates as a whole system, already complete, with the two components already differentiated, and is conserved almost identical up to humans. Indeed, SATB1 and SATB2 have the same degree of homology both in sharks and in humans, and the two SATB1 proteins in shark and humans, as well as the two SATB2 proteins in shark and humans, have greater similarity, after more than 400 million years of divergence, than SATB1 and SATB2 show when compared, both in sharks and in humans.

Would you describe that as sudden appearance of huge amounts of functional information, followed by an extremely long stasis? I certainly would!

The following table sums up these results:

Sequence 1     Sequence 2     Bitscore
SATB1 Human    SATB2 Human    854
SATB1 Shark    SATB2 Shark    856
SATB1 Human    SATB1 Shark    1203
SATB2 Human    SATB2 Shark    1197

IOWs, the whole system appeared practically as it is today, before the split of cartilaginous fish and bony fish, and has retained its essential form up to now.
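The comparison summed up in the table can be written as a small check. This is a minimal sketch: the terms "ortholog" (same protein, different species) and "paralog" (SATB1 vs SATB2 in the same species) are labels added here for convenience, and the numbers are simply the four bitscores reported above.

```python
# Minimal sketch: cross-species vs within-species comparisons from the table.
bitscores = {
    ("SATB1 human", "SATB2 human"): 854,
    ("SATB1 shark", "SATB2 shark"): 856,
    ("SATB1 human", "SATB1 shark"): 1203,
    ("SATB2 human", "SATB2 shark"): 1197,
}

# SATB1 vs SATB2 inside the same species
paralog_scores = [
    bitscores[("SATB1 human", "SATB2 human")],
    bitscores[("SATB1 shark", "SATB2 shark")],
]
# Same protein, human vs shark
ortholog_scores = [
    bitscores[("SATB1 human", "SATB1 shark")],
    bitscores[("SATB2 human", "SATB2 shark")],
]

# Every human-vs-shark comparison of the same protein scores higher than
# every SATB1-vs-SATB2 comparison within one species.
orthologs_closer = min(ortholog_scores) > max(paralog_scores)
# How much the SATB1-vs-SATB2 score differs between humans and sharks.
paralog_gap = abs(paralog_scores[0] - paralog_scores[1])

print(orthologs_closer, paralog_gap)
```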

So, the total amount of new functional information implied by the whole system of these two proteins is about 1545 bits (considering 855 bits of common information, and 345 bits x 2 of specific information in each molecule).

An amazing amount, for a system of just two molecules, considering that 500 bits is Dembski’s Universal Probability Bound!
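The total for the two-protein system, and its comparison with the 500 bit threshold, can be retraced as follows. This is a hedged sketch: the rounded inputs (855 and 1200 bits) are approximations of the bitscores in the table, and reading a bit count as the negative base-2 logarithm of a probability is an assumption made here only to convert bits to powers of ten.

```python
# Minimal sketch: total functional information for the SATB1/SATB2 system.
import math

COMMON_BITS = 855       # approx. SATB1 vs SATB2 bitscore (854 human, 856 shark)
ORTHOLOG_BITS = 1200    # approx. human vs shark bitscore (1203 and 1197)
THRESHOLD_BITS = 500    # bound quoted in the text

specific_bits = ORTHOLOG_BITS - COMMON_BITS     # specific to each molecule
total_bits = COMMON_BITS + 2 * specific_bits    # common part + two specific parts


def bits_to_log10_probability(bits):
    """log10 of 2**(-bits), computed in log space to avoid float underflow."""
    return -bits * math.log10(2)


print(specific_bits, total_bits)
print(round(bits_to_log10_probability(THRESHOLD_BITS), 1))
print(round(bits_to_log10_probability(total_bits), 1))
```

The conversion is done in log space because 2 raised to the power of -1545 is far too small to be represented as an ordinary floating point number.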

Let’s remember that in my previous post, quoted above, I showed that the informational jump from pre-vertebrates to vertebrates is more than 1.7 million bits. That’s a very big number, but big numbers sometimes are not easily digested. So, I believe that seeing that just two important molecules can account for about 1500 bits can help us understand what we are really seeing here.

Moreover, it’s certainly not by chance that those two molecules seem to be fundamental in two very particular fields:

a) The adaptive immune system

b) The nervous system

if we consider that those are exactly the two most relevant developments in vertebrates.

And, as a final note, please consider that these are very complex master regulators, which interact with tens of other complex proteins to effect their functions. The whole system is certainly much more irreducibly complex than we can imagine.

But still, just the analysis of these two sister proteins is more than enough to demonstrate that the neo Darwinian paradigm is completely inappropriate to explain what we can see in the proteome and in its natural history. And this is only one example among thousands.

So, I want to conclude by repeating this strong and very convinced statement:

The observed facts described here cannot in any way be explained by any neo-darwinian model. Absolutely not. They are extremely strong evidence for a design inference.

Comments
wd400 has argued that gaps are expected. For instance here (2014):
... you wouldn’t expect to see intermediates if there were a set paths from A -> B -> … -> X, because intermediates will be replaced by more favoured variants. The branching nature of evolutionary process creates gaps in extant species/proteins/genes.
If huge gaps, like the one between non vertebrates and the first vertebrates (WRT proteins SATB1 and SATB2), are to be expected, why is there no such gap between the whale shark and humans? Wd400 wants it both ways. If there are (huge) gaps, then this is expected and when there are no gaps then this is evidence for blind watchmaker evolution.
Origenes
July 24, 2017, 03:26 PM PDT
Well, tomorrow I hope that I can finish my verbose comments on wd400's previous post, and maybe add a few (verbose) words about his last one. For the moment, before going to sleep, I can only thank him again for being such a good inspiration for my verbose and futile activities. :)
gpuccio
July 24, 2017, 03:09 PM PDT
Origenes: "Don’t worry it’s not your fault. There are no valid counter-arguments." :)
gpuccio
July 24, 2017, 03:07 PM PDT
Mung: Just to confirm what is written in the genomewiki link, here are two rather different opsins that both present a significant informational jump in vertebrates:
a) Rhodopsin (P08100). 348 AAs. Jump from non vertebrates to vertebrates: 1.091954 bits per aminoacid site, 380 bits
b) Long-wave-sensitive opsin 1 (P04000). 364 AAs. Jump from non vertebrates to vertebrates: 0.8901099 bits per aminoacid site, 324 bits
But, of course, such verbose statements can have no interest for your kind interlocutors at TSZ, or for wd400, or for any other thinking person in the neo-darwinian field. And, after all, as wd400 says, "variations that occurred in that time are not available for study when we look at modern organisms". But we have to believe that the intermediates existed, if he says so.
gpuccio
July 24, 2017, 03:06 PM PDT
wd400: It's fine with me.
gpuccio
July 24, 2017, 02:55 PM PDT
gp, these are not so much new as trying to underline what I referred to in the earlier comment. You are welcome to reply, but I am unlikely to follow anything up.
wd400
July 24, 2017, 02:42 PM PDT
Well, wd400, again after stimulation, has expressed some new thoughts. Unfortunately, I still have to finish my reasonings about his previous post. It will be difficult to stay updated. After all, my posts are "verbose", so it's all my fault! :)
gpuccio
July 24, 2017, 02:38 PM PDT
Mung: I will have a look at opsins. It seems a very complex issue! :) From the genomewiki page you linked, this seems interesting for our discussion here:
Early deuterostomes -- represented today by living echinoderms, hemichordates, cephalochordates and urochordates -- retained various opsin classes descended from the ur-bilateran (indeed eumetazoan) ancestor but never possessed imaging vision nor subsequently developed it except in one descendent lineage (vertebrates). One tunicate opsin specialized in the direction of parapinopsin but the main expansion and maturation of the opsin gene family took place very rapidly in the lamprey stem. Indeed at the time of divergence with jawed vertebrates, the last common ancestor possessed a full set of modern opsin genetic loci, furnishing four-color imaging ciliary opsin-based cone and rod vision with advanced oil-drop filtration. Intermediary states in chordate imaging vision development are no longer represented among extant species unless hagfish provides an intermediate node or better opsin retention occurs in additional urochordate genomes.
Emphasis mine. Does that seem familiar? :)
gpuccio
July 24, 2017, 02:32 PM PDT
Doesn't take much googling to find places where I've tried to explain the relevance of phylogeny and the limitations of taking BLAST databases as complete records of diversity (eg https://uncommondescent.com/intelligent-design/an-attempt-at-computing-dfsci-for-english-language/, https://uncommondescent.com/intelligent-design/homologies-differences-and-information-jumps/). There are probably others if you look hard enough. Reading those posts demonstrates the futility of this though, none of the comments (or, frankly, the ones I made above) have made a dent in the way these posts are written. The question of finding (or, actually inferring) ancestral intermediates is a strange one. First, such intermediates obviously exist, because the proteins are not 100% conserved, allowing us to infer ancestral states. If you look only at those amino acids that are conserved and then ask for the intermediates, then obviously we won't find them. It is also strange to think that discontinuous jumps between clades are a problem and not a prediction of evolution down a tree. All vertebrates share ~30 million years of evolutionary history; variations that occurred in that time are not available for study when we look at modern organisms. The idea that not finding homologies for all domains in non- (not pre!) vertebrate animals using blastp and default settings is evidence that these domains were not present in the ancestors of vertebrates is also strange. These domains are all present in modern non-vertebrates. Again, I don't wish to get into a point-by-point discussion with posts that are as verbose as these ones, especially given how futile it seems to be.
wd400
July 24, 2017, 02:19 PM PDT
Origenes, I really appreciate that you've called out WD400 straight up -- no BS. If we have to suffer the dull headed face-painters like rvb (and others), then surely those that can actually engage the conversation should be called out for standing silent on the sidelines when ID proponents like Dr. Puccio make their arguments.
Upright BiPed
July 24, 2017, 01:22 PM PDT
wd400: I don’t really have the time to read it all, but at a glance it seems like a lot of these posts that I have commented on in the past.
You are mistaken wd400, you did no such thing. GPuccio has written several posts about the evolutionary history of proteins, but up till now you have studiously ignored all of them. You may like the idea that you have offered counter-arguments in the past, but that's all pure fantasy, in fact, that never happened. Don't worry it's not your fault. There are no valid counter-arguments.
Origenes
July 24, 2017, 10:31 AM PDT
Hello gpuccio, Could I interest you in a side project? :) Could you perhaps take a look at opsins with particular respect to snails? http://theskepticalzone.com/wp/eye-mock-stupidity/ (Assuming you haven't already done a post on opsins!) The author of the book from which I quote in my OP over there at "The Charitable Zone" seems to think that opsins provide evidence for gradual evolution of the eye via natural selection. Yikes! http://genomewiki.ucsc.edu/index.php/Opsin_evolution:_orgins_of_opsins
Mung
July 24, 2017, 10:11 AM PDT
gpuccio @171:
we should be able: a) to find at least some intermediate (not every intermediate [missing ')'?] b) between the ancestor protein present in pre-vertebrates (presumably in chordates) and c) the sequence which had to be already present in the last common ancestor of vertebrates, and which has been conserved after that both in sharks and in the human lineage, after the split between cartilaginous fish and bony fish. Not every intermediate, but at least some intermediate (instead of none at all) We need intermediates not to the modern protein, but to the sequence that was already present in the last common ancestor of vertebrates, more than 400 million years ago, and that has been conserved thereafter. That sequence is, in no sense, “modern”.
Very clear point. So clear that even I can understand it. I believe that some folks may not understand it because they just don't want to. The will to understand is required.
Dionisio
July 24, 2017, 08:48 AM PDT
Dionisio: I have almost a veneration for software developers! :)
gpuccio
July 24, 2017, 06:17 AM PDT
Answers to wd400:
There is no regard for the fact these genes have evolved down a species tree,
I really don't understand the reasons for this strange statement. All my reasoning is based on common descent and on the concept of a species tree. I have applied, as best I can, the commonly accepted ideas and timelines for the evolutionary tree that led to vertebrates and beyond. I have tried to localize as precisely as possible the window of evolutionary time where the informational jump takes place. The concept of a jump itself has no meaning, if not in the light of such an evolutionary tree. So, why is my interlocutor stating that in my reasoning "there is no regard for the fact these genes have evolved down a species tree"? I don't understand.
the idea that you should be able to find every intermediate between an ancestral and modern protein is obviously wrong as soon as you start thinking about trees.
Why should "thinking about trees" make my ideas suddenly wrong? And, of course, I have never claimed that we "should be able to find every intermediate between an ancestral and modern protein". This is really obfuscation of my real thoughts. What I claim is that we should be able:
a) to find at least some intermediate (not every intermediate)
b) between the ancestor protein present in pre-vertebrates (presumably in chordates) and
c) the sequence which had to be already present in the last common ancestor of vertebrates, and which has been conserved after that both in sharks and in the human lineage, after the split between cartilaginous fish and bony fish.
So, two important errors in wd400's representation of my argument:
Not every intermediate, but at least some intermediate (instead of none at all).
We need intermediates not to the modern protein, but to the sequence that was already present in the last common ancestor of vertebrates, more than 400 million years ago, and that has been conserved thereafter. That sequence is, in no sense, "modern".
So, why should "thinking about trees" instantly obliterate the meaning and the value of this argument? By the way, the essential point of my counter-argument here had already been brilliantly expressed by Origenes in his post #142, in response to wd400. I quote from that: "GPuccio’s argument, as I understand it, is not about finding “every intermediate”, but rather about finding any intermediate. Unless you are willing to argue that, given trees, we should expect information jumps of thousands of bits and no intermediates whatsoever, you do not have a point."
gpuccio
July 24, 2017, 06:15 AM PDT
gpuccio, Excellent! Thank you for the approval! And for promoting me up to personal editor of a doctor author. That sounds pretty "important" after being only a software developer for years. :)
Dionisio
July 24, 2017, 06:15 AM PDT
Dionisio: Of course you can. And please, go on with your activity as my personal editor. I am grateful for it! :)
gpuccio
July 24, 2017, 05:55 AM PDT
gpuccio @167: Thank you for the explanation. I need some time to chew and digest it. Needless to say that I'm learning more than I expected from this OP, as I did from the two that preceded it. Looking forward to reading the next OP of this series. FYI - I'm copying your OP + follow up explanations into an off-line document so that I can partially quote it in Bioinformatics and/or Systems Biology-related discussions outside UD. Obviously I will cite you as the author and UD as the source of the text. I definitely won't take any credit for the information in the text. I may have to make minor proofreading corrections in order to make the text presentable to more discriminating audiences. I'll share with you the proposed adjustments. At this point I just ask you for your consent. Thank you. PS. for example, @167:
IOWs, in vertebrate proteins SATB1 and SATB2 the domain seems necessary for the tetramerization of the protein, an important functional step, but has probably meany other functions, as it “may be involved in various interactions with chromatin protein”.
many?
the same superfamily includes the similar domain in drosophila protein DVE, a protein with transcription factor activity and important regulator functions. Frankly, from what is known I would say that the two proteins (SATB proteins in vertebrates and DVE protein in drosophila are both important transcription regulators, but they have very different spectrum of activity.
(SATB proteins in vertebrates and DVE protein in drosophila) are both important [missing ')'] [Emphasis added]
Dionisio
July 24, 2017, 05:45 AM PDT
Dionisio: "Are those sequences with recognizable homology with pre-vertebrates functional? If they are, do they perform similar or different functions as in pre-vertebrates?" OK, not a simple question! Let's start with the first domain, the N terminal domain. It is representative of its own superfamily: N-terminal domain of SATB1 and similar proteins. This is the information from NCBI: "SATB1, the special AT-rich sequence-binding protein 1, is involved in organizing chromosomal loci into distinct loops, creating a "loopscape" that has a direct bearing on gene expression. This N-terminal domain, which may be involved in various interactions with chromatin proteins, resembles a ubiquitin domain and has been shown to form tetramers, a function critical to SATB1-DNA interactions. The related Drosophila homeobox gene defective proventriculus (dve) plays a key role in the functional specification during endoderm development." IOWs, in vertebrate proteins SATB1 and SATB2 the domain seems necessary for the tetramerization of the protein, an important functional step, but has probably many other functions, as it "may be involved in various interactions with chromatin proteins". The same superfamily includes the similar domain in drosophila protein DVE, a protein with transcription factor activity and important regulator functions. Frankly, from what is known I would say that the two proteins (SATB proteins in vertebrates and DVE protein in drosophila) are both important transcription regulators, but they have very different spectra of activity. The highest homology hit between Drosophila DVE and the human proteome is with SATB2 (75.9 bits, 53% identities), and significant hits are only with SATB proteins, and nothing else. The homology between DVE and SATB proteins is essentially limited to this domain (about 100 AAs). For the rest, they are completely different molecules (human SATB proteins being, as we know, 763 and 733 AAs long, while drosophila DVE is 1021 AAs long).
The second domain, CUTL, is again a representative of its own superfamily: CUT1-like DNA-binding domain of SATB This is the information from NCBI: "CUTL is part of the N-terminal region of SATB proteins, special AT-rich sequence-binding proteins that are global chromatin organisers and gene expression regulators essential for T-cell development and breast cancer tumor growth and metastasis. CUTL carries a DNA-binding region just as CUT domains do." As said, it has a significant hit in Parasteatoda tepidariourum (the spider), but no significant hits in drosophila). In the spider, the hit is 83.6 bits, 52% identities. The spider protein is labeled as: "uncharacterized protein LOC107441355", and is 595 AAs long. I don't think that much is known about it. I hope this answers your question. :)gpuccio
July 24, 2017, 04:37 AM PDT
@159:
So, it’s perfectly possible that those sequences come from “recombination” (or any other form of reuse) from pre-vertebrates. With some differences, but a very well recognizable homology.
Are those sequences with recognizable homology with pre-vertebrates functional? If they are, do they perform similar or different functions as in pre-vertebrates?
Dionisio
July 24, 2017, 12:19 AM PDT
UB: Thank you, my friend! :)
gpuccio
July 23, 2017, 08:58 PM PDT
Excellent work GP. The very best.
Upright BiPed
July 23, 2017, 08:51 PM PDT
The very few politely dissenting interlocutors who have dared to debate the discussed topic with gpuccio have seen their very poor arguments crushed by the clear, detailed explanations gpuccio has provided in his OP and comments. Looking forward to reading the next OP in this series.
Dionisio
July 23, 2017, 08:08 PM PDT
Popular Posts (Last 30 Days)
Information theory is bad news for Darwin: Evolutionary… (1,299)
John Sanford: Darwin a figurehead, not a scientist (1,213)
Is Mathematics a Natural Science? (Is that important?) (1,210)
Is OOL Part of Darwinian Evolution? (1,208)
Interesting proteins: DNA-binding proteins SATB1 and SATB2 (1,039) [9 days]
Dionisio
July 23, 2017, 05:13 PM PDT
john_a_designer: Behe has always been one of my favourite ID thinkers. He is really a remarkable man and scientist.
You say: "So why does Neo-Darwinism persist? I believe it is because of its a-priori ideological or philosophical fit with naturalistic or materialistic world views. Human beings are hardwired to believe in something– anything to explain or make some sense of our existence. Unfortunately we also have the tendency to believe in a lot of untrue things."
You are right. I am absolutely convinced that, if there were some other "naturalistic" (whatever that may mean) scientific explanation for biological functional complexity, the whole house of cards of neo-darwinism would be readily and happily dismissed as unsupported and unreasonable speculation. Which it is.
Unfortunately, the only credible explanation for biological functional complexity remains some form of conscious design. And that cannot be accepted, for dogmatic reasons and worldview prejudices.
Neo-neo-neo darwinists (the third or fourth way, or whatever) are trying to create some biological "compatibilism", invoking fashionable idols like some teleological principle or force of nature, or who knows what. But of course, like compatibilism, this is only an intellectual trick: purpose and cognition are only experiences of consciousness, and of conscious agents. They simply don't exist in objects; they are experiences and representations of consciousness.
So, in a world of deterministic free will, mindless purpose and intelligent stupidity, we can only wait for something to change, and for real and true things to become acceptable again.
gpuccio
July 23, 2017, 01:48 PM PDT
GPuccio @ 122,
But the point is simple enough, if you know ID theory. The point is that huge new functional information is beyond the reach of RV, for obvious reasons. The neo-darwinist argument, that it can be generated gradually through the magic of NS, is completely unsupported by any facts (and by any credible reasoning), and moreover would necessarily imply: a) very long evolutionary times b) detectable traces of intermediate forms from the supposed gradual pathway in the proteome. Now, while it is rather intuitive that we have nothing of that kind for any known evolutionary situation, my reasoning in this OP (and in the ones that preceded it) tries to quantify the informational jump in specific proteins in a specific evolutionary context, by explicit objective arguments, so that it is apparent that this particular huge informational jump happened in a rather short evolutionary time, and left no traces in the proteome of any gradual pathway to it…
Indeed, the apparently “rapid” bursts and spurts we see in the evolutionary tree are not readily amenable to the Neo-Darwinian macro-evolutionary narrative. (Darwinism and Neo-Darwinism are better described as narratives than as scientific theories.) It seems to me that SATB1 and SATB2, with their functional CSI, at some point at least appear to have burst on the scene. Is that assessment correct? Of course, there are other possibilities. There are always other possibilities. In his book Darwin’s Black Box, Michael Behe asks,
“Might there be an as yet undiscovered natural process that would explain biochemical complexity? No one would be foolish enough to categorically deny the possibility. Nonetheless we can say that if there is such a process, no one has a clue how it would work. Further it would go against all human experience, like postulating that a natural process might explain computers… In the face of the massive evidence we do have for biochemical design, ignoring the evidence in the name of a phantom process would be to play the role of the detective who ignores the elephant.” (p. 203-204)
Basically Behe is asking: if biochemical complexity (or IC) evolved by some mindless natural process x, how did it evolve? That is a perfectly legitimate scientific question. Notice that even though in DBB Behe was criticizing Neo-Darwinism, he is not ruling out some other mindless natural evolutionary process “x”. Behe is simply claiming that at present there is no known natural process that can explain how irreducibly complex mechanisms and processes originated. If he and other ID’ists are wrong, then our critics need to provide the step-by-step-by-step empirical explanation of how they originated, not just speculation and wishful thinking. Unfortunately our regular interlocutors seem to be able to provide only the latter, not the former. Behe made another point which is worth keeping in mind.
“In the abstract, it might be tempting to imagine that irreducible complexity simply requires multiple simultaneous mutations - that evolution might be far chancier than we thought, but still possible. Such an appeal to brute luck can never be refuted... Luck is metaphysical speculation; scientific explanations invoke causes.”
In other words, a strongly held metaphysical belief is not a scientific explanation. So why does Neo-Darwinism persist? I believe it is because of its a-priori ideological or philosophical fit with naturalistic or materialistic world views. Human beings are hardwired to believe in something-- anything to explain or make some sense of our existence. Unfortunately we also have the tendency to believe in a lot of untrue things.
john_a_designer
July 23, 2017, 01:19 PM PDT
Answers to wd400: Now, the second part of the statement I considered in post #157:
these proteins are made from domains that are conserved well beyond the vertebrates, recombination can bring such domains together
Ah, the famous recombination argument. It seems to be the last defense of neo-darwinism, when everything else fails. Zachriel used to resort to it very often. But the point is: recombination is recycling of existing information. And if sequence information is recycled, it is usually recognizable.
It is true that in SATB1 and SATB2 we can recognize, from sequence homology, 5 sequences corresponding to known domains. And we have seen that the first two really have some significant homology with the same domains in pre-vertebrates. Moreover, those two sequences (corresponding to domains SATB1_L and CUTL) have significant homologies with the reference sequences of the respective domains (Expect values of 5.68e-49 and 1.98e-44 respectively). This is, therefore, the part of the protein that, in some measure, already "existed" in pre-vertebrates as a rather similar sequence. And I have fully acknowledged that, and counted it (the 154 bits of pre-existing information) in my reasoning.
But the two sequences identified as CUT domains in BLAST (domains 3 and 4 in Fig. 2) have much lower homology with the reference sequences of the respective domains (1.46e-18 and 1.61e-18 respectively, more than enough to consider them homologues of the reference domain, but anyway homologues with rather different sequences), and, what's more important, have no detectable homology at all in pre-vertebrates. That means that, even if they could have a structure corresponding to the reference domains (but I doubt that anyone has studied their structure in these particular proteins), the sequence is all new in vertebrates, and only very partially corresponding to the reference domain sequence. And, remember, this particular sequence, as it is, with its differences from the domain reference sequence, has been conserved in these proteins for 400+ million years.
The scenario is even more apparent for the 5th sequence recognized as a domain, the HOX sequence, where the homology with the reference domain sequence is even lower (Expect 2.38e-08), and again no detectable homology in pre-vertebrates can be observed. This, without considering the interdomain sequences, which are conserved too, do not correspond to any domain and have no detectable homologies in pre-vertebrates.
Now my point is: these sequences (those corresponding to known domains) are anyway new in vertebrates, and conserved from then on. This is what the sequence analysis tells us. Could recombination have had some role in their generation? The answer is simple: no, not at all. Why? Because recombination can only remix sequences that already exist. How can recombination help generate new complex sequences that did not exist before?
If the sequences in SATB1 and SATB2 had been generated by recombination of similar sequences in pre-vertebrates, they would still exhibit some homology in pre-vertebrates, especially if those sequences are subject to purifying selection (and they are, because from sharks to humans they are conserved). But those sequences have no homology with sequences in pre-vertebrates. How can we think that they are the result of recombination?
Look at the difference with the sequences in the first two domains: those do show homologies with similar sequences in pre-vertebrates, especially the first domain, which is well represented in that category. The first and second domain together, in the best pre-vertebrate hit, are responsible for those 154 bits that we had to subtract in our computation of "new" functional information in vertebrates. So, it's perfectly possible that those sequences come from "recombination" (or any other form of reuse) from pre-vertebrates. With some differences, but a very well recognizable homology.
Compare that to the complete absence of homology in pre-vertebrates for all the rest of the molecule. Here, certainly, recombination could have no role at all. Those sequences are new in vertebrates, and are retained after the first vertebrate split. They are new and functional. Recombination cannot explain them at all.
More in next post.
gpuccio
July 23, 2017, 11:41 AM PDT
Very interesting explanation. Thanks.
Dionisio
July 23, 2017, 04:27 AM PDT
Answers to wd400:
There is no requirement that each amino acid be selected one after the other (these proteins are made from domains that are conserved well beyond the vertebrates, recombination can bring such domains together).
IMO, these are two different concepts, not necessarily connected. Therefore I will treat them as separate issues.
There is no requirement that each amino acid be selected one after the other
I imagined a situation where each amino acid is naturally selectable and can be individually fixed because it is by far the most favourable scenario for neo-darwinism. I want to be very clear about that. I don't believe that complex functions are deconstructable into simpler functional, least of all naturally selectable, steps. This is especially true of complex regulatory structures, like the proteins we are considering here, but it is true of all complex enough functions. Such deconstructions are only (partially) reasonable in the tweaking of an existing function in a continuous functional space, but certainly not where a new biochemical or regulatory function must be built.
However, as our imagination is potentially capable of it (even if we are not as well trained as a darwinist in conceiving just-so stories), let's pretend that some complex function can be deconstructed into, say, 500 increasingly functional and naturally selectable one-amino-acid transformation steps. I know, I know, it's impossible, but let's pretend. I say that this is by far the most favourable scenario for neo-darwinists. Why?
OK, let's say that each one-amino-acid mutation has a probability of, say, 10^-4 of being observed in a given evolutionary time x. So, we need a time x+y (where y is the time to fixation, which can be very long, especially if the population is big) before we are in a favourable situation to wait for the second favourable mutation, which will again need approximately time x+y, and so on. So, let's say that, to get a set of, say, 4 favourable mutations, we need a time of about (x+y)*4.
Now, what happens if the function is not deconstructable into naturally selectable one-amino-acid mutations (which indeed is impossible), but rather into naturally selectable steps of 4 amino acids each? OK, it's impossible just the same, but let's pretend. To be more clear, we are now imagining that, to have a naturally selectable step, we need to change 4 amino acids in a specific way, and not only one.
Now, what happens? We can get in one single process what required four steps in the previous scenario. In that scenario, the required time for the four steps was (x+y)*4, IOWs:
4x + 4y
Now, it is "only":
x1 + y
(there is only one time to fixation to be considered). That seems favourable, at first impression, but... how much bigger is x1 (the time to get the mutations) compared to x? If one amino acid mutation with probability 10^-4 required approximately a time x, now the probability of getting 4 specific mutations in the same individual genome, each with 10^-4 probability, is the product of the 4 probabilities, IOWs 10^-16. Now, this number is much smaller than 10^-4: indeed, it is 10^12 times smaller (1000 billion times smaller). So, we can reasonably expect that x1, the time to get those 4 specific mutations, without any help from NS and any fixation, will be approximately 10^12 times longer than x. Not a trivial difference at all.
So the total time to get the same result that we got in time 4x + 4y in the first scenario will now be:
10^12 x + y
Which is much less favourable than the first scenario, unless of course the time to fixation is really extremely long. So, the scenario where each single amino acid substitution is naturally selectable and is individually fixed (IOWs, where "each amino acid is selected one after the other") is by far the most favourable scenario for neo-darwinism. Increasing the size of the selectable steps only implies bigger and bigger catastrophes for the neo-darwinian theory.
I hope that the above reasoning is mathematically correct. I will be happy to acknowledge possible errors or imprecisions if anyone will show me what they are. :)
More in next post.
gpuccio
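The arithmetic in the comment above can be checked with a short Python sketch. It is only an illustration under the comment's own simplifying assumptions: waiting time is taken to scale inversely with probability, and x and y are arbitrary time units. The function names (`sequential_time`, `joint_time`, `break_even_fixation_time`) are hypothetical and not taken from any library or from the original discussion.

```python
from fractions import Fraction


def sequential_time(x, y, steps):
    """Total time when every step is a single selectable mutation:
    wait x for the mutation, y for fixation, and repeat `steps` times."""
    return steps * (x + y)


def joint_time(x, y, p_single, steps):
    """Total time when all `steps` specific mutations must co-occur
    before selection can act, followed by a single fixation time y.

    Assumption (from the comment): waiting time scales inversely with
    probability, so an event of probability p_single**steps takes
    x / p_single**(steps - 1).
    """
    slowdown = 1 / Fraction(p_single) ** (steps - 1)
    return slowdown * x + y


def break_even_fixation_time(x, p_single, steps):
    """Fixation time y at which both scenarios take equally long,
    obtained by solving steps * (x + y) == slowdown * x + y for y."""
    slowdown = 1 / Fraction(p_single) ** (steps - 1)
    return (slowdown - steps) * x / (steps - 1)


if __name__ == "__main__":
    p = Fraction(1, 10**4)   # per-mutation probability used in the comment
    x, y = 1, 100            # arbitrary illustrative time units
    print(sequential_time(x, y, 4))          # 4x + 4y
    print(joint_time(x, y, p, 4))            # 10^12 x + y
    print(break_even_fixation_time(x, p, 4)) # y where the two are equal
```

With these illustrative values the first scenario takes 404 units and the second 1000000000100 units. The break-even function makes the closing caveat concrete: the 4-mutation scenario only catches up when the fixation time y is roughly a third of 10^12 times x. Passing the probability as a `Fraction` keeps the arithmetic exact; a float such as `1e-4` would introduce rounding.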
July 23, 2017, 03:37 AM PDT
Dionisio: I still have something to say about wd400's statements. As soon as possible... :)
gpuccio
July 22, 2017, 09:56 PM PDT