Bioinformatics tools used in my OPs: some basic information.

_{Giuseppe Puccio

November 14, 2017

Intelligent Design}


EugeneS made this simple request in the thread about Random Variation:

I also have a couple of very concrete and probably very simple questions regarding the bioinformatics algorithms and software you are using. Could you write a post on the bioinformatics basics, the metrics and a little more detail about how you produced those graphs, for the benefit of the general audience?

That’s a very reasonable request, and so I am trying here to address it. So, this OP is mainly intended as a reference, and not necessarily for discussion. However, I will be happy, of course, to answer any further requests for clarifications or details, or any criticism or debate.

My first clarification is that I work on protein sequences, and I use essentially two important tools that are available on the web to everyone.

The first basic site is Uniprot.

The mission of the site is clearly stated in the home page:

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.

I would say: how beautiful to work with a site which, in its own mission, incorporates the concept of functional information! And believe me, it’s not an ID site! 🙂

Uniprot is a database of proteins. Here is a screenshot of the search page.

Here I searched for “ATP synthase beta”, and I found easily the human form of the beta chain:

Now, while the “Entry name”, “ATPB_HUMAN”, is a brief identifier of the protein in Uniprot, the really important ID is in the column “Entry”: “P06576”. Indeed, this is the ID that can be used as the accession number in the BLAST software, which we will discuss later.

The “Reviewed” icon in the third column is important too, because in general it’s better to use only reviewed sequences.

By clicking on the ID in the “Entry” column, we can open the page dedicated to that protein.

Here, we can find a lot of important information, first of all the “Function” section, which sums up what is known (or not known) about the protein function.

Another important section is the “Family and Domains” section, which gives information about domains in the protein. In this case, it just states:

Belongs to the ATPase alpha/beta chains family.

Then, the “Sequence” section gives the reference sequence for the protein:

It is often useful to have the sequence in FASTA format, which is probably the most commonly used format for sequences. To do that, we can simply click on the FASTA button (above the Sequence section). This is the result:

This sequence is made of two parts: a comment line, which is a summary description of the sequence, and then the sequence itself. A sequence in this form can easily be pasted into BLAST, or other bioinformatics tools, either including the comment line, or just using the mere sequence.
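The two-part structure just described (a comment line starting with ">" followed by the sequence, usually wrapped over several lines) can be read with a few lines of code. What follows is only a minimal sketch in Python, not a tool used for the analyses in this post: the function name `parse_fasta` and the demo record (ID, name and residues) are hypothetical and serve only as an illustration.

```python
def parse_fasta(text):
    """Split FASTA text into (comment_line, sequence) pairs.

    The comment line is returned without its leading '>'; the wrapped
    sequence lines are joined into one string. Sequence lines that
    appear before any comment line are ignored.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical, shortened record: NOT the real P06576 sequence.
example = """>sp|P00000|DEMO_HUMAN Demo protein
MLGFVGRV
AAAPAS
"""
records = parse_fasta(example)
for comment, sequence in records:
    print(comment, len(sequence))
```

Either the whole record or just the joined sequence string can then be passed to other tools, which mirrors the two ways of pasting a sequence into BLAST mentioned above.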

Now, let’s go to the second important site: BLAST (Basic Local Alignment Search Tool). It’s a service of NCBI (National Center for Biotechnology Information).

We want to go to the Protein Blast page.

Now, let’s see how we can verify my repeated statement that the beta chain of ATP synthase is extremely conserved, from bacteria to humans. So, we paste the ID from Uniprot (P06576) in the field “Accession number”, and we select Escherichia coli in the field “Organism” (important: the organism name must be selected from the drop-down menu, and must include the taxid number). IOWs, we are blasting the human protein against all E. coli proteins. Here’s how it looks:

Now, we can click on the “BLAST” blue button (bottom left), and the query starts. It takes a little time. Here is the result:

In the upper part, we can see a line representing the 529 AAs which make the protein sequence, and the recognized domains in the sequence (only one in this case).

The red lines are the hits (red, because each of them is higher than 200 bits). When you see red lines, something is there.

Going down, we see a summary of the first 100 hits, in order of decreasing homology. We can see that the first hit is with a protein named “F0F1 ATP synthase subunit beta [Escherichia coli]”, and has a bitscore of 663 bits. However, there are more than 50 hits with a bitscore above 600 bits, all of them in E. coli (that was how our query was defined), and all of them with variants of the same protein. Such a redundancy is common, especially with bacteria, and especially with E. coli, because there are a lot of sequences available, often of practically the same protein.

Now, if we click on the first hit, or just go down, we can find the corresponding alignment:

The “query” here is the human protein. You can see that the alignment involves AAs 59 – 523, corresponding to AAs 2 – 460 of the “Subject”, that is the E. coli protein.

The middle line represents the identities (aminoacid letter) and positives (+). The title reminds us that the hit is with a protein of E. coli whose name is “F0F1 ATP synthase subunit beta”, which is 460 AAs long (rather shorter than the human protein). It also gives us an ID/accession number for the protein, which is a different ID from Uniprot’s IDs, but can be used just the same for BLAST queries.
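To make the role of the middle line concrete, here is a small sketch in Python that counts identities, positives and gaps from the three rows of an alignment. It assumes the usual BLAST text layout (a letter in the middle row for an identity, "+" for a positive, "-" in the query or subject row for a gap). The function name `alignment_stats` and the toy alignment are hypothetical; BLAST itself reports these numbers, so this is only meant to show where they come from.

```python
def alignment_stats(query, midline, subject):
    """Count identities, positives and gaps in one aligned block.

    Percentages are relative to the aligned length, not to the full
    length of either protein.
    """
    if not (len(query) == len(midline) == len(subject)):
        raise ValueError("the three alignment rows must have equal length")
    length = len(midline)
    identities = sum(1 for ch in midline if ch.isalpha())
    similar = midline.count("+")          # conservative substitutions
    gaps = query.count("-") + subject.count("-")
    return {
        "length": length,
        "identities": identities,
        "positives": identities + similar,  # BLAST's "Positives" include identities
        "gaps": gaps,
        "pct_identities": 100.0 * identities / length,
        "pct_positives": 100.0 * (identities + similar) / length,
    }


# Toy alignment, invented for illustration only.
query   = "MKV-LAR"
midline = "MK  L+R"
subject = "MKAGLSR"
stats = alignment_stats(query, midline, subject)
print(stats)
```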

The important components of the result are:

Score: this is the bitscore, the number I use to measure functional information, provided that the homology is conserved for a long evolutionary time (usually, at least 200 – 400 million years). The bitscore is already adjusted for the properties of the scoring system.
The Expect number: it is simply the number of such homologies that we would expect to find for unrelated sequences (IOWs random homologies) in a similar search. This is not exactly a p value, but, as stated in the BLAST reference page: when E < 0.01, P-values and E-value are nearly identical.
Identities: just the number and percent of identical AAs in the two sequences, in the alignment. The percent is relative to the aligned part of the sequence, not to the total of its length.
Positives: the number and percent of identical + positive AAs in the alignment. Here is a clear explanation of what “positives” are: “Similarity (aka Positives): When one amino acid is mutated to a similar residue such that the physiochemical properties are preserved, a conservative substitution is said to have occurred. For example, a change from arginine to lysine maintains the +1 positive charge. This is far more likely to be acceptable since the two residues are similar in property and won’t compromise the translated protein. Thus, percent similarity of two sequences is the sum of both identical and similar matches (residues that have undergone conservative substitution). Similarity measurements are dependent on the criteria of how two amino acid residues are to each other.” (From: binf.snipcademy.com) IOWs, we could consider “positives” as “half identities”.
Gaps: this is the number of gaps used in the alignment. Our alignments are gapped, IOWs spaces are introduced to improve the alignment. The lower the number of gaps, the better the alignment. However, the bitscore already takes gaps into consideration, so we can usually not worry too much about them.
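The relation between raw score, bitscore and Expect number can be written down in a few lines. The sketch below uses the standard Karlin-Altschul formulas given in the BLAST statistics documentation: S' = (λS − ln K) / ln 2 for the bitscore, E = m · n · 2^(−S') for the Expect number, and P = 1 − e^(−E) for the corresponding p value. The function names are hypothetical, and the sketch uses plain sequence and database lengths, whereas BLAST applies effective (edge-corrected) lengths, so the numbers will not match a real report exactly.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score S into bits: (lam*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bitscore.

    Simplification: real BLAST uses effective lengths, not raw lengths.
    """
    return query_len * db_len * 2.0 ** (-bits)


def p_value(e_value):
    """Probability of at least one chance hit: 1 - exp(-E)."""
    return -math.expm1(-e_value)


# For small E the two numbers nearly coincide, as the BLAST page says.
for e in (1.0, 0.1, 0.005):
    print(e, p_value(e))
```

Because E falls by a factor of two for every additional bit, a difference of a few hundred bits between two hits corresponds to an enormous difference in the expected number of chance matches.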

A very useful tool is the “Taxonomy report” (at the top of the page), which shows the hits in the various groups of organisms.

While in our example we looked only at E. coli, usually our search will include a wider range of organisms. If no organism is specified, BLAST will look for homologies in the whole protein database.

It is often useful to make queries in more groups of organisms, if necessary using the “exclude” option. For example, if I am interested in the transition to vertebrates for the SATB2 protein (ID = Q9UPW6, a protein that I have discussed in a previous OP), I can make a search in the whole metazoa group, excluding only vertebrates, as follows:

As you can see, there is very low homology before vertebrates:

And this is the taxonomy report:

The best hit is 158 bits in a spider.

Then, to see the difference in the first vertebrates, we can run a query of the same human protein on cartilaginous fish. Here is the result:

As you can see, now the best hit is 1197 bits. Quite a difference from the 158 bits best hit in pre-vertebrates.

Well, that’s what I call an information jump!

Now, my further step has been to gather the results of similar BLAST queries made for all human proteins. It is practically impossible to do that online, so I downloaded the BLAST executables and databases. That can be done from the BLAST site, and allows one to make queries locally on one’s own computer. The use of the BLAST executables is a little more complex, because it is done through command-line instructions, but it is not extremely difficult.
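To give an idea of what such command-line instructions look like, here is a hedged sketch in Python that only assembles a `blastp` command. The options shown (`-query`, `-db`, `-outfmt`, `-max_target_seqs`, `-num_threads`, `-out`) are standard BLAST+ options, but the file names, the database name and the chosen values are assumptions made for illustration, not the exact settings used for the analyses in this post. Restricting the search to a group of organisms is not shown here.

```python
import shlex


def build_blastp_command(query_fasta, db_name, out_path, max_hits=10, threads=4):
    """Assemble a blastp command line as a list of arguments.

    -outfmt 6 requests tab-separated output, one line per hit, which is
    easy to import into a spreadsheet or a script.
    """
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db_name,
        "-outfmt", "6",
        "-max_target_seqs", str(max_hits),
        "-num_threads", str(threads),
        "-out", out_path,
    ]


# Hypothetical file and database names.
cmd = build_blastp_command("human_reviewed.fasta", "nr", "hits.tsv")
print(shlex.join(cmd))

# To actually run it (requires a local BLAST+ installation and database):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping several hits per query and choosing the one with the highest bitscore afterwards is a cautious choice, since asking the program for a single target does not by itself guarantee that the best-scoring one is returned.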

To perform my queries, I downloaded from Uniprot a list of all reviewed human proteins: at the time I did that, the total number was 20171. Today, it is 20239. The number varies slightly because the database is constantly modified.

So, using the local BLAST executables and the BLAST databases, I performed multiple queries of all the human proteome against different groups of organisms, as detailed in my OP here:

The amazing level of engineering in the transition to the vertebrate proteome: a global analysis

This kind of query takes some time, from a few hours to a few days.

I have then imported the results into Excel, generating a dataset where, for each human protein (20171 rows), I have the protein ID, name and length, and the best hit for each group of organisms, including protein and organism name, protein length, bitscore value, expect value, number and percent of identities and positives, and gaps. IOWs, all the results that we have when we perform a single query on the website, limited to the best hit.
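The step of keeping only the best hit for each human protein can also be done in a script rather than in a spreadsheet. The sketch below, in Python, assumes the default 12-column tab-separated BLAST output (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name `best_hits` is hypothetical, and in the sample rows only the bitscores 663 and 158 come from the examples above: the subject names and all other values are invented.

```python
import csv
import io

COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tsv_text):
    """Return, for each query ID, the hit with the highest bitscore."""
    best = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["bitscore"] = float(hit["bitscore"])
        hit["evalue"] = float(hit["evalue"])
        query_id = hit["qseqid"]
        if query_id not in best or hit["bitscore"] > best[query_id]["bitscore"]:
            best[query_id] = hit
    return best


# Invented sample rows (subject IDs and most values are hypothetical).
rows = [
    ["P06576", "subjA", "72.0", "465", "120", "6", "59", "523", "2", "460", "0.0", "663"],
    ["P06576", "subjB", "70.0", "465", "130", "6", "59", "523", "2", "460", "0.0", "650"],
    ["Q9UPW6", "subjC", "35.0", "300", "180", "10", "1", "300", "5", "290", "1e-40", "158"],
]
sample = "\n".join("\t".join(r) for r in rows) + "\n"
best = best_hits(sample)
for query_id, hit in best.items():
    print(query_id, hit["sseqid"], hit["bitscore"])
```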

In the Excel dataset I have then computed some derived variables:

the bitscore per AA site (baa: total bitscore / human protein length)
the information jump in bits for specific groups of organisms, in particular between cartilaginous fish and pre-vertebrates (bitscore for cartilaginous fish – bitscore for non vertebrate deuterostomia)
the information jump in bits per aminoacid site for specific groups of organisms, in particular between cartilaginous fish and pre-vertebrates (baa for cartilaginous fish – baa for non vertebrate deuterostomia)
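The three derived variables above are simple arithmetic, and can be sketched as follows in Python. The function names are hypothetical. The example uses the SATB2 numbers quoted in this post (length 733 AAs, best hit of 1197 bits in cartilaginous fish, best hit of 158 bits before vertebrates). Note that the 158 bits value is the best hit in all non vertebrate metazoa, used here as a stand-in for the non vertebrate deuterostomia value of the dataset.

```python
def bits_per_site(bitscore, human_length):
    """baa: total bitscore divided by the length of the human protein."""
    return bitscore / human_length


def information_jump(later_value, earlier_value):
    """Difference between the value in the later and the earlier group."""
    return later_value - earlier_value


# SATB2 (Q9UPW6), values taken from the examples in this post.
satb2_length = 733
cartilaginous_bits = 1197.0
pre_vertebrate_bits = 158.0   # stand-in, see the note above

jump_bits = information_jump(cartilaginous_bits, pre_vertebrate_bits)
jump_baa = information_jump(
    bits_per_site(cartilaginous_bits, satb2_length),
    bits_per_site(pre_vertebrate_bits, satb2_length),
)
print(jump_bits)            # jump in bits
print(round(jump_baa, 3))   # jump in bits per aminoacid site
```

Dividing by the length of the human protein is what makes proteins of very different sizes comparable in the same graph.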

The Excel data are then imported into R. R is a wonderful open source statistical software and programming language that is constantly developed and expanded by statisticians all over the world. It also allows one to create very good graphs, like the following:

This is a kind of graph for which I have written the code and which, using the above mentioned dataset, can easily plot the evolutionary history, from cnidaria to mammals, of any human protein, or group of proteins, using their IDs. This graph uses the bit per aminoacid values, and therefore sequences of different length can easily be compared. A reference line is always plotted, with the mean baa value in each group of organisms for all human proteins. That already allows one to visualize how the pre-vertebrate-vertebrate transition exhibits the greatest informational jump, in terms of human conserved information.
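The reference line is just the mean baa value computed group by group over all proteins in the dataset. The actual graphs are made in R; the following is only a sketch of the same calculation in Python, with a hypothetical function name, hypothetical group labels and two invented proteins with invented baa values.

```python
from statistics import mean


def reference_line(table, groups):
    """Mean baa over all proteins, for each group of organisms in order."""
    return [mean(row[group] for row in table.values()) for group in groups]


# Invented data: protein ID -> baa value for each group of organisms.
groups = ["cnidaria", "cartilaginous_fish", "mammals"]
table = {
    "PROT_A": {"cnidaria": 1.9, "cartilaginous_fish": 2.0, "mammals": 2.1},
    "PROT_B": {"cnidaria": 0.2, "cartilaginous_fish": 1.6, "mammals": 2.0},
}

line = reference_line(table, groups)
for group, value in zip(groups, line):
    print(group, round(value, 2))
```

The history of a single protein is then simply its own row of the table, plotted over the same groups and compared with this mean line.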

However, the plots regarding individual proteins are much more interesting, and they reveal huge differences in their individual histories. In the above graph, for example, I have plotted the histories of two completely different proteins:

The green line is protein Cdc42, “a small GTPase of the Rho family, which regulates signaling pathways that control diverse cellular functions including cell morphology, cell migration, endocytosis and cell cycle progression” (Wikipedia). It is 191 AAs long, and, as can be seen in the graph, it is extremely conserved in all metazoa, presenting almost maximal homology with the human form already in Cnidaria.
The brown line is our well known SATB2 (see above), a 733 AAs protein which is, among other things, “required for the initiation of the upper-layer neurons (UL1) specific genetic program and for the inactivation of deep-layer neurons (DL) and UL2 specific genes, probably by modulating BCL11B expression. Repressor of Ctip2 and regulatory determinant of corticocortical connections in the developing cerebral cortex.” (Uniprot) In the graph we can admire its really astounding information jump in vertebrates.

This kind of graph is very good to visualize the behaviour of proteins and of groups of proteins. For example, the following graph is of proteins involved in neuronal adhesion:

It shows, as expected, a big information jump, mainly in cartilaginous fish.

The following, instead, is of odorant receptors:

Here, for example, the jump is later, in bony fish, and it goes on in amphibians and reptiles, and up to mammals.

Well, I think that I have given at least a general idea of the main issues and procedures. If anyone has specific requests, I am ready to answer.

Comments

daveS: Maybe some details in my OPs are very technical. I can understand that. But I think that my two questions, my challenge, are not technical at all. We all deal with complex functions: in language, in software, in all kinds of machines. Wondering if they can be, as a rule, deconstructed into simpler steps does not seem so far fetched. And statements about proteins and their functional space are made daily by our friends on the other side, most of them without any foundation. How many of them have invoked Keefe and Szostak as the final evidence that functional proteins are abundant on the market? How many of them have invoked Wagner's n dimensional cubes or what else, without even trying to explain what they meant? How many of them have invested all their personal credibility in the reality of the RNA world? Complex issues, indeed. Nobody seems to fear their complexity. My two questions are not so complex. We have a protein with hundreds of conserved functional residues. Do they really believe that there are hundreds of gradual 1 AA steps to that sequence? Do they really believe that each of those steps confers a reproductive advantage, and is therefore naturally selectable? It seems that they do believe that. I am only asking: why? Please, give me your reasons. I am not asking to be convinced. Just to know if they have reasons, reasons that can be expressed and explained.gpuccio_{November 20, 2017
09:56 AM PDT}

Mung: Quod erat demonstrandum! :) And my cats strolling on my keyboard are a serious problem indeed. :)gpuccio_{November 20, 2017
09:42 AM PDT}

gpuccio,
However, a number of critics seem to be ready to debate the most disparate arguments, ranging from philosophy to religion to politics to morals to physics to quantum mechanics to probability and so on. And, of course, anything about biology, even when they seem not to understand the basics of it.
Perhaps true, but quite a few of those subjects are accessible to the layman (at least at a superficial level). Most of us have had some exposure to religion and politics, for example. On the other hand, I suspect that if you posed a problem involving deriving the position wave equation for a particle in a box, you would also get relatively few responses.
So, such a complete silence is not really understandable, from people who are so certain of what they believe, and from a whole world that is so certain that that belief is absolute truth.
I wonder if your interlocutors actually feel so certain about their beliefs. I don't feel very confident in many of my beliefs, and in fact am sometimes surprised at how readily some here can answer what I consider to be very difficult questions.daveS_{November 20, 2017
07:36 AM PDT}

Mung can probably go on. He does it so naturally and graciously!
All of you remind me of a bunch of monkeys banging away at typewriters. My cat produces more interesting content when he sits on my keyboard. You guys stumble across a few sites on the interweb and it's like you've discovered the fountain of truth. Frankly, it's embarrassing. Let me know when you understand the science. Then perhaps we can have an intelligible conversation. Until then ...Mung_{November 20, 2017
07:17 AM PDT}

daveS: Of course, it is possible. However, a number of critics seem to be ready to debate the most disparate arguments, ranging from philosophy to religion to politics to morals to physics to quantum mechanics to probability and so on. And, of course, anything about biology, even when they seem not to understand the basics of it. My two questions are about a very simple and fundamental requirement for the neo-darwinian algorithm to work: that complex functions should be, as a rule, deconstructable into simple naturally selectable steps. Without that, the whole neo-darwinian theory has no foundation. I have simply asked if somebody can give reasons, any reason, why we should believe such a thing to be true. Either logical reasons, or empirical reasons. So, such a complete silence is not really understandable, from people who are so certain of what they believe, and from a whole world that is so certain that that belief is absolute truth. :)gpuccio_{November 20, 2017
07:13 AM PDT}

gpuccio,
GPuccio: Yes, certainly I am not happy that no serious discussion about real data can be achieved with our interlocutors. That nobody has even tried to answer my challenge is the most disappointing fact of all.
This is a fairly small community; is it possible that there are no ID critics here with the background necessary to address your challenge? Or perhaps those who do have the background do not have the time or interest at the moment?daveS_{November 20, 2017
04:46 AM PDT}

GPuccio: Yes, certainly I am not happy that no serious discussion about real data can be achieved with our interlocutors. That nobody has even tried to answer my challenge is the most disappointing fact of all.
To be frank, I am disgusted. Very angry.Origenes_{November 20, 2017
04:35 AM PDT}

Dionisio at #27: "Where is this LUCA positioned relative to bacteria, prokaryotes, etc.? Before, in between, after?" The idea is that it is the common ancestor of both bacteria and archaea (and therefore, of all other living things). So, those domains or structures that are common to the two would have likely already been present in LUCA. LECA would be, instead, the last eukaryotic common ancestor. This entity, however, is much more elusive, given the many uncertainties about eukaryotic appearance. Of course, neo-darwinists believe that there was a FUCA (first universal common ancestor), and that OOL proceeded in some way from FUCA to LUCA. But there is no real evidence that a FUCA ever existed as separated from LUCA. Only fairy tales. Instead, LUCA at least can be defined in some way from empirical observations (the information common to bacteria and archaea), and therefore has some scientific status, IMO, with all the necessary cautions.gpuccio_{November 20, 2017
03:51 AM PDT}

Origenes: Yes, the amazing examples of thousands of bits in protein information jumps have probably intoxicated us, but we must not forget that even a couple of hundreds of bits is an amazing result. Moreover, these examples of ultra-conserved information in non coding sequences, like introns, are especially fascinating. However, I think that non coding DNA can be functional even when no relevant conservation is observed. In principle, there are two possible explanations for that: a) The function is extremely specific to the species, and therefore is scarcely conserved. b) The relationship between function and sequence is different than in proteins, and allows much greater sequence variation. Both things are possible, and they are not mutually exclusive. But we still know too little. Another fascinating aspect of non coding DNA is that its function can well be non local. Transcribed DNA can work in a multitude of different regulatory ways, and it can influence processes that are apparently unrelated to the location where the DNA sequence is found. Moreover, complex variations in DNA structure, still little understood, can generate important interactions between very distant DNA sites. How does Dionisio put it? "Complex functionally specified informational complexity." I suppose it's an understatement! :)gpuccio_{November 20, 2017
03:44 AM PDT}

Dionisio: Yes, certainly I am not happy that no serious discussion about real data can be achieved with our interlocutors. That nobody has even tried to answer my challenge is the most disappointing fact of all. How can they deny that deconstruction of complex functions is a basic requirement for their theory to be acceptable? How can they not even try to give arguments in its support? If Corey is all that is left on the opponent side, the situation is really sad! :) However, be happy! I officially relieve you from having to pretend to be my opponent! Mung can probably go on. He does it so naturally and graciously! :)gpuccio_{November 20, 2017
03:31 AM PDT}

GPuccio @50 I forgot to mention that I had included metazoa, which explains why I did not get that plant. Scusa! - - - - - Browsing through the introns base one notices that the scores among vertebrate species are often rather irregular. Also, often we find part of the sequence in pre-vertebrates, but ever so often there is no trace. For instance the ultra conserved intron IRXA_Ruby. Very conserved in vertebrates. When I blasted it, the lowest score is 351 in bony fishes. And then “No significant similarity found”. That is 351 bits of functional information out of nowhere. Surely, it is not as spectacular as the 1000s of unaccounted bits wrt protein sequences, but, still, it is not what unguided-evolution-fans want to hear. :)Origenes_{November 20, 2017
03:25 AM PDT}

Ok, enough pretending being an opponent. I mistakenly thought that my stupid comments @43 & @47 could encourage some of gpuccio's polite dissenters -so conspicuously absent from his threads- to jump into this discussion, but now I realized that they won't, simply because they lack what is required for serious technical discussions: valid arguments and desire to find the truth. It is difficult for me to imitate writing so much nonsense. Perhaps it's easier when they believe that it's true? If gpuccio wants to have politely dissenting interlocutors in his discussion threads, he would have to stop making references to theoretical and empirical evidences that scare the potential dissenters away, because they seem to prefer pseudo-philosophical speculative gossiping, not detailed technical scientific discussions with so much real data leaving no room for troll hogwash. I prefer the technical discussions, even though they are more difficult for me to understand well. If that keeps the dissenters away, so be it. Still gpuccio's threads attracted a relatively large number of anonymous readers.Dionisio_{November 20, 2017
02:43 AM PDT}

Very insightful discussion between Origenes and gpuccio.Dionisio_{November 19, 2017
11:42 PM PDT}

Origenes: That's what I did, but I got the: "No significant similarity found" result. However, maybe I understand the reason. I blasted with the default option (megablast), which is optimized for higher similarities. I repeated the query choosing blastn, which is more sensitive, and I got a few hits, the best of them being 51.8 bits in Populus trichocarpa (a plant). Even in cartilaginous fish, the best hit was 41 bits. So, it really seems that the ultra conserved nucleotide sequence appears in bony fish.gpuccio_{November 19, 2017
05:20 PM PDT}

GPuccio @48 Good to hear that you find it interesting.
GPuccio: How did you blast the nucleotide sequence in pre-vertebrates? I have tried, and could not get any hit before bony fish. Not even your 46 bits.
(1) 'Apollo' Fasta sequence from this website:
GTTCTGTTTATCATTCTAATCCATGTTTTGCAATTTATCTACTCCCTGTT AATATTATAGGCGATTTTTTACTGTGGCTGTGACAGAAGCTGCCGATTTA GCTCTTTCACCTACTGATAAATAAACAATGCACAGATCTGACCTTTAGGT TAACAGGTTTTATGCTTGCTCCACTCAGCACTCTAACTGATTCAATTATC ATAAAGGTTCAGGAGGCTCCATGAATACTGAAAAAGGCCCACCATATGCC TGCATAGGTGTTGTGGAACAGCAAATATTCTGCAGCCCTCCAGAGAAATT CCTTAATTGTAAATAATTTCACCATGCGACACAATCAAGTCACCTTGAAT GCAAACCCCTCAGCCTGCGGGGGCAAAGTGTTATTTAAGCTTTACTGGGC TGCGTTAAATTCTGCAATTTGAAGGGCTGTTAAGTTTTTCAATTGAAATT TCATTTAAAATGCAGGTGCTTTTTATTATATTGAGGCTTTACTGCTCTCT AGGTACAAGCAAGAACATGGTGCAATAACACAAATCTGGCTCAATCACTG ATCAGTAACAGCTGTAATTCCAGAACATTTAGCATTCTTATAAACCACGG CCTGAAATCTATAAATTGCTAAAACAGATCAAGAAAATACTGTATCCCCC CTTTTCTGCCCACAGCAATTTTGACATTTATGAGATTTTTCTGTGAACAT TAGATTTTATTGAAAATCTTTAAAAAAGATATACTTGGATTTAGTAATTG TTTAAAA
(2) Go to nucleotide blast (3) Insert fasta code & exclude vertebrates. - - - - - - - BTW here are UCNs listed. They all have a short blast summary.Origenes_{November 19, 2017
04:09 PM PDT}

Origenes: Very interesting site, thank you for pointing to it. You are becoming better and better! :) Indeed, zebrafish gives 528 bits total, in two non overlapping alignments. This interesting intronic sequence seems to be located between exon 40 and 41, nearer to exon 41, in the Usherin gene. Usherin (O75445) is a 5202 AAs long protein, whose gene is divided into 72 exons. The protein has a rather standard evolutionary history, similar to the mean in metazoa. Its function in Uniprot is rather briefly described as "Involved in hearing and vision". How did you blast the nucleotide sequence in pre-vertebrates? I have tried, and could not get any hit before bony fish. Not even your 46 bits. My experience with blasting nucleotides is very limited, but it seems that this ultra-conserved intronic sequence exhibits a more distinct jump in vertebrates (apparently in bony fish) than the protein itself where it is located (at least in terms of conservation density).gpuccio_{November 19, 2017
03:17 PM PDT}

What's all that excitement about big jumps? Big jumps have been recorded in nature before: https://www.youtube.com/embed/QUdVteq8XBs :)Dionisio_{November 19, 2017
02:11 PM PDT}

GPuccio @40 Thank you for that link. It sparked my interest in introns — non-coding sections of DNA by some considered to be junk-DNA. However there are "UCRs" — ultraconserved regions, which are regions over 200 bp in length with 100% identity across species.
WIKI: "It is still not fully understood why the negative selective pressure on these regions is so much stronger than the selection in protein-coding regions."
One thing is for sure: conservation --> functional information. So, I thought, why not blast an intron sequence ? My choice is intron "ESRRG_Apollo" with a length of 757 nt. Then I found ccg.vital-it.ch/UCNEbase/ a website solely dedicated to UCRs.
"UCNEbase provides information on the evolution and genomic organization of ultra-conserved non-coding elements (UCNEs) in multiple vertebrate species. It currently covers 4351 such elements in 18 different species."
They did the work for me. On this page we see conservation from mouse to frog (bit-score 419.4). Next we see the zebrafish Bit-score: 255.9. And that is it. I blasted the nucleotide sequence and excluded vertebrates, there was no score higher than 46. This might be interesting.Origenes_{November 19, 2017
01:56 PM PDT}

"Lie" is such an ugly word... :)jstanley01_{November 19, 2017
12:18 PM PDT}

What we have here is merely the appearance of information jumps. Nature does not make jumps. If any jumps are required then my theory would absolutely break down and I would give nothing for it.Mung_{November 19, 2017
10:21 AM PDT}

Mung is correct. gpuccio is making up all those values so they look favorable to his ideas which nobody else shares because they lack theoretical and empirical confirmation. :)Dionisio_{November 19, 2017
09:02 AM PDT}

Mung: Now, who is my best adversary? You or Dionisio? :)gpuccio_{November 19, 2017
08:18 AM PDT}

I don't know why anyone listens to gpuccio. The Noble Prize for fraudulent internet blogging is not an award to be respected. And anyone could write a database that makes it appear as if new proteins pop into existence out of nothing. That doesn't prove anything. And all this talk about information jumps? Information is continuous, not discrete, it doesn't make jumps. The jumps are only apparent jumps, they are not real jumps, they are just an artifact of our models. I could go on for days but I doubt that it would have any impact on the cult of gpuccio here. BLAST ON!Mung_{November 19, 2017
08:15 AM PDT}

Origenes and Dionisio: By the way, there is a very interesting ID page about the spliceosome, here, by Jonathan M.: https://evolutionnews.org/2013/09/the_spliceosome_1/ He has perfectly caught the importance of this unique molecular machine from an ID point of view. He also mentions the amazing PRP8 component! :)gpuccio_{November 19, 2017
04:41 AM PDT}

gpuccio, cuff links might do the trick :)Dionisio_{November 18, 2017
03:20 PM PDT}

Dionisio: A third way to deconstruction? Sounds great! :)gpuccio_{November 18, 2017
01:36 PM PDT}

Maybe the 'third way' folks are going to include gpuccio's functional deconstruction challenge into their things to do list? :)Dionisio_{November 18, 2017
01:31 PM PDT}

Origenes: Well, they can always try to deconstruct the function into about 950 naturally selectable steps of 1 AA! :) Ah, I forgot that they have never taken my challenge... May I paste it again here? One never knows.
Will anyone on the other side answer the following two simple questions? 1) Is there any conceptual reason why we should believe that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases? 2) Is there any evidence from facts that supports the hypothesis that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?
And let's remember that the function that should be deconstructed, here, is the whole spliceosome, an amazingly huge molecular machine. Made of... Something like about 140 different proteins + 5 specific RNAs? And that we have already analyzed two of those proteins (OK, two of the biggest!): snRNP, which requires at least 2000 bits of new functional information in eukaryotes PRPF8, which requires at least 3900 bits of new functional information in eukaryotes and that those two sequences are part of the U5 component, which is part of the U6-U4-U5 component, which is part of the whole cycle of the spliceosome? Anyone is thinking of possible irreducible complexity here? Let's see. This is from an article of 2002: "Comprehensive proteomic analysis of the human spliceosome" https://www.nature.com/articles/nature01031
Using nanoscale microcapillary liquid chromatography tandem mass spectrometry, we identify 145 distinct spliceosomal proteins, making the spliceosome the most complex cellular machine so far characterized. Our spliceosomes comprise all previously known splicing factors and 58 newly identified components. The spliceosome contains at least 30 proteins with known or putative roles in gene expression steps other than splicing. This complexity may be required not only for splicing multi-intronic metazoan pre-messenger RNAs, but also for mediating the extensive coupling between splicing and other steps in gene expression.
(emphasis mine) Irreducible complexity? But... wait: the paper says that: "The spliceosome contains at least 30 proteins with known or putative roles in gene expression steps other than splicing." (emphasis mine) So, what's the problem? We are in Ken Miller's tie clip ideological space! No problem at all. Who cares if 115 (145 - 30) proteins with no "known or putative role" elsewhere have also been found in the machine? Neo-darwinism is safe! :) OK, maybe Ken will have to use something better than a tie clip, this time. What about cuff links? :)gpuccio_{November 18, 2017
08:39 AM PDT}

// follow-up #34 // The link to GPuccio's article: https://uncommondescent.com/intelligent-design/what-are-the-limits-of-random-variation-a-simple-evaluation-of-the-probabilistic-resources-of-our-biological-world/Origenes_{November 18, 2017
08:07 AM PDT}

// follow up #33 // In order to add some more perspective to the 3900 bits of information, wrt the SNRNP200 protein, which is in need of an explanation. From this recent article by GPuccio:
... any sequence with 160 bits of functional information is, by far, beyond any reasonable probability of being the result of RV in the system of all bacteria in 4 billion years of natural history, even with the most optimistic assumptions.
However, the reader may object that bacteria are not involved in finding this particular 3900 bits of information. And the reader would be correct, since the sequence is non-existent in bacteria. But, surely, this doesn't help the hypothesis of unguided evolution.Origenes_{November 18, 2017
07:46 AM PDT}
