Uncommon Descent Serving The Intelligent Design Community

Is functional information in DNA always conserved? (Part two)


So, in the first part of this discussion, I tried to show with real data from the scientific literature how much of the human genome is conserved, and how that conservation is evaluated and expressed. Then I argued that we already have good, credible evidence for function in a relevant part of the human genome (let’s say about 20%), that most of that functional part is non coding, and that a great part of it is not conserved. While some may disagree on the exact figures, I think that it is really difficult to reject the whole argument.

But, as I anticipated, there are two more important aspects of the issue that I want to discuss in detail. I will do it now.

3) Conserved function which does not imply conserved sequence.

The reason why sequence is conserved when function is present is that function imposes specific constraints on the sequence itself.

For example, in a protein sequence with a well defined biochemical function, some variation will be possible without affecting the protein function, while other kinds of variation will affect it to a greater or lesser degree.

We have many examples of serious loss of function caused by the change of even one amino acid: Mendelian diseases in humans are a well known, unpleasant example of that.

We also have many examples of substantial variation in the sequence of functional proteins which does not affect the function: the so called neutral variations in proteins. For example, there are many variants of human hemoglobin, more than 1000, most of them caused by a single amino acid substitution. While many of them cause some disease, or at least some functional modification of the protein, at least a few of them are completely silent clinically, both in the heterozygote and in the homozygote state.

Now, there is an important consequence of that. Neutral variation also happens in functional sequences, although it happens less in those sequences. How much neutral variation can be tolerated by a functional sequence depends on the sequence. For proteins, it is well known that some of them can vary a lot while retaining the same structure and function, while others are much more functionally constrained. Therefore, over the same span of time, different functional proteins will be conserved to different degrees.

What about non coding genes? While we understand much (but not all) of the sequence-structure-function relationship for proteins, here we are almost wholly ignorant. Non coding genes, when they are functional, act in very different ways, most of them not well understood. Many of them are transcribed, and we don’t understand much of the structure of the transcribed RNAs, least of all their sequence-structure-function relationship. IOWs, we have no idea of how functionally constrained the sequence of a functional non coding DNA element is.

While searching for pertinent literature about this issue, I have found this very recent, interesting paper:

Evolutionary conservation of long non-coding RNAs; sequence, structure, function.

The abstract (all emphasis is mine):

BACKGROUND:

Recent advances in genomewide studies have revealed the abundance of long non-coding RNAs (lncRNAs) in mammalian transcriptomes. The ENCODE Consortium has elucidated the prevalence of human lncRNA genes, which are as numerous as protein-coding genes. Surprisingly, many lncRNAs do not show the same pattern of high interspecies conservation as protein-coding genes. The absence of functional studies and the frequent lack of sequence conservation therefore make functional interpretation of these newly discovered transcripts challenging. Many investigators have suggested the presence and importance of secondary structural elements within lncRNAs, but mammalian lncRNA secondary structure remains poorly understood. It is intriguing to speculate that in this group of genes, RNA secondary structures might be preserved throughout evolution and that this might explain the lack of sequence conservation among many lncRNAs.

SCOPE OF REVIEW:

Here, we review the extent of interspecies conservation among different lncRNAs, with a focus on a subset of lncRNAs that have been functionally investigated. The function of lncRNAs is widespread and we investigate whether different forms of functionalities may be conserved.

MAJOR CONCLUSIONS:

Lack of conservation does not imbue a lack of function. We highlight several examples of lncRNAs where RNA structure appears to be the main functional unit and evolutionary constraint. We survey existing genomewide studies of mammalian lncRNA conservation and summarize their limitations. We further review specific human lncRNAs which lack evolutionary conservation beyond primates but have proven to be both functional and therapeutically relevant.

GENERAL SIGNIFICANCE:

Pioneering studies highlight a role in lncRNAs for secondary structures, and possibly the presence of functional “modules”, which are interspersed with longer and less conserved stretches of nucleotide sequences. Taken together, high-throughput analysis of conservation and functional composition of the still-mysterious lncRNA genes is only now becoming feasible.

 

So, what are we talking about here? The point is simple. Function in non coding DNA can be linked to specific structures in RNA transcripts, and those structures, and therefore their function, can be conserved across species even in the absence of sequence conservation. Why? Because the sequence/structure/function relationship in this kind of molecule is completely different from what we observe in proteins, and we still understand very little of those issues.
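To make the idea concrete, here is a minimal Python sketch of two RNA hairpins that fit the same secondary structure while sharing few nucleotides. The sequences and the dot-bracket structure are invented for illustration and are not taken from the paper. Checking base-pair compatibility is also much weaker than predicting how a real transcript folds, so this only shows that the logic is possible, not that any given lncRNA behaves this way.

```python
# Toy illustration: compensatory changes can preserve a stem-loop
# while the primary sequence diverges. All data here are invented.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(dot_bracket):
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(seq, dot_bracket):
    """True if every paired position in the structure can base-pair in seq."""
    if len(seq) != len(dot_bracket):
        return False
    return all((seq[i], seq[j]) in PAIRS for i, j in paired_positions(dot_bracket))


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


structure = "((((((....))))))"   # 6 bp stem, 4 nt loop (hypothetical)
seq_a = "GGCACUGAAAAGUGCC"       # invented "species A" hairpin
seq_b = "CAUGGCGAAAGCCAUG"       # invented "species B" hairpin

print(is_compatible(seq_a, structure), is_compatible(seq_b, structure))
print(f"sequence identity: {percent_identity(seq_a, seq_b):.0f}%")
```

In this toy case both sequences fit the same hairpin, yet only the loop positions are identical. A sequence-based conservation score would rate the pair as poorly conserved even though the structure is unchanged.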

As the authors say:

In contrast to microRNAs, almost all of which are post-transcriptional repressors, the diverse functions of lncRNAs include both positive and negative regulations of protein-coding genes, and range from lncRNA:RNA and lncRNA:protein to lncRNA:chromatin interactions [8–11]. Due to this functional diversity, it seems reasonable to presume that different evolutionary constraints might be operative for different RNAs, such as mRNAs, microRNAs, and lncRNAs.

Which is exactly my point.

The authors examine a few cases where the sequence/structure/function relationship of some lncRNAs has been studied in more detail. They conclude:

Tens of thousands of human lncRNAs have been identified during the first genomic decade. Functional studies for most of these lncRNAs are however still lacking with only a handful having been characterized in detail [8,10,11,87]. From these few studies it is apparent that some lncRNAs are important cellular effectors ranging from splice complex formation [34] to chromatin and chromosomal complex formation [43,46] to epigenetic regulators of key cellular genes.

It is becoming increasingly apparent that lncRNAs do not show the same pattern of evolutionary conservation as protein-coding genes. Many lncRNAs have been shown to be evolutionary conserved [5]; but they do not appear to exhibit the same evolutionary constraints as mRNAs of protein-coding genes.

While certain regions of the lncRNAs appear to maintain the regulatory function, such as bulges and loops, the exact sequence in other regions of lncRNAs appear less important and possibly act as spacers in order to link functional units or modules. Depending on the function, e.g., whether the RNA sequence is a linker or a functional module, different patterns of conservation might be expected.

It is important to remember that lncRNA genes are only a part of non coding DNA. If someone wonders how big a part, I would suggest the following paper:

The Vast, Conserved Mammalian lincRNome

which estimates the number of human lncRNA genes at 53,649, more than twice the number of protein coding genes, corresponding to about 2.7% of the whole genome (Figure 2). It’s an important part, but only a part. And it is a part which, while probably functional in many cases, is still poorly conserved at the sequence level.

Other parts of the non coding genome will have different types of function, structure, and therefore sequence conservation. For example, the following paper:

Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites

argues that most conserved non coding regions (about 3.5% of the genome, conserved across vertebrate phylogeny, which strongly suggests functional importance, and clustering into >700 000 unannotated conserved islands, 90% of which are <200 bp) “serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers”, rather than encoding non-coding RNAs. IOWs, these short sequences in the non coding genome, which make up another 3.5% of the total, would be functional not through an RNA transcript, but directly as binding sites (enhancers and other distal regulatory elements). Now, these sequences are conserved. That illustrates the general point: different functions, different relationships between sequence and function, different conservation of functional elements. In general, it seems that function which expresses itself through non coding RNA transcripts is less conserved at the sequence level.
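As a quick sanity check on those figures, the short Python calculation below estimates the mean length of a conserved island. The haploid genome size of 3.1 billion bp is my assumption and is not stated in the paper summary above; the function name is only illustrative.

```python
# Back-of-the-envelope check: if ~3.5% of the genome falls into >700,000
# conserved islands, how long is an island on average?

ASSUMED_GENOME_BP = 3_100_000_000  # assumption: approximate human genome size


def mean_element_length_bp(genome_fraction, n_elements, genome_bp=ASSUMED_GENOME_BP):
    """Mean length of an element, given the genome fraction covered and the count."""
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    return genome_fraction * genome_bp / n_elements


mean_len = mean_element_length_bp(0.035, 700_000)
print(f"mean island length: at most about {mean_len:.0f} bp")
```

Because the paper reports more than 700 000 islands, the result is an upper bound on the mean. Under the assumed genome size it comes out at roughly 155 bp, which fits the statement that 90% of the islands are shorter than 200 bp.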

And now, the last point, maybe the most important of all.

4) Function which requires non conservation of sequence.

When we analyze conservation of sequences across species as an indicator of function, we are forgetting a fundamental point: in the course of natural history, species change, and function changes with them.

IOWs, the reason why species are different is that they have different molecular functions.

So, there is some implicit contradiction in equating conservation with function. A conserved sequence is very likely to be functional, but it is not true that a function needs a conserved sequence, if it is a new function, or a function which has changed.

Now, we know that protein coding genes have not changed a lot in the last parts of natural history. It is usually recognized that the greatest change, especially in more recent taxa, is probably regulatory. And the functions which have been identified in various parts of non coding DNA are exactly that: regulatory.

So, to sum up:

– Species evolve and change

– The main tool for that change is, realistically, a change in regulatory functions

– If a function changes, the sequences on which the function is based must change too

– Therefore, those important regulatory functions which change for functional reasons will not be conserved across species

This point is different from the previous point discussed here.

In point 3, the reasoning was that the same function can be conserved even if the sequence changes, provided that the structure is conserved.

In point 4, we are saying that in many cases the sequence must change for the function to change with it.

Now, although this reasoning is quite logical and convincing, I will try to back it up with empirical observations. For that purpose, I will use two different models: HARs and the results of the recent FANTOM5 paper about the promoterome.

4a) Human Accelerated Regions (HARs).

What are HARs? Let’s take it from Wikipedia:

Human accelerated regions (HARs), first described in August 2006,  are a set of 49 segments of the human genome that are conserved throughout vertebrate evolution but are strikingly different in humans.

IOWs, they are sequences which were conserved up to the primates, and which have changed in humans.

Are they functional? That’s what is believed for some of them. Again, Wikipedia:

Several of the HARs encompass genes known to produce proteins important in neurodevelopment. HAR1 is an 106-base pair stretch found on the long arm of chromosome 20 overlapping with part of the RNA genes HAR1F and HAR1R. HAR1F is active in the developing human brain. The HAR1 sequence is found (and conserved) in chickens and chimpanzees but is not present in fish or frogs that have been studied. There are 18 base pair mutations different between humans and chimpanzees, far more than expected by its history of conservation.[1]

HAR2 includes HACNS1 a gene enhancer “that may have contributed to the evolution of the uniquely opposable human thumb, and possibly also modifications in the ankle or foot that allow humans to walk on two legs”. Evidence to date shows that of the 110,000 gene enhancer sequences identified in the human genome, HACNS1 has undergone the most change during the evolution of humans following the split with the ancestors of chimpanzees.[4] The substitutions in HAR2 may have resulted in loss of binding sites for a repressor, possibly due to biased gene conversion

Now, for brevity, I will not go into details, but… “active in the developing human brain” and “may have contributed to the evolution of the uniquely opposable human thumb, and possibly also modifications in the ankle or foot that allow humans to walk on two legs” are provocative enough thoughts, and I believe that I don’t need to comment on them.

The important point is: what makes us humans different from chimps? Logic says: something which is different. Not something which is conserved.
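To give a rough sense of what “far more than expected” means for HAR1, here is a small Python calculation using the numbers in the Wikipedia quote (18 differences in a 106 bp stretch). The 1.4% genome-wide human/chimp difference is borrowed from the figure quoted in the comment thread below and is used here only as an assumed baseline. For a region that was strongly conserved before, the true expectation would be lower still.

```python
# Rough comparison of HAR1 divergence with a genome-wide baseline.
# Inputs: 18 differences and 106 bp as given in the quoted Wikipedia text;
# the 1.4% baseline is an assumption taken from the comment thread.

HAR1_LENGTH_BP = 106
HAR1_DIFFERENCES = 18
ASSUMED_GENOME_WIDE_DIVERGENCE = 0.014


def substitution_density(differences, length_bp):
    """Fraction of positions that differ between the two species."""
    return differences / length_bp


def expected_differences(length_bp, divergence):
    """Differences expected if the region diverged at the baseline rate."""
    return length_bp * divergence


observed = substitution_density(HAR1_DIFFERENCES, HAR1_LENGTH_BP)
expected = expected_differences(HAR1_LENGTH_BP, ASSUMED_GENOME_WIDE_DIVERGENCE)
fold = HAR1_DIFFERENCES / expected

print(f"observed divergence in HAR1: {observed:.1%}")
print(f"differences expected at the baseline rate: {expected:.1f}")
print(f"observed / expected: about {fold:.0f}-fold")
```

Under these assumptions, the region carries roughly twelve times as many differences as an average stretch of the same length would.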

4b) The results from FANTOM5 about the promoterome.

FANTOM5 has very recently published a series of papers with very important results. One of the most important is probably the following article in Nature:

A promoter-level mammalian expression atlas

Unfortunately, the article is paywalled. I have access to it, so I will try to sum up the points which are needed for my reasoning.

So, what did they do? In brief, they used a very powerful technology, cap analysis of gene expression (CAGE), to study various aspects of the transcriptome in different human cells from different tissues and states. This is probably the most important analysis of the human transcriptome ever carried out.

This particular paper focuses on a “promoter atlas”, IOWs an atlas of the expression of promoters (transcription start sites, TSSs, which control the transcription of target genes) in different tissues.

So, according to the level of expression of those promoters in different tissues and cells, they classify genes (both protein coding and non protein coding) in:

– ubiquitous-uniform (‘housekeeping’, 6%): those genes which are expressed at similar levels in most cell types

– ubiquitous non-uniform (14%): expressed in most cell types, but at different levels

– non-ubiquitous (cell-type restricted, 80%)

Each of those types includes both  C (protein coding genes) and N (non protein coding genes).

Now, that’s very interesting. We now know that most genes (80%), both coding and non coding, are expressed only in some cell types.
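The three-way split above can be sketched in a few lines of Python. This is only an illustration of the idea of classifying by expression breadth and uniformity: the thresholds, the cell types and the expression values are all hypothetical, and they are not the criteria or data used in the FANTOM5 paper.

```python
# Illustrative classifier for expression breadth. Thresholds are hypothetical
# and chosen only to make the three categories concrete.

def classify_promoter(expression, detect_threshold=1.0,
                      breadth_cutoff=0.5, fold_cutoff=10.0):
    """Assign a promoter to one of three expression-breadth classes.

    expression: dict mapping cell type -> expression level (arbitrary units).
    detect_threshold must be positive, so the fold ratio is well defined.
    """
    if not expression:
        raise ValueError("expression profile is empty")
    if detect_threshold <= 0:
        raise ValueError("detect_threshold must be positive")
    levels = list(expression.values())
    detected = [v for v in levels if v >= detect_threshold]
    breadth = len(detected) / len(levels)
    if breadth < breadth_cutoff:
        return "non-ubiquitous"
    # Expressed in most cell types: uniform if levels stay within fold_cutoff.
    if max(detected) / min(detected) <= fold_cutoff:
        return "ubiquitous-uniform"
    return "ubiquitous non-uniform"


# Invented example profiles.
housekeeping = {"neuron": 50, "hepatocyte": 60, "fibroblast": 55, "T cell": 45}
variable = {"neuron": 500, "hepatocyte": 5, "fibroblast": 20, "T cell": 8}
restricted = {"neuron": 80, "hepatocyte": 0, "fibroblast": 0.2, "T cell": 0}

for name, profile in [("housekeeping", housekeeping),
                      ("variable", variable),
                      ("restricted", restricted)]:
    print(name, "->", classify_promoter(profile))
```

The point of the sketch is only that “ubiquitous” is about how many cell types express a promoter, while “uniform” is about how similar the levels are among those that do.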

But the most interesting thing, for our discussion about conservation, is that they studied the promoter expression both in human cells and in other mammals.

Now, we must look at Figure 3 in the paper. For those who cannot access the article, there is a low resolution version of this figure here (just click on Figure 3 in the “at a glance” box; OK, OK, it’s better than nothing!).

The figure is divided into two parts, a and b. In each part, the x axis shows the evolutionary divergence from humans (from 0 to 0.8; the grey vertical lines correspond to macaque, dog and mouse). The y axis shows “Human TSS with aligning orthologous sequence (%)”, IOWs the % conservation of each group of genes in the graph at various points of evolutionary divergence. Each line represents a different group of genes. So, the lines which remain more “horizontal” represent groups of genes which are more conserved, while those which “go down” from left to right are the less conserved ones. I hope it’s clear.
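For readers who prefer code to graphs, here is a minimal Python sketch of how a y value of that kind could be computed for one group of TSSs and one species. The two groups and their alignment results are made up purely to show the calculation, and they have nothing to do with the actual data in the figure.

```python
# Toy computation of "human TSS with aligning orthologous sequence (%)".
# Each TSS is represented by the set of species in which its sequence aligns.
# All values below are invented.

def percent_with_ortholog(aligned_species_per_tss, species):
    """Percentage of TSSs in a group whose sequence aligns in the given species."""
    if not aligned_species_per_tss:
        raise ValueError("no TSSs given")
    hits = sum(species in aligned for aligned in aligned_species_per_tss)
    return 100.0 * hits / len(aligned_species_per_tss)


group_conserved = [
    {"macaque", "dog", "mouse"},
    {"macaque", "dog", "mouse"},
    {"macaque", "dog"},
    {"macaque", "dog", "mouse"},
]
group_divergent = [
    {"macaque"},
    {"macaque", "dog"},
    set(),
    {"macaque"},
]

# Species listed in order of increasing divergence from human.
for species in ("macaque", "dog", "mouse"):
    flat = percent_with_ortholog(group_conserved, species)
    steep = percent_with_ortholog(group_divergent, species)
    print(f"{species}: conserved group {flat:.0f}%, divergent group {steep:.0f}%")
```

Plotting those percentages against divergence gives one line per group: a nearly flat line for the first toy group and a steeply falling one for the second.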

On the left (part a) genes are grouped as above: ubiquitous-uniform, etc, each category divided into C or N (coding or non coding).

What are the conserved groups? In order:  Non-ubiquitous C (green line); Ubiquitous uniform C (orange line); Ubiquitous non-uniform C (purple line).

IOWs, coding genes are more conserved, and non-ubiquitous coding genes are the most conserved.

That is not news.

Conversely, non coding genes are less conserved, in this order: Non-ubiquitous N (lighter green); Ubiquitous non-uniform N (lighter purple); Ubiquitous uniform N (lighter orange). This last line is definitely less conserved than the random reference (the dotted line).

This part is “Conservation by expression breadth and annotation”.

Well, what is on the right (part b)? It is “Conservation by cell-type biased expression”.

IOWs, the graph is the same, but genes are grouped in different lines according to the cell type where they are preferentially expressed.

The most conserved groups? Those with preferential expression in:  Fibroblast of periodontium, Fibroblast of gingiva, Preadipocyte, Chondrocyte, Mesenchymal cell.

The least conserved? Those with preferential expression in:  Astrocyte, Hepatocyte, Neuron, Sensory epithelial cell, Macrophage, T-cell, Blood vessel endothelial cell. In decreasing conservation order.

Does that mean something?  I leave it to you to decide. For me, I definitely see a pattern. With all due respect for fibroblasts and adipocytes, neurons and T cells smell more of specialized cells which must change in higher taxa (excuse me, Piotr, mice will accuse me of not being politically correct).

So, my humble suggestion is: the things that change more are not necessarily those less functional. In many cases, they could be exactly the opposite: the bearers of new, more complex functions.

And non coding genes are very good candidates for that role.

Comments
Gpuccio:
Your strange “summary” does not even appear to be a comment, and frankly I can’t see any resemblance to what I have tried to argue here in it.
OK, I refer to statements like these:
IOWs, the current theory that most variation is neutral and that most of human genome is non functional is simply wrong and obstinately ignores the problem of where the procedures are written...
I don’t think that RNA’s function is really sequence independent, but that we don’t know how much and how it is sequence dependent.
That means that there can be an intricate regulation network that we not only don’t understand, but essentially don’t see.
The picture that emerges is one of a genome full of functional "dark matter" -- an intricate regulatory network, essentially invisible and incomprehensible to us, but presumably involved in some important activities like cortical development. Let's imagine that such a network (spanning most of the genome) is real, and let's call it the imaginome ®. What about mutational load? The imaginome would be an easy target for mutations, and one would expect many of them to have deleterious effects. I shall have more questions later on.
Piotr
May 23, 2014, 01:14 PM PDT
Well, I still don't understand it, Joe; sounds a bit like magic to me.
wd400
May 23, 2014, 12:30 PM PDT
wd400- the information is stored on the physical space available. The function is to serve as a physical space to store immaterial information.
Joe
May 23, 2014, 12:02 PM PDT
wd400- In a design scenario there would be at least two different types or classes of information. The first is the Crick class- functional sequence specificity in both nucleotides and proteins. That is the material representation of the immaterial information. The second class would be what directs all of that- as in it just doesn't happen due to physics and chemistry. That class is immaterial.
Joe
May 23, 2014, 11:58 AM PDT
Joe,
You can’t see the immaterial information, ie software
How is this immaterial information stored if not in sequences?
The nucleotides = available/ blank disc space or RAM- It ain’t the information, it’s just a place for the information to reside.
So their function is to one day serve a function? You still have mutational load from rearrangements and indels to deal with, I'm afraid.
wd400
May 23, 2014, 11:56 AM PDT
wd400:
I still don’t get it, how can something that is “running the show” also just be a random string of nucleotides?
Umm nucleotides are hardware. You can't see the immaterial information, ie software. You can only see its effects.
Again, if any old string of nucleotides can help to “run the show” then biological function is pretty easy to come by?
The nucleotides = available/ blank disc space or RAM- It ain't the information, it's just a place for the information to reside.
Joe
May 23, 2014, 11:52 AM PDT
Piotr: There is misrepresentation and misrepresentation. My arguments may be strange, but I have tried to detail them as much as possible, and to take in serious consideration the (few) comments and objections. Your strange "summary" does not even appear to be a comment, and frankly I can't see any resemblance to what I have tried to argue here in it. However, I appreciate your attention.
gpuccio
May 23, 2014, 11:27 AM PDT
Gpuccio: As Joe Felsenstein kept patiently explaining on Sandwalk (and chastising everybody, including Larry, for their "fixation on fixation"), the number of fixed mutations doesn't matter at all, because the differences between human and chimp reference sequence may well be between alleles that are not yet fixed in either species (and of course many of those that are fixed now were at polymorphic loci in the Homo/Pan MRCA). But Larry's purpose was not to prove that hominin evolution has been neutral. He wanted to show that even random drift alone (the evolutionary "noise") may realistically produce a difference of the observed order in the time available. He made his point. I find your argument strange. We don't see it, we don't understand it, we don't know to what extent it may be conserved, we have no direct evidence of its existence: therefore it must be real, intricate, and important. Looks more like wishful thinking than logic to me.
Piotr
May 23, 2014, 11:17 AM PDT
Joe, I still don't get it, how can something that is "running the show" also just be a random string of nucleotides? Again, if any old string of nucleotides can help to "run the show" then biological function is pretty easy to come by?
wd400
May 23, 2014, 10:36 AM PDT
Piotr: Maybe I am misrepresenting Moran. If that is the case, I apologize to him and to you. But frankly, I don't think that is true. I can only, in complete honesty, say again how I read his argument.
a) He assumes a specific hypothesis which generates a prediction: "Evolutionary theory predicts that the rate of change should correspond to the mutation rate since most of the differences are due to neutral substitutions in junk DNA." So, the hypothesis is that "most of the differences are due to neutral substitutions in junk DNA", IOWs, that most DNA is junk, and therefore most of the mutations happen in non functional elements, and are therefore neutral. The prediction is that "the rate of change should correspond to the mutation rate". IOWs that if he measures, independently, the mutation rate, and the observed number of fixed mutations, he should find a good correspondence. That's, again, because he believes that his assumption, that most of the mutations happen in non functional elements, and are therefore neutral, is true. IOWs, Moran is making a prediction, then goes to verify it, to confirm his assumption.
b) He evaluates the mutation rate: "We know that the mutation rate is about 130 mutations per generation based on our knowledge of biochemistry. This rate has been confirmed by direct sequencing of parents and children "
c) He goes on with the verification: "Let’s see if it works."
d) He evaluates the observed number of fixed mutations as follows: "The human and chimp genomes are 98.6% identical or 1.4% different. That difference amounts to 44.8 million base pairs distributed throughout the entire genome. If this difference is due to evolution then it means that 22.4 million mutations have become fixed in each lineage (humans and chimp) since they diverged about five million years ago. The average generation time of chimps and humans is 27.5 years. Thus, there have been 185,200 generations since they last shared a common ancestor if the time of divergence is accurate. (It's based on the fossil record.) This corresponds to a substitution rate (fixation) of 121 mutations per generation and that's very close to the mutation rate as predicted by evolutionary theory."
e) He concludes that the two numbers are extremely similar: "Now, I suppose that this could be just an amazing coincidence. Maybe it's a fluke that the intelligent designer introduced just the right number of changes to make it look like evolution was responsible. Or maybe the IDiots have a good explanation that they haven't revealed?"
So, maybe he has other arguments to conclude what he concludes, but the reasoning here is clear enough: my prediction has been confirmed, and only IDiots can believe that this is a coincidence. Therefore, the verification of my prediction confirms my assumptions. What were the assumptions? "most of the differences are due to neutral substitutions in junk DNA". That's my understanding of what he says.
gpuccio
May 23, 2014, 10:33 AM PDT
wd400- You just don't get it. In a design scenario there is software running the show. That software needs a place to reside in/ on the DNA. The RNAs formed from the DNA are akin to the data, ie ones and zeros, on computer busses. Some data is in/ on the functioning/ coding parts. Some of it is in the other regions.
Joe
May 23, 2014, 10:32 AM PDT
wd400: I don't think that RNA's function is really sequence independent, but that we don't know how much and how it is sequence dependent. It can be true that we should see some sequence conservation for those functions which do not change for functional reasons, but I am not sure at all that the context you quoted (population datasets) is sensitive enough to detect that signal. I think that we need more data to understand really what is functional, what is conserved, what is functional in different ways, and what expresses quickly changing functions. Moreover, the analysis should be as specific as possible for different types of non coding DNA, because they certainly behave very differently.
gpuccio
May 23, 2014, 10:09 AM PDT
In what sense are those nts storing something? A 10nt gap sounds like an empty box to me?
wd400
May 23, 2014, 09:19 AM PDT
wd400:
If functional sequences can remain functional while accepting any-old mutation then the sequence-space must be full of biological function.
Only some functional sequences can remain functional while accepting an old mutation. If all you need is a 10 nucleotide-long space for storage, then all you need is ten nucleotides that are not already doing something. I can store things in a cardboard box. I can also store things in a milk crate, wooden crate, plastic storage bin, etc.
Joe
May 23, 2014, 09:16 AM PDT
Just to discuss, suppose that some sequence stores information through a different code that we don’t understand...
If this 'code' was sequence specific then we'd detect the signatures of purifying selection (decreased diversity, skewed allele frequencies....) if it was in operation. If functional sequences can remain functional while accepting any-old mutation then the sequence-space must be full of biological function.
wd400
May 23, 2014, 08:58 AM PDT
Piotr, Seeing that Larry Moran cannot produce a testable hypothesis for unguided evolution producing a bacterial flagellum, no one really cares what he says as most likely it is meaningless to science.
Joe
May 23, 2014, 06:51 AM PDT
gpuccio @ 74
There is an important point which should be considered. We are probably really underestimating the importance of short molecules in functional regulation. Both peptides and RNAs. Indeed, while basic biochemical function usually requires long molecules, the regulation of what those long molecules do can easily be obtained by short molecules. That means that there can be an intricate regulation network that we not only don’t understand, but essentially don’t see.
I could not have said it better, hence I have nothing to add to what you wrote, except to remind ourselves that we ain't seen nothing yet... more discoveries are coming, the best part is still ahead :) Let's enjoy it!
Dionisio
May 23, 2014, 04:02 AM PDT
You misrepresent Larry's position. He doesn't say or imply that "neutral theory is correct" = "most of the genome must be junk". He does assume that most of human DNA is junk and therefore the separate evolution of humans and chimps has been neutral for the most part, but this assumption is based on various other evidence which he discusses elsewhere (including e.g. mutational load).
Piotr
May 23, 2014, 12:19 AM PDT
Piotr at #75:
It gets even worse for the IDiots. Evolutionary theory predicts that the rate of change should correspond to the mutation rate since most of the differences are due to neutral substitutions in junk DNA. ... If evolutionary theory (population genetics) is correct, and if David Klinghoffer and chimps/bonobos actually evolved from a common ancestor, then we should observe a correspondence between the percent similarity of Klinghoffer and chimps and the predicted number of changes due to evolution. Let's see if it works. ... The average generation time of chimps and humans is 27.5 years. Thus, there have been 185,200 generations since they last shared a common ancestor if the time of divergence is accurate. (It's based on the fossil record.) This corresponds to a substitution rate (fixation) of 121 mutations per generation and that's very close to the mutation rate as predicted by evolutionary theory. Now, I suppose that this could be just an amazing coincidence. Maybe it's a fluke that the intelligent designer introduced just the right number of changes to make it look like evolution was responsible. Or maybe the IDiots have a good explanation that they haven't revealed?
Emphasis mine.
gpuccio
May 22, 2014, 11:53 PM PDT
Mung at #70: OK, I was not really "responding" to you, just using something you had said as a starting point to clarify some concepts. :)
gpuccio
May 22, 2014, 11:44 PM PDT
wd400 at #66: Just to discuss, suppose that some sequence stores information through a different code that we don't understand. In the genetic code (protein coding genes) we know that some mutations do not change the meaning of the coded information. Those are called synonymous, are considered by definition neutral (even if now we know that it is not always true), and they are not considered a sign that the sequence is not functional. Instead, we consider the rate of non synonymous mutations as an indicator of how many neutral mutations that sequence can tolerate, and therefore of the purifying selection acting on it. But if the code were different, we should reason differently. Other types of mutations would be "synonymous", and others would be "non synonymous". I simply mean that we cannot by default apply the concepts derived from protein coding genes to non protein coding genes, or to any sequence of which we don't understand the sequence/structure/function relationship.
gpuccio
May 22, 2014, 11:42 PM PDT
My point was, instead, that Moran uses the assumption that most mutations are neutral to prove/suggest that most of the genome is non functional...
Again, I have to ask you: where exactly does Larry Moran prove/suggest such a thing? I somehow find it hard to imagine him guilty of such a gross non sequitur. N(early n)eutral theory explains why the accumulation of junk DNA is possible; it doesn't claim that most of the genome must be junk. It predicts that junk content may be highly variable, from almost none to truckloads of it. Of course if "the genome" is specifically the human genome (or indeed any typical eukaryote genome, leaving aside special cases like the bladderwort and the pufferfish), there are well known reasons to argue that a lot of it consists of junk DNA. But it isn't a conclusion drawn directly from neutral theory. I am sure Larry Moran must have mentioned some other evidence. By the way, if there is a pub anywhere in the world where genome researchers meet for a pint or two after work, "The Bladderwort and the Pufferfish" would be a terrific name for it.
Piotr
May 22, 2014, 11:38 PM PDT
Dionisio at #64: Really interesting (and the article here is free!). There is an important point which should be considered. We are probably really underestimating the importance of short molecules in functional regulation. Both peptides and RNAs. Indeed, while basic biochemical function usually requires long molecules, the regulation of what those long molecules do can easily be obtained by short molecules. That means that there can be an intricate regulation network that we not only don't understand, but essentially don't see.
gpuccio
May 22, 2014, 11:32 PM PDT
wd400 at #61: I understand you are answering Joe's point about function that does not depend on sequence (such as spacer function). I would like to mention, however, that my point 3) was not about that, but rather about functions which allow different forms of sequence variation than those we observe in protein coding genes, because the sequence/structure/function relationship is different. For example, the sequence/structure relationship in regulatory RNAs is certainly different from that in coding DNA: the first is related to the direct sequence of nucleotides and to the way it determines the final structure of the molecule according to chemical laws. The second is related to the symbolic meaning of codons, and therefore to the chemical properties of another type of molecule. That certainly makes a difference. Just as an example, in genes which code for RNAs we cannot use the concept of synonymous mutations. In general, the problem is that we understand less of how regulatory RNAs work, and therefore of how variation of sequence affects the functional space.
gpuccio
May 22, 2014, 11:19 PM PDT
Piotr at #60: I am aware of that. I never said that Moran denied the importance of adaptive evolution. That was probably VJ's impression in the beginning, and VJ is probably less cynical than I am. I know that non design thinkers will always go back to NS to try to make sense, even if they strangely evade the subject when asked for detailed models or empirical support for that point. :) My point was, instead, that Moran uses the assumption that most mutations are neutral to prove/suggest that most of the genome is non functional, which is circular reasoning. As I have tried to show, if an important part of the mutations is not neutral, but functional in a way we still do not understand, then an important part of the variation must be subtracted from the observed mutations to get the true observed neutral fixation rate. That observed rate would then be definitely lower than what we would expect from the total mutation rate, which would point to an important part of the genome being functional and therefore under purifying selection. My point is: the argument is circular anyway. We cannot understand if the genome is functional or not by looking at the apparent rate of fixation, unless we know if that fixation is due to function or not. IOWs, unless we already know how much of the genome is functional.
gpuccio
May 22, 2014, 11:09 PM PDT
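The subtraction argument in the comment above can be sketched numerically. The sketch assumes the simple neutral-theory relation that the neutral fixation rate equals the total mutation rate multiplied by the fraction of mutations that are neutral. The observed figure of 121 fixations per generation comes from the passage quoted earlier in the thread. The total mutation rate of 130 per generation and the candidate "functional shares" are hypothetical values chosen only for illustration, and the function names are likewise illustrative.

```python
def neutral_fixation_rate(observed_fixations, functional_share):
    """Fixations left after removing the share assumed to be functional."""
    if not 0.0 <= functional_share <= 1.0:
        raise ValueError("functional_share must be between 0 and 1")
    return observed_fixations * (1.0 - functional_share)


def implied_neutral_fraction(observed_fixations, functional_share, mutation_rate):
    """Fraction of mutations that must be neutral to give that neutral rate.

    Assumes neutral fixation rate = mutation_rate * neutral fraction.
    """
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return neutral_fixation_rate(observed_fixations, functional_share) / mutation_rate


if __name__ == "__main__":
    OBSERVED_FIXATIONS = 121      # from the quoted passage
    ASSUMED_MUTATION_RATE = 130   # hypothetical, not stated in this thread
    for share in (0.0, 0.2, 0.5):  # hypothetical functional shares
        fraction = implied_neutral_fraction(
            OBSERVED_FIXATIONS, share, ASSUMED_MUTATION_RATE
        )
        print(f"functional share {share:.1f} -> implied neutral fraction {fraction:.2f}")
```

The larger the share of observed fixations attributed to function, the smaller the neutral fraction implied by the same data, which is why the inference depends on knowing that share in advance.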
Partially off topic: Creation.com has a new series of articles on the information issue here: http://creation.mobi/a/9559 Here is the opening paragraph of the last article in the series. Maybe there is some helpful information there for IDers. Any thoughts?
In parts 1 and 2 of this series the work of various information theoreticians was outlined, and reasons were identified for needing to ask the same questions in a different manner. In Part 3 we saw that information often refers to many valid ideas but that the statements reflect we are not thinking of a single entity, but a system of discrete parts which produce an intended outcome by using different kinds of resources. We introduced in Part 3 the model for a new approach, i.e. that we are dealing with Coded Information Systems (CIS). Here in Part 4 the fundamental theories for CIS Theory are presented and we show that novel conclusions are reached.
tjguy
May 22, 2014, 11:08 PM PDT
gpuccio @ 32, were you responding to my post @ 28? That post was written to respond to something Piotr had written in a post @ 13. So if you were thinking it was a response to something you had written, it wasn't. :)
Mung
May 22, 2014, 07:56 PM PDT
gpuccio:
So, my point is: we don’t know why sequences change. The default explanation that they change mostly for random neutral mutations is just that, a default explanation which simply ignores the possibility that they change for functional reasons,
Masatoshi Nei:
In other words, we could use the neutral theory as a null hypothesis for studying molecular evolution.
Mung
May 22, 2014, 07:36 PM PDT
wd400:
If that were the case there would need to be some mechanism by which the “storage” sequences were shielded from mutation, which would show up in these sorts of studies.
No, storage space is NOT sequence specific. It is just a physical place along the DNA sequence that isn't used for anything else.
Joe
May 22, 2014, 06:04 PM PDT
Dionisio @ 62
Principles and Properties of Eukaryotic mRNPs
Sarah F. Mitchell, Roy Parker
DOI: http://dx.doi.org/10.1016/j.molcel.2014.04.033
The proper processing, export, localization, translation, and degradation of mRNAs are necessary for regulation of gene expression. These processes are controlled by mRNA-specific regulatory proteins, noncoding RNAs, and core machineries common to most mRNAs. These factors bind the mRNA in large complexes known as messenger ribonucleoprotein particles (mRNPs). Herein, we review the components of mRNPs, how they assemble and rearrange, and how mRNP composition differentially affects mRNA biogenesis, function, and degradation. We also describe how properties of the mRNP “interactome” lead to emergent principles affecting the control of gene expression.
These processes are controlled by mRNA-specific regulatory proteins, noncoding RNAs, and core machineries common to most mRNAs. Where do these mRNA-specific regulatory proteins, noncoding RNAs, and core machineries common to most mRNAs come from? How?
Dionisio
May 22, 2014, 05:51 PM PDT