
Breaking: A “junk DNA” jumping gene is critical for embryo cell development

Not junk: 'Jumping gene' is critical for early embryo
Two-cell mouse embryo stained for LINE1 RNA (magenta). Ramalho-Santos lab, UCSF

This was discovered by someone who was skeptical of the idea that our genomes are largely useless junk. From Nicholas Weiler at Phys.Org:

A so-called “jumping gene” that researchers long considered either genetic junk or a pernicious parasite is actually a critical regulator of the first stages of embryonic development, according to a new study in mice led by UC San Francisco scientists and published June 21, 2018 in Cell.

Only about 1 percent of the human genome encodes proteins, and researchers have long debated what the other 99 percent is good for. Many of these non–protein coding regions are known to contain important regulatory elements that orchestrate gene activity, but others are thought to be evolutionary garbage that is just too much trouble for the genome to clean up.

For example, fully half of our DNA is made up of “transposable elements,” or “transposons,” virus-like genetic material that has the special ability of duplicating and reinserting itself in different locations in the genome, which has led researchers to dub them genetic parasites. Over the course of evolution, some transposons have left hundreds or thousands of copies of themselves scattered across the genome. While most of these stowaways are thought to be inert and inactive, others create havoc by altering or disrupting cells’ normal genetic programming and have been associated with diseases such as certain forms of cancer.

Now UCSF scientists have revealed that, far from being a freeloader or parasite, the most common transposon, called LINE1, which accounts for fully 24 percent of the human genome, is actually necessary for embryos to develop past the two-cell stage. More.

“evolutionary garbage that is just too much trouble for the genome to clean up”?  Yes, because Darwinism has predicted that.

Hat tip: PaV. He sent us this while travelling, adding,

LINE1, which makes up 24% of the genome, is NOT “junk,” but an essential part of embryonic development.

The Darwinists are now just completely wrong. IDists predicted this; the Darwinists pooh-poohed it. Well, they have five tons of egg on their face right now.

NOTA BENE: regarding the “transposons,” it’s quite interesting that it is involved with embryonic development since they are finding that “pseudogenes” are involved in brain (embryonic) development.

IOW, what’s “essential” is what the Darwinists called “junk” (and IDists called fundamental), and what was considered “essential” is only secondarily so.

Alas, no, PaV. Darwinists will simply say that Darwinism predicts this too. It’s all part of the non-falsification package. All that is lacking is a believing public.

From Richard Harris at NPR:

The noted biologist Barbara McClintock, who died in 1992, discovered these odd bits of DNA decades ago in corn, and dubbed them “jumping genes.” (She won a Nobel prize for that finding in 1983.) McClintock’s discovery stimulated generations of scientists to seek to understand this bizarre phenomenon.

Some biologists have considered these weird bits of DNA parasites, since they essentially hop around our chromosomes and infect them, sometimes disrupting genes and leaving illness in their wake. But Miguel Ramalho-Santos, a biologist at the University of California, San Francisco, doesn’t like that narrative.

“It seemed like a waste of this real estate in our genome — and in our cells — to have these elements and not have them there for any particular purpose,” Ramalho-Santos says. “So we just asked a very simple question: Could they be doing something that’s actually beneficial?” More.

“Could they be doing something that’s actually beneficial?” To understand why no one wondered before, one must understand the power of Darwinian groupthink, enforced by wrecking careers. In short, ID guy Jonathan Wells was right and Richard Dawkins was wrong. So was Jerry Coyne. And Michael Shermer.

See also: One junk DNA defender just isn’t doing politeness anymore. In a less Darwinian science workplace, that could become more of a problem for him than for his colleagues.

See also: Junk DNA can actually change genitalia.

Junk DNA: Darwinism evolves swiftly in real time

At Quanta: Cells need almost all of their genes, even the “junk DNA”

“Junk” RNA helps regulate metabolism

Junk DNA defender just isn’t doing politeness any more.

Anyone remember ENCODE? Not much junk DNA? Still not much. (Paper is open access.)

Yes, Darwin’s followers did use junk DNA as an argument for their position.

Another response to Darwin’s followers’ attack on the “not-much-junk-DNA” ENCODE findings

164 Replies to “Breaking: A “junk DNA” jumping gene is critical for embryo cell development”

  1.
    PaV says:

    The following is from Sandwalk. Larry Moran wrote this just this past February:

    So far, so good. I disagree with their description of the rest of the genome. They imply that most of it is selfish DNA composed of transposons like Alu’s and LINE-1 sequences. I wish they had put more emphasis on the fact that much of our genome consists of defective transposons and viruses that are junk, plain and simple. They aren’t selfish DNA today, although they once were in the past.

    As I mentioned to Denyse: lots of egg on their face. And it will ONLY get worse.

  2.
    gpuccio says:

    This is very interesting!

    So, transposons are not only the most likely design tool in shaping novelties in genome evolution, as I have always argued: they are also important regulators of development. 🙂

    This is the very interesting conclusion of the paper:

    The interaction of LINE1 RNA with binding partners, such as Nucleolin, is expected to be mediated by RNA secondary structure, which is less constrained by primary sequence than protein-coding regions. Thus, rather than being a vulnerability, the regulation of early development by TEs may allow both robustness, due to the repeated nature of TEs, and adaptability, due to their rapid evolution and their potential to support transposition in conditions of stress. In this regard, it is interesting that the percentage of the genome occupied by LINE1 elements seems to have sharply increased with development of therian mammals (e.g., Ivancevic et al. [2017]). The exploration of the function of LINE1 in other species should shed light on the role of TEs in shaping the evolution of development.

    Transposons are definitely the future.

    Thanks to PaV for finding this and to Denyse for reporting it. 🙂

  3.
    gpuccio says:

    PaV:

    As a further answer to Larry Moran, always from the paper you found:

    “Our results indicate that chromatin-associated LINE1 RNA regulates gene expression and developmental potency without requiring retrotransposition activity.”

    IOWs, it’s true, like Moran says, that many of our transposons are “defective”, in the sense of having lost retrotransposition activity, but that seems not to prevent them from being functional, in a very refined way!

    Indeed, losing retrotransposition activity could be an important functional development. Again from the paper:

    “This role of LINE1 as a chromatin-associated RNA therefore avoids the potential detrimental effects of LINE1 retrotransposition that have been reported in several disease states, including cancer”

  4.
    bill cole says:

    gpuccio

    A so-called “jumping gene” that researchers long considered either genetic junk or a pernicious parasite is actually a critical regulator of the first stages of embryonic development, according to a new study in mice led by UC San Francisco scientists and published June 21, 2018 in Cell.

    I once asked Larry if the low level of transcription he was measuring was during embryo development. He was not making measurements at this stage, and I think this is probably where most of the activity is.

    BTW: Have you seen the post Joe dedicated to you at TSZ? I started doing BLASTs of proteins I had previously studied and found one that is 99%+ identical between humans and a finch species. 200 million years of almost perfect preservation. 780 AA long 🙂

  5.
    Amblyrhynchus says:

    LINE1, which makes up 24% of the genome, is NOT “junk,” but an essential part of embryonic development.

    There are ~800,000 copies of L1 in the human genome. To make L1 RNA you need… one. Not sure you can put the whole of that 24% into the non-junk category just yet.

  6.
    PaV says:

    Ambly:

    The study’s authors have talked about the importance of there being an abundance of this transcript: the abundance of copies helps assure having a faithful copy, but also provides numerous opportunities for mutation, if needed.

    It’s interesting that when IDers point out that ‘copied’ information isn’t “new” information, this is not accepted by Darwinists who claim that this is, indeed, “new” information. But now that there are thousands of copies which are deemed useful to an organism, we’re told that this is nothing more than just a ‘copy’ of ONE element.

    Very interesting. In the case of 800,000 copies of LINE1, the actual numbers of the element are seen as being needed. IOW, copies here are important. However, in the case of ‘information,’ obviously a ‘copy’ is not an advance in what normal people consider information to be.

    An example: at a high school, there might be 100 copies of a biology book; but the teacher, at home, only needs ONE copy. Very different circumstances.

    The more principled view of these matters rests with the ID community.

  7.
    Amblyrhynchus says:

    Ribosomal RNA can make up ~80% of a cell’s RNA, but we manage to get by with a couple of hundred copies of the DNA that encodes it. I don’t think we need 800,000 copies of L1.

    LINEs are elements that can copy themselves. When the 798,981st copy fixed in the human genome, do you think it did so because 798,980 just wasn’t enough, and one more was going to make a difference? Or was it maybe because one more copy makes no difference, so nothing prevented one more from accumulating in the genome?

  8.
    gpuccio says:

    Amblyrhynchus (and PaV):

    I think you are making a very basic error.

    The different copies of a transposable element are not identical. Not at all. They can of course be recognized as members of a specific family of retrotransposons, in this case L1, but they are different. See for example here:

    High Levels of Sequence Diversity in the 5′ UTRs of Human-Specific L1 Elements

    https://www.hindawi.com/journals/ijg/2012/129416/

    Approximately 80 long interspersed element (LINE-1 or L1) copies are able to retrotranspose actively in the human genome, and these are termed retrotransposition-competent L1s. The 5′ untranslated region (UTR) of the human-specific L1 contains an internal promoter and several transcription factor binding sites. To better understand the effect of the L1 5′ UTR on the evolution of human-specific L1s, we examined this population of elements, focusing on the sequence diversity and accumulated substitutions within their 5′ UTRs. Using network analysis, we estimated the age of each L1 component (the 5′ UTR, ORF1, ORF2, and 3′ UTR). Through the comparison of the L1 components based on their estimated ages, we found that the 5′ UTR of human-specific L1s accumulates mutations at a faster rate than the other components. To further investigate the L1 5′ UTR, we examined the substitution frequency per nucleotide position among them. The results showed that the L1 5′ UTRs shared relatively conserved transcription factor binding sites, despite their high sequence diversity. Thus, we suggest that the high level of sequence diversity in the 5′ UTRs could be one of the factors controlling the number of retrotransposition-competent L1s in the human genome during the evolutionary battle between L1s and their host genomes.

    Not only are they different in sequence, they are of course also different in their specific positions in the genome, due to more or less ancient retrotransposition activity during evolution.

    Moreover, the functional role described in the paper referenced in the OP is implemented by LINE 1 RNA, and is a role linked to RNA structure. The functional structure of that RNA can vary a lot according to the way different basic “modules” (in this case, L1 elements), which are however significantly different one from the other, are transcribed.

    IOWs, TEs are not at all identical and repetitive modules, where all the information is already implemented in one copy.

    The opposite is true: TEs are highly different modules that can build highly different complex combinatorial structures, for example at the RNA level. So, the relevant information is not in the original prototype of the module, but rather in the variation in the individual modules and in their different combinations.

    Therefore, your argument is completely out of order.

  9.
    gpuccio says:

    bill cole:

    Yes, highly specific regulators are often confined to specific cells or states. Only housekeeping genes are ubiquitously expressed.

    Yes, I have seen the OP by Joe Felsenstein at TSZ, and I have answered it in some detail here:

    https://uncommondescent.com/intelligent-design/defending-intelligent-design-theory-why-targets-are-real-targets-propabilities-real-probabilities-and-the-texas-sharp-shooter-fallacy-does-not-apply-at-all/

    Comments #394, 395, 397, 400, 403, 406, 407, 408.

    At that point, he had not yet answered my new comments. Frankly, I have not followed the thread after that, and now I am a little discouraged at the prospect of going through 939 comments!

    Are you aware of some more recent comment by Joe Felsenstein about my comments on his OP? I would be especially interested in any comment from him about my “thief” thought experiment, which was specifically addressed to him. He has not addressed that specific issue in his OP about my thoughts, and yet I believe that the thief experiment is extremely pertinent and clarifying to the discussion about functional complexity.

    So, I was really disappointed that he did not address it in his OP.

  10.
    ET says:

    bill and gpuccio- It is very telling that neither Joe Felsenstein nor any other TSZ contributor has provided any evidence that blind and mindless processes can produce any genes that code for proteins. They can only hoot and whine about 500 bits of functional information.

    It is really sad to see that such pathetic people still exist.

  11.
    PaV says:

    gpuccio:

    Thank you for the information you’ve given us. What you’ve posted makes for a better understanding of these multiple LINE elements and the possible role they play.

    The thought that strikes me is that perhaps these LINE elements play some kind of role in preserving the fidelity of protein transcripts prior to subsequent translation–IOW, the transcribed mRNA from one such genomic unit containing a LINE is first ‘compared’ to the LINE element itself prior to translation in the ribosome. It would make sense that this would occur early on in embryonic activity where ‘fidelity’ in translation would likely be very critical for overall development.

    I would hope you could comment on this thought.

  12.
    Mung says:

    gpuccio:

    …and now I am a little discouraged at the prospect of going through 939 comments!

    It should be easy to see which ones are by Joe though. As the author of the OP his posts in the thread have a special coloring to them.

    The main complaint seems to be that you can’t rule out “evolutiondidit” and expect evolutionists to show that it can. They expect you to show that evolution can’t. Then they accuse you of asking THEM to prove a negative.

  13.
    gpuccio says:

    Mung:

    I will see what I can do. The problem is that the comments are spread over many pages, so you cannot just do a search in the whole thread, or at least I suppose so.

    “The main complaint seems to be that you can’t rule out “evolutiondidit” and expect evolutionists to show that it can. They expect you to show that evolution can’t. Then they accuse you of asking THEM to prove a negative.”

    That I have answered many times, and I don’t want to repeat always the same things. That would be more repetitive than a transposon! 🙂

  14.
    gpuccio says:

    PaV:

    Always a pleasure to hear from you! 🙂

    I think that we still understand very little about the functions of TEs.

    My personal idea is that they have two completely different kinds of functions:

    a) By their guided retrotransposition activity in genomes they can slowly implement genomic changes in the course of evolutionary history. IOWs, they can be very important design tools.

    The scientific literature is rather abundant on the role of TEs in promoting new genes, new regulations and so on in the course of natural history. Luckily, TEs almost always leave specific signatures that can be recognized, for example when they build new genes from non coding DNA, or when they silence genes creating pseudogenes, sometimes functional pseudogenes.

    Of course, neo-darwinists consider that evidence as random emergence of function, as they always do. But I think we know better! 🙂

    b) Then there are the possible regulatory roles of TEs as they exist in a specific species, IOWs their functions which are independent of retrotransposition: those functions that, according to Larry Moran, should not exist, and that are instead the subject of the paper you discovered.

    I think we really know very little about those functions. The quoted paper is probably one of the most detailed on that issue, and it strongly suggests that many functions are implemented by TE RNA, IOWs by the transcriptional outcome of genomic TEs.

    This is a fascinating field, but I believe that we have to wait for more data to get some general idea.

    TEs and their sequence and spatial organization can potentially influence nuclear regulation in many ways:

    1) They make up a big part of the genome, so their spatial disposition can certainly have a major role in determining the complex functional regulation of chromatin states.

    2) Transcribed RNA including TEs, as suggested by the quoted paper, can have a lot of complex regulatory interactions with all kinds of nuclear processes.

    We know that non coding RNAs are abundant, important and complex. And we don’t understand well their many functions. We know a few things, for example that those functions are probably more related to their structure than to their strict sequence, which could explain why they are in general less conserved than proteins, even if functional. And we know specific examples of specific regulations.

    TE containing RNAs are really a new aspect, which has been widely ignored in the past years, and is now beginning to gain attention. There are many reasons for that scarce attention to the problem in scientific literature: some of them are technical (those RNAs are difficult to study) and others are ideological (brilliant neo-darwinists have already decided that they are junk).

    I would suggest the following, very interesting paper:

    Stable C0T-1 Repeat RNA Is Abundant and Is Associated with Euchromatic Interphase Chromosomes

    https://www.cell.com/cell/fulltext/S0092-8674(14)00135-4

    Recent studies recognize a vast diversity of noncoding RNAs with largely unknown functions, but few have examined interspersed repeat sequences, which constitute almost half our genome. RNA hybridization in situ using C0T-1 (highly repeated) DNA probes detects surprisingly abundant euchromatin-associated RNA comprised predominantly of repeat sequences (C0T-1 RNA), including LINE-1. C0T-1-hybridizing RNA strictly localizes to the interphase chromosome territory in cis and remains stably associated with the chromosome territory following prolonged transcriptional inhibition. The C0T-1 RNA territory resists mechanical disruption and fractionates with the nonchromatin scaffold but can be experimentally released. Loss of repeat-rich, stable nuclear RNAs from euchromatin corresponds to aberrant chromatin distribution and condensation. C0T-1 RNA has several properties similar to XIST chromosomal RNA but is excluded from chromatin condensed by XIST. These findings impact two “black boxes” of genome science: the poorly understood diversity of noncoding RNA and the unexplained abundance of repetitive elements.

    From what I have read, I would think that one of the main ways that TE RNAs can have important regulatory roles is by interacting, in different and complex ways, with dynamic chromatin architecture, both directly and through interaction with other players.

    They could certainly be involved in fidelity in translation, as you suggest, but as far as I know I cannot at present offer any specific support for that kind of role. Do you have any data about that?

  15.
    gpuccio says:

    bill cole and Mung:

    This seems to be the most recent example of Joe Felsenstein’s arguments:

    No. The shoe is on the other foot. You and gpuccio stated a general rule that 500 bits of FI is impossible by ordinary evolutionary processes. Asked why, you provide no reason that this is generally true, but demand that we disprove it instead.

    Yawn …

    If he is bored, I am bored even more. Frankly, I expected something more from him.

    First of all, I have stated a general rule that 500 bits of FI is impossible by any non-design process, which is quite a different rule.

    The support for that is completely empirical: no example is known of complex functional information arising without design.

    The idea is also supported by a strong rationale: complex functions are not the sum of simpler steps, but derive from the general organization of simpler elements of information. IOWs, there are no gradual steps to real complexity.

    The idea is also supported by what we know about the design process. IOWs, why are designers capable of doing what other realities can never achieve?

    The answer is simple: because conscious designers understand things, and they desire specific outcomes, and they can organize matter to satisfy those desires by using their understanding (IOWs, the subjective cognitive experiences of meaning and purpose are the only reality that can build complex functional information).

    My example of the thief, which can be found here:

    The Ubiquitin System: Functional Complexity and Semiosis joined together.

    https://uncommondescent.com/intelligent-design/the-ubiquitin-system-functional-complexity-and-semiosis-joined-together/#comment-656365

    #823, #831, #859, #882, #919

    Again, I paste the pertinent part from comment #919:

    The thief mental experiment can be found as a first draft at my comment #823, quoted again at #831, and then repeated at #847 (to Allan Keith) in a more articulated form.

    In essence, we compare two systems. One is made of one single object (a big safe), the other of 150 smaller safes.

    The sum in the big safe is the same as the sums in the 150 smaller safes put together. That ensures that both systems, if solved, increase the fitness of the thief in the same measure.

    Let’s say that our functional objects, in each system, are:

    a) a single piece of card with the 150 figures of the key to the big safe

    b) 150 pieces of card, each containing the one figure key to one of the small safes (correctly labeled, so that the thief can use them directly).

    Now, if the thief owns the functional objects, he can easily get the sum, both in the big safe and in the small safes.

    But our model is that the keys are not known to the thief, so we want to compute the probability of getting to them in the two different scenarios by a random search.

    So, in the first scenario, the thief tries the 10^150 possible solutions, until he finds the right one.

    In the second scenario, he tries the ten possible solutions for the first safe, opens it, then passes to the second, and so on.

    A more detailed analysis of the time needed in each scenario can be found in my comment #847.
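
    The gap between the two scenarios can also be sketched numerically. This is an illustrative calculation only (not from the original comments), and it assumes the thief guesses keys uniformly at random without repeating any guess:

```python
# Illustrative sketch of the "thief" thought experiment.
# Assumption (not stated in the original): keys are guessed uniformly at
# random, with no guess repeated.

def expected_trials(keyspace_size):
    """Expected number of guesses to find one key among
    keyspace_size equally likely keys, guessing without repetition."""
    return (keyspace_size + 1) / 2

# Scenario a) one big safe with a 150-digit key: 10**150 possible keys.
big_safe = expected_trials(10 ** 150)

# Scenario b) 150 small safes, each with a 1-digit key: 10 keys apiece,
# and each safe can be cracked independently, one at a time.
small_safes = 150 * expected_trials(10)

print(f"big safe:    ~{big_safe:.1e} expected trials")
print(f"small safes: {small_safes:.0f} expected trials")
```

    Under these assumptions the 150 small safes fall in well under a thousand guesses on average, while the single big safe needs on the order of 10^150, which is the point of the thought experiment.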

    So, I would really appreciate if you could answer this simple question:

    Do you think that the two scenarios are equivalent?

    What should the thief do, according to your views?

    This is meant as an explicit answer to your statement mentioned before:

    “That counts up changes anywhere in the genome, as long as they contribute to the fitness, and it counts up whatever successive changes occur.”

    The system with the 150 safes corresponds to the idea of a function that include changes “anywhere in the genome, as long as they contribute to the fitness”.

    The system with one big safe corresponds to my idea of one single object (or IC system of objects) where the function (opening the safe) is not present unless 500 specific bits are present.

    This shows very clearly the difference between complex functional information and simple functional information, and why simple functional information does not lead to complex functional information in known scenarios, as darwinists firmly believe for mysterious reasons which seem to be clear only to themselves.

    It is also a specific criticism of a very fundamental statement made by Joe Felsenstein himself, and I am not aware of any answer from him (if it is not lost in the 900+ comments to his thread).

    So, if the bored Felsenstein could try and answer that, instead of repeating ad nauseam that an explanation unsupported by any evidence and by any rationale must be considered true in empirical science until and unless proved mathematically false, maybe I would be less bored myself! 🙂

  16.
    bill cole says:

    mung gpuccio

    The argument is, however, unable to rule out natural selection as it does not carry out pure random sampling.

    This is a statement Joe made early in the thread. He has admitted that we can rule out pure random sampling.

    The question in my mind is: does natural selection add any value if it takes 500 bits to get to a selectable function in the population of sequences? This then becomes the process that Joe admits fails.

    Maybe the very thin thread his argument is hanging on is that a selectable function is so nebulous that we cannot nail it down.

    My thinking on a counter-argument is that if there are not enough evolutionary resources to find a selectable sequence that has a path to what we are observing with gpuccio’s method, then his argument fails. Empirically this appears to be the case. How do we strengthen this position?

  17.
    bill cole says:

    gpuccio mung

    I think your thief argument is solid. Mung has beaten Joe up pretty badly on the conjecture that population genetics can explain NS adding FI. I think this argument of his will soon hit the wall.

  18.
    OLV says:

    gpuccio (15):

    “the subjective cognitive experiences of meaning and purpose are the only reality that can build complex functional information”

    Easy to falsify: just show one single example of complex functional information that did not require the subjective cognitive experiences of meaning and purpose to produce it.

    That’s all. Easy task.

  19.
    Amblyrhynchus says:

    gpuccio,

    I’m not sure we are talking about the same paper. This speculative stuff about a role emerging from the 800,000 copies of L1 (can you imagine your reaction if an evolutionary biologist said something so vague and wispy…) doesn’t seem to be related to this paper at all. They knocked down L1 RNA, demonstrating the need for some of that transcript in early development. Nothing in the mechanism requires a diversity of transcripts or the elements to be spread around the genome (a suggestion that makes even less sense in embryological development, when transposons are released from the epigenetic regulation that shuts them down in healthy somatic cells).

    No one doubts TEs contribute to host functions (as a source of raw material and regulatory sequences), but the idea that all 800,000 copies of L1 are there to help the host is very strange. It’s even more odd to imagine this result proves that.

  20.
    gpuccio says:

    Amblyrhynchus:

    I think the number 800,000 for LINE 1 is probably not correct; it relates more to the total number of LINE elements, while LINE 1 should be about 500,000. However, that’s certainly not the problem.

    We are definitely talking about the same paper.

    The paper demonstrates, if we accept its conclusions, that nuclear RNA derived from LINE 1 elements has an important role in development, in mouse.

    For clarity, I paste here the highlights of the paper, as set by the authors:

    Highlights:

    – LINE1 RNA is abundant and nuclear in mouse ESCs and preimplantation embryos

    – LINE1 knockdown inhibits ESC self-renewal and induces transition to a 2C state

    – LINE1 RNA recruits Nucleolin/Kap1 to repress Dux and activate rRNA synthesis

    – In embryos, LINE1 inhibition causes persistence of the 2C program and impairs ZGA

    Now, as you can see, the paper, and the related knockdown experiment, concern “LINE 1 RNA” in general, the RNA that is “abundant and nuclear in mouse ESCs and preimplantation embryos”.

    We don’t know details about that RNA, but it is easy to assume that, like most nuclear non coding RNAs, it is probably a pool of different and varied components. It is also perfectly reasonable that this nuclear RNA can in principle be derived from transcription of all, or most, of the genomic LINE 1 DNA.

    The paper gives us no details of the sequences, length and structures of that RNA pool, and only some important but not yet detailed information about possible mechanisms of action.

    You argued that:

    “There are ~800,000 copies of L1 in the human genome. To make L1 RNA you need… one. Not sure you can put the whole of that 24% into the non-junk category just yet.”

    While I can agree that we are “not sure” that we can put the whole of that 24% into the non-junk category (that would certainly require further knowledge), your statement that “to make L1 RNA you need… one” is certainly wrong and misleading.

    As I have argued at #8, LINE 1 DNA, and therefore certainly LINE 1 derived RNA, is certainly varied and complex, not just the repetition of an identical component. I have also argued that the functional information in LINE 1 RNA is very probably linked to the RNA structure of its components, and that such a structure can be critically dependent on the sequence variation, position variation and general genomic organization of LINE 1 “copies”.

    So, it is certainly true that we need to know more, but it is equally true that your argument that LINE 1 DNA is just a redundant copy of one module, and that therefore functional LINE 1 RNA could be derived from just one transposon, is grossly wrong.

    Moreover, I think you are missing the novelty in this paper. Indeed, you say:

    “No one doubts TEs contribute to host functions (as a source of raw material and regulatory seqs)”

    But that’s simply not true. A lot of people doubt that. As said, Larry Moran doubts that a lot. He also seems to think that the lack of retrotransposition activity in most human TEs is proof that they are not functional.

    This paper is probably the first that demonstrates a definite function for LINE 1 RNA. It is important. And the function is completely unrelated to retrotransposition activity.

    Again, RNA structure is paramount for function. If we look at XIST, the best known example of a long, partially repetitive RNA with a definite important function, we can easily see that. I quote from Wikipedia:

    The human Xist RNA gene is located on the long (q) arm of the X chromosome. The Xist RNA gene consists of conserved repeats within its structure and is also largely localized in the nucleus.[6] The Xist RNA gene consists of an A region, which contains 8 repeats separated by U-rich spacers. The A region appears to contain two long stem-loop structures that each include four repeats.[12] An ortholog of the Xist RNA gene in humans has been identified in mice. This ortholog is a 15 kb Xist RNA gene that is also localized in the nucleus. However, the ortholog does not consist of conserved repeats.[13] The gene also consists of an Xist Inactivation Center (XIC), which plays a major role in X inactivation.[14]

    This, just to give an idea of how complex the structure-function relationship can be in long non coding RNAs.

  21. 21
    gpuccio says:

    bill cole at #16:

    The 500 bits rule is relative to the complexity of a function: IOWs, a function which requires 500 bits to exist.

    My big safe in the thief example is a complex function: it requires about 500 bits to work.

    The 150 smaller safes require about 3.3 bits each to work.

    But the point is: solving the 150 smaller problems does not help in any way to solve the big problem, because there is no relationship between the 150 individual functions in the smaller safes and the big complex function in the big safe.

    If the big key works only when we have found the correct 500 bits, then there is no way to help find it by any form of “selection”, either natural or intelligent. We just have to try and search the space.
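    The arithmetic behind the safe analogy can be sketched in a few lines of Python. The 10-position dials are my assumption, chosen so that each small safe carries about 3.3 bits, matching the numbers above:

```python
import math

# One big safe: a 500-bit combination is equivalent to a dial with
# 2**500 settings; exhaustive search needs up to 2**500 trials.
bits_big = 500
trials_big = 2 ** bits_big

# 150 small safes, each with a 10-position dial (~3.3 bits apiece).
n_small = 150
settings_per_safe = 10
bits_per_safe = math.log2(settings_per_safe)   # ~3.32 bits
bits_small_total = n_small * bits_per_safe     # ~498 bits in total

# Opening the small safes one at a time is cheap, because each safe
# gives feedback (it opens) independently of the others:
trials_small = n_small * settings_per_safe     # at most 1,500 trials

print(f"{bits_small_total:.1f} total bits, {trials_small} trials")
print(f"big safe: {bits_big} bits, up to {trials_big:.3e} trials")
```

    The two systems carry roughly the same number of bits, but only the small safes can be searched piecewise; that is exactly the point of the analogy.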

    The same is true for complex proteins (or software, or language, or machines). As I have said many times, the F0F1 component of ATP synthase is a complex machine, requiring hundreds of specific AAs to work. It is a big structure, which can efficiently link ADP and phosphate and generate ATP when its configuration is regularly changed by the rotating component of the molecule.

    You cannot get that kind of machine by changing just a few AA positions: it requires a lot of specific information just to exist and work.

    Joe Felsenstein’s “argument” about NS is completely non-existent: NS is a mechanism that has some role in optimizing (to a very limited extent) existing complex functional structures, as seen in the few documented cases. But it is impossible for any form of selection to produce a really complex structure, one that needs at least 500 specific bits of functional information just to exist and work.

    Indeed, Joe Felsenstein, lacking any rationale and any empirical support for his imaginary theory, is forced to make wrong and misleading “statements”, like the following:

    “That counts up changes anywhere in the genome, as long as they contribute to the fitness, and it counts up whatever successive changes occur.”

    This is simply false. I have countered that wrong statement with my thief experiment. Joe Felsenstein is simply wrong.

    To build a new complex protein with a new complex function we need specific changes in the sequence that will become the protein itself, and those changes must be linked to the future functional sequence: it is not enough that they “contribute to fitness”.

    Opening the smaller safes certainly contributes to fitness, but in no way it helps in finding the big key.

    It’s as simple as that, and Joe Felsenstein is wrong.

  22. 22
    gpuccio says:

    bill cole:

    “I starting doing blasts of proteins I had previously studied and found one that is 99%+ identical between humans and a finch specie. 200 million years of almost perfect preservation. 780 AA long ”

    What is it?

  23. 23
    Amblyrhynchus says:

    your statement that “to make L1 RNA you need…one” is certainly wrong and misleading.

    How so? Your argument seems to be that L1 elements have some diversity, therefore that diversity is required for this function. But, of course, that’s no argument at all.

  24. 24
    bill cole says:

    Gpuccio
    Beta catenin. I have studied this protein for over a year as its deregulation is a primary cause of most cancers.

  25. 25
    bill cole says:

    gpuccio

    But the issue I am raising is whether there is some mathematical proof that all cases where we can have a set of sequences that have functional information greater than 500 bits cannot be reached by natural selection acting on less-functional sequences that are outside the set. Is there a mathematical proof? Something like William Dembski’s Law of Conservation of Complex Specified Information? (Like his, but not the same — his does not do the job).

    If he wants a mathematical proof, I have suggested that all selectable functions become part of the equation, in the numerator. If you divide all possible selectable functions by the sequence space and get greater than 500 bits, you can “mathematically” infer design.

  26. 26
    gpuccio says:

    Amblyrhynchus:

    “How so? Your argument seems to be that L1 elements have some diversity, therefore that diversity is required for this function. But, of course, that’s no argument at all.”

    No. My argument is that the function has been shown for LINE 1 RNA, which of course has a lot of diversity, like the DNA from which it is derived, and probably much more (considering all the diversity that can be added by transcription).

    We don’t know how much of that diversity is required for the shown functions, but certainly the object that has been shown to be functional has a great diversity and complexity, and you cannot equate it to a repetition of one single module.

  27. 27
    gpuccio says:

    bill cole at #24:

    Thank you. I will give it a look.

  28. 28
    Amblyrhynchus says:

    No. My argument is that the function has been shown for LINE 1 RNA, which of course has a lot of diversity, like the DNA from which it is derived, probably much more (considering all the diversity that can be added by transcription)….

    You say “no”, but nothing in what follows contradicts what I’ve said.

  29. 29
    gpuccio says:

    Amblyrhynchus:

    Let’s try to be clear about what we agree or disagree upon.

    My point is that LINE 1 RNA is a very complex object, deriving from LINE 1 DNA, which has a lot of diversity, with the added diversity derived from transcription.

    The result is that it has the potential for a lot of complex functional information.

    The paper demonstrates, for the first time, that some important and complex functions, related to the first events in embryonic life, are connected to LINE 1 RNA.

    OK?

    Now, if your argument is only that we don’t know exactly how much of the potential information in LINE 1 RNA is needed for the described functions, I can agree with you.

    But, of course, “we don’t know exactly” can very well mean that a lot of the information in LINE 1 RNA is needed. Or maybe only part of it. “We don’t know” means just that: we don’t know.

    But we do know that information in LINE 1 RNA is necessary for important functions. That is a big point, given that most biologists (for example Larry Moran) have practically excluded that, up to now.

    Moreover, the paper has just discovered some function linked to LINE 1 RNA, but of course others can well be discovered in the future.

    That said, the simple point is that your argument was very different.

    You have explicitly stated (comment #5):

    “There are ~800,000 copies of L1 in the human genome. To make L1 RNA you need… one.”

    So, you are arguing that all the information in LINE 1 DNA and in LINE 1 RNA is included in one single copy of LINE 1.

    That is completely wrong, and utterly misleading. You are ignoring, indeed denying, the information implicit in the variation of the modules, which are not at all identical copies, and in the spatial disposition of the modules, which is of paramount importance in non coding DNA, and in general in DNA and RNA structure.

    This is where you are wrong, and this is what I have pointed out.

    You repeat that error at comment #7:

    “Ribosomal RNA can make up ~80% of a cell’s RNA, but we manage to get by with a couple of hundred copies of the DNA that. I don’t think we need 800,000 copies of L1.”

    Again, this is highly misleading. Ribosomal RNA is highly conserved, and its role is to code for a specific organelle, the ribosome. The presence of many copies, here, is only necessary to provide the correct amount of transcription for the necessary ribosomes. This is from Wikipedia, from the “Ribosomal DNA” page:

    The rDNA transcription tracts have low rate of polymorphism among species, which allows interspecific comparison to elucidate phylogenetic relationship using only a few specimens. Coding regions of rDNA are highly conserved among species

    The situation is completely different for TEs, where the accumulation of copies generates diversity, both in sequence and in organization of the modules.

    You seem to ignore that important fact, indeed to deny it.

    Indeed, you say:

    “LINEs are elements that can copy themselves. When the 798,981st copy fixed in the human genome do you think it did so because 798,980 just wasn’t enough, and one more was going to make a difference. Or was it maybe because one more copy makes no difference so nothing prevented one more copy accumulating in the genome.”

    Again, you are reasoning just in terms of number of copies, assuming that the copies are identical, that their position in the genome has no relevance at all, and that even the number itself does not count.

    Nothing of that is true.

  30. 30
    gpuccio says:

    bill cole:

    Beta catenin already has a very high human conserved information in pre-vertebrates:

    1.508323 baa

    in non vertebrate deuterostomia, corresponding to:

    1178 bits

    However, it undergoes a final significant jump in cartilaginous fish:

    2.020487 baa

    corresponding to

    1578 bits

    with a jump in vertebrates of:

    0.5121639 baa

    400 bits

    However, just to be fair, we must acknowledge that the protein, even if very long, is highly modular, because it includes 12 “armadillo” modules, each of them about 40 AAs long. As I have already observed, repetition of modules in a protein should encourage some caution in evaluating the absolute homology values, because there can be a significant redundancy factor implied.

    It is also true that the 12 modules are rather different one from the other at sequence level. And practically all those differences are highly conserved too, which points to functional specificities of the differences.
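    As a side note, the conversion between the “baa” figures (bits per aligned amino acid) and the bit totals quoted above is consistent with an alignment length of about 781 positions; that length is my inference from the ratios, not something stated in the comment. A quick check:

```python
# Hypothetical check of the baa <-> bits conversion in the comment above.
# The alignment length of 781 is inferred from the ratios (e.g. 1178 / 1.508323);
# it is an assumption, not a figure stated in the comment.
ALIGN_LEN = 781

values_baa = {
    "pre-vertebrates": 1.508323,       # quoted as ~1178 bits
    "cartilaginous fish": 2.020487,    # quoted as ~1578 bits
    "vertebrate jump": 0.5121639,      # quoted as ~400 bits
}

for label, baa in values_baa.items():
    bits = baa * ALIGN_LEN             # bits = baa x aligned length
    print(f"{label}: {baa} baa -> {bits:.0f} bits")
```

    All three totals come out to the values quoted, which supports reading “baa” as a per-position density multiplied by the alignment length.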

  31. 31
    gpuccio says:

    bill cole:

    I checked the ExAC browser for beta catenin.

    The z score for missense variation is:

    z = 4.44

    And the probability of loss of function intolerance is:

    pLI = 1.00

    Both these values demonstrate that the protein is extremely functional, and that polymorphisms in humans are exceedingly rare.

  32. 32
    ET says:

    To Bill and gpuccio- I see the problem over on TSZ:

    If a duplication of this sequence with function A:
    AAGTCTCAT
    can evolve by mutation into this sequence and gain function B:
    GAGGCTCGT
    So that now we have both sequences with function A, and function B:
    AAGTCTCATGAGGCTCGT
    Then it would obviously be ridiculous to claim that no information has been added.

    It is ridiculous to claim such a thing was caused by blind and mindless processes. For one, “waiting for two mutations” squashes such a concept for blind watchmaker evolution, as there are several specific mutations that had to have occurred.

    For a duplicated gene you also need a new binding site. Then it has to be on a portion of DNA that is available to be expressed which is not guaranteed given the spooling of DNA in eukaryotes.

  33. 33
    bill cole says:

    gpuccio

    Thank you for the analysis.

    The z score for missense variation is:

    z = 4.44

    What is the scale for Z score?

  34. 34
    bill cole says:

    ET

    It is ridiculous to claim such a thing was caused by blind and mindless processes. For one “waiting for two mutations” squashes such a concept for blind watchmaker evolution as there are several specific mutations that had to have occurred.

    This whole concept is ridiculous. We’re just waiting for the rest of the world to wake up and smell the coffee 🙂

    He is claiming that WNT ligand variants were created by gene duplication. OK, cool: now for every WNT duplication you need a receptor duplication and co-mutation to match them. I am curious how far these guys are willing to stretch the “just so” stories.

  35. 35
    Mung says:

    It’s clear that they believe in miracles, but when you point that out to them, they ask you to define what you mean by a miracle.

  36. 36
    ET says:

    If it isn’t miracles it is definitely sheer dumb luck. Either way it isn’t science.

  37. 37
    bill cole says:

    ET Mung
    Define science 🙂

  38. 38
    Mung says:

    Science is the study of the miraculous in the natural world.

  39. 39
    bill cole says:

    gpuccio mung ET One for the ages 🙂

    DNA_Jock
    June 25, 2018 at 11:08 pm
    I think there is a subtlety here that could (conceivably) help colewd out of his endless confusion.
    Hazen and Szostak talk about the FI as defined for a specific function. We can imagine the sequence –> activity level surface for any specific function, and we can imagine calculating the activity level –> FI mapping that results, all the way from 0 FI to 4.322 x the length of the amino acid sequence. For every single function imaginable.
    For straightforward functions, like kcat-LDH and kcat-MDH, these surfaces and the resulting activity –> FI mapping are invariant.
    Which is a useful property if, like Hazen and Szostak, one is trying to understand the relationship between activity level and FI.
    The problem is when we start talking about Natural Selection (or Intelligent Selection, for that matter). Now there is a second mapping, from kcat-LDH and kcat-MDH (and the billions of other potentially selectable functions) to the unitary fitness surface for the sequence space. Because it is fitness that gets selected.
    Now, as a massive simplification, one could stipulate that for ATP synthases, fitness is a function of kcat/Km for ATP synthesis, and nothing else, and that it is a non-decreasing function of kcat/Km. Therefore (conveniently) FI for fitness is the same as FI for kcat/Km-ATPsynth. Hey, the better you synthesize ATP, the better off you’ll be…we can get behind that approximation, okay?
    But the mapping from kcat to fitness is not invariant; it is highly context dependent. It depends (Dawkins’s huge error in the Selfish Gene) on the other genes in that individual organism. It depends on copy number. It depends on a pile of context that varies over time and from individual to individual.
    So trying to parse movement over fitness surfaces in terms of FI is problematic.
    Instead, just think about the fitness surface, and how it relates to the kcat surfaces.
    At time zero, we have an enzyme that dehydrogenates lactate. Cool. It’s okay at that. It also dehydrogenates malate. Really badly. No-one really cares. Fitness is 1000 x kcat LDH + 1 x kcat MDH. The population of sequences is exploring this little valley where LDH matters a lot, and MDH hardly at all.
    The gene duplicates! Holy shit, what happens?
    The answer is that the fitness surface, for both copies, suddenly gets MUCH flatter. Mutations that would have been fatal yesterday are now mildly deleterious, such that they may fix. If that happens (the most likely outcome), then the fitness surface for the surviving copy reverts to what it looked like before the duplication. We are back where we started, but with a chunk of currently useless DNA added to the genome. No biggie.
    HOWEVER, sometimes one of the copies wanders across that flattened surface to an area of high MDH activity. That’s awesome, and MDH activity becomes an important new driver of fitness. And the new enzyme becomes better and better at using malate. Loses its LDH activity in the process. The fitness surface of the LDH copy changes back to something close to what it was originally (slightly different: catalyzing MDH has become irrelevant for that copy), and the MDH copy gets more and more specialized.
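    For reference, the Hazen and Szostak definition that DNA_Jock’s “activity level to FI mapping” (and the 4.322 x length ceiling) comes from is FI(Ea) = -log2(fraction of sequences with activity at least Ea). A minimal sketch over a toy sequence space, where the “activity” function is invented purely for illustration:

```python
import math
from itertools import product

# Hazen & Szostak: FI(Ea) = -log2( fraction of sequences with activity >= Ea ).
# Toy example over length-3 peptides; the activity function below is an
# invented stand-in, not a real kcat measurement.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 amino acids
L = 3                               # toy sequence length (20**3 = 8000 sequences)

def activity(seq):
    # Invented toy activity: number of positions matching a fixed "ideal" motif.
    return sum(a == b for a, b in zip(seq, "ACD"))

space = [''.join(p) for p in product(ALPHABET, repeat=L)]

def functional_information(ea):
    frac = sum(activity(s) >= ea for s in space) / len(space)
    return -math.log2(frac)

# The maximum possible FI is log2(20) * L = 4.322 * L bits, reached when a
# single sequence meets the activity threshold:
print(functional_information(3), math.log2(20) * L)
```

    The mapping from activity threshold to FI is fixed for a given activity function, which is DNA_Jock’s point; it is the activity-to-fitness mapping that is context dependent.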

  40. 40
    DATCG says:

    24% …

    “Now UCSF scientists have revealed that, far from being a freeloader or parasite, the most common transposon, called LINE1, which accounts for fully 24 percent of the human genome, is actually necessary for embryos to develop past the two-cell stage.”

    If this holds up, how do Dan Graur’s assumptions of “Junk” DNA being 75% of the genome hold up?

    Darwinists have gone from 98% “JUNK” to 75% “JUNK” to how much ??% “JUNK” in the future?

    What of SINEs? ALUs = 11 percent of human DNA? Important in neural activity of the brain.

    We live in amazing times. Good stuff Pav, Denyse 🙂

    So, occam’s razor – a whole lotta “needless junk” being preserved?
    Or, occam’s razor – a whole lotta “required info” being preserved?

  41. 41
    bill cole says:

    DATCG

    Darwinists have gone from 98% “JUNK” to 75% “JUNK” to how much ??% “JUNK” in the future?

    It is the neutral guys like Larry Moran who support the JUNK argument. JUNK is not part of the Neo-Darwinian model, as selection should clean up the genome.

  42. 42
    DATCG says:

    Bill Cole,

    Darwinism originally included advantageous, deleterious and neutral changes, if memory serves correctly.

    Neo-Darwinists eschewed neutral mutations. Kimura reintroduced neutral changes with Neutral Theory and championed it. I won’t go on to Nearly Neutral Theory.

    I use Darwinist as a general tag. It’s easier than pointing to specific supporters like Moran or Dan Graur (where the 75% comes from).

    I guess I could use the general term Evolutionist, but that then includes ID scientists as well.

    In general, I use Darwinist. Because most worship at his altar. I guess if we desire to get detailed we say Neo-Selectionist today. Just does not have the same ring to it.

  43. 43
    DATCG says:

    Selectionist, Neo-Selectionist and Neutrals, Nearly-Neutrals that is. Or stick with Darwinist.

    http://www.pnas.org/content/104/20/8385

    The most famous sentence from The Origin of Species (1), “This preservation of favourable variations and the rejection of injurious variations I call Natural Selection,” suggests a dichotomy in the fate of “variations” and is generally interpreted accordingly. This sentence was immediately followed, however, by another one that still is only exceptionally quoted: “Variations neither useful nor injurious would not be affected by natural selection.”

    I.e., Darwin distinguished not two, but three kinds of variations: advantageous, deleterious, and neutral. Whereas advantageous variations expand in the progeny (by positive, or Darwinian, selection), the deleterious ones tend to disappear (by negative, or purifying, selection), and the neutral ones may come out of their limbo to be fixed (like the advantageous variations), or to disappear (like the deleterious ones). Incidentally, the concept of neutral variations is absent in Wallace (2).

  44. 44
  45. 45
    OLV says:

    Junk DNA?

    Junk DNA can play vital and unanticipated roles in the control of gene expression, from fine-tuning individual genes to switching off entire chromosomes. These functions have forced scientists to revisit the very meaning of the word “gene” and have engendered a spirited scientific battle over whether or not this genomic “nonsense” is the source of human biological complexity.


  46. 46
  47. 47
  48. 48
  49. 49
    gpuccio says:

    bill cole at #33:

    “What is the scale for Z score?”

    This is from the ExAC FAQ:

    “What are the constraint metrics?

    For synonymous and missense, we created a signed Z score for the deviation of observed counts from the expected number. Positive Z scores indicate increased constraint (intolerance to variation) and therefore that the gene had fewer variants than expected. Negative Z scores are given to genes that had more variants than expected.”

    A z score of 4.44 corresponds to a two tailed p value of 8.995888e-06. It describes the probability of getting the observed deviation from expected values by chance.
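    That p-value can be reproduced directly: the two-tailed p for a standard-normal z score is 2·(1 − Φ(|z|)), which in Python is erfc(|z|/√2). A minimal sketch:

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard-normal z score: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2))

# ExAC missense z score for beta catenin quoted above:
print(two_tailed_p(4.44))   # ~9e-06, matching the figure in the comment
```

    A z of 4.44 is an extreme deviation: the gene has far fewer missense variants than expected under neutrality.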

  50. 50
    gpuccio says:

    bill cole, mung, ET, DATCG, OLV:

    I had already seen that there was some “debate”, especially from Rumracket I believe, at TSZ about the following paper:

    An atomic-resolution view of neofunctionalization in the evolution of apicomplexan lactate dehydrogenases

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4109310/

    Abstract:

    Malate and lactate dehydrogenases (MDH and LDH) are homologous, core metabolic enzymes that share a fold and catalytic mechanism yet possess strict specificity for their substrates. In the Apicomplexa, convergent evolution of an unusual LDH from MDH produced a difference in specificity exceeding 12 orders of magnitude. The mechanisms responsible for this extraordinary functional shift are currently unknown. Using ancestral protein resurrection, we find that specificity evolved in apicomplexan LDHs by classic neofunctionalization characterized by long-range epistasis, a promiscuous intermediate, and few gain-of-function mutations of large effect. In canonical MDHs and LDHs, a single residue in the active-site loop governs substrate specificity: Arg102 in MDHs and Gln102 in LDHs. During the evolution of the apicomplexan LDH, however, specificity switched via an insertion that shifted the position and identity of this ‘specificity residue’ to Trp107f. Residues far from the active site also determine specificity, as shown by the crystal structures of three ancestral proteins bracketing the key duplication event. This work provides an unprecedented atomic-resolution view of evolutionary trajectories creating a nascent enzymatic function.

    I thought that Rumracket’s “arguments” were so wrong that they did not deserve any comments. They just demonstrate that he does not understand anything about functional information. And we already knew that.

    However, as DNA_Jock has apparently commented on that too, and of course with much greater credibility, I feel that I should say a few things.

    I must say that I am rather busy at the moment, so I could not really read the paper with careful attention and detail. So, I apologize if there are any imprecisions in my comment, and I am ready to accept any reasonable corrections about that.

    The simple point is: the transition from MDH to LDH is not a very complex event.

    It requires, apparently, three independent events:

    a) A duplication of the gene

    b) The insertion of 6 not very specific AAs at a specific point

    c) A specific AA mutation

    OK, how complex is that?

    It’s not easy to calculate it exactly, but I would say something between 10 and 30 bits.

    The complexity of one specific AA mutation is about 4 bits. It is more difficult to evaluate the complexity of the gene duplication, and especially of the insertion. That’s why I give a rather wide range.
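    A back-of-envelope version of that estimate: one fully specified AA substitution costs log2(20) ≈ 4.3 bits, and the cost of the six-residue insertion depends on how specific it must be. The tolerated-subset sizes below are illustrative guesses of mine, not data from the paper:

```python
import math

BITS_PER_SPECIFIC_AA = math.log2(20)   # ~4.32 bits for one fully specified residue

# c) one specific substitution costs about 4.3 bits:
substitution_bits = BITS_PER_SPECIFIC_AA

# b) a six-AA insertion described as "not very specific": if each inserted
# position only needs to come from some tolerated subset of k of the 20
# residues, it costs log2(20/k) bits per position. The k values below are
# illustrative guesses, not measurements.
def insertion_bits(length=6, tolerated=10):
    return length * math.log2(20 / tolerated)

low = substitution_bits + insertion_bits(tolerated=10)   # very permissive insertion
high = substitution_bits + insertion_bits(tolerated=2)   # fairly specific insertion

# Plus some unquantified bits for the duplication event (a); the total lands
# roughly in the 10-30 bit range quoted in the comment.
print(f"rough range: {low:.0f} to {high:.0f} bits, before counting the duplication")
```

    However you set the guesses, the total stays two orders of magnitude below the 500-bit threshold.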

    However, we are very very distant from the 500 bit threshold.

    So, that transition is not the best scenario to infer design. It is, reasonably, in the range of what RV, in a population of single celled eukaryotes, could possibly achieve. Let’s say it is in a borderline zone, where the random transition is rather unlikely, but it cannot be excluded.

    So, it is completely absurd that Rumracket or others look at it as an example of emergence of new complex functional information.

    More in next post.

  51. 51
    gpuccio says:

    bill cole, mung, ET, DATCG, OLV:

    Now I will try to explain why in a case like the transition from MDH to LDH we must only consider the complexity of the transition, and not the complexity of the whole molecule. It should be rather obvious, but I will try to explain it with an example from language, so that maybe even Rumracket can understand it (not likely, I know!).

    Let’s start from the first phrase in the abstract of the mentioned paper:

    “Malate and lactate dehydrogenases (MDH and LDH) are homologous, core metabolic enzymes that share a fold and catalytic mechanism yet possess strict specificity for their substrates.”

    Now, let’s go immediately to the language example.

    Let’s say we have a population of 100 different people.

    And we have a phrase in English which conveys correct information about one of them, called Dave. IOWs, it is a phrase that is perfectly functional for Dave.

    Here it is:

    Phrase A:

    “The boy called Dave is from London, and he has brown hair and brown eyes. He is less than 30 years old.”

    The complexity of this sentence is about 505 bits, just enough to be “officially” considered complex.

    OK, now let’s say that we can transform the phrase by one single mutation (a rather likely event), so that it becomes:

    Phrase B:

    “The boy called Dale is from London, and he has brown hair and brown eyes. He is less than 30 years old.”

    If no boy called Dale exists in our population, or if his personal features are completely different, the new phrase will convey false information: it will not be functional.

    But let’s say that a Dale exists in our population, and that by chance he corresponds to the description (which, after all, is not so specific).

    Now, the new phrase is completely functional, but it refers to another individual.

    So, how complex is the transition from A to B?

    It’s really simple: it is about 5 bits complex (considering a 30-symbol alphabet).

    But Rumracket would say: “No, phrase B is complex, it is more than 500 bits complex. You have easily generated 500 bits of functional information!”

    Of course, that is completely false.

    The new functional information generated is simply the variation from Dave to Dale. The rest of the information was already there.
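    The figures in this example can be checked directly, treating the phrase as drawn from a 30-symbol alphabet at log2(30) ≈ 4.9 bits per symbol, as the comment does:

```python
import math

BITS_PER_SYMBOL = math.log2(30)   # ~4.91 bits, for a 30-symbol alphabet

phrase_a = ("The boy called Dave is from London, and he has brown hair "
            "and brown eyes. He is less than 30 years old.")
phrase_b = phrase_a.replace("Dave", "Dale")

total_bits = len(phrase_a) * BITS_PER_SYMBOL           # ~505 bits for the whole phrase
changed = sum(a != b for a, b in zip(phrase_a, phrase_b))
transition_bits = changed * BITS_PER_SYMBOL            # ~5 bits for the A -> B change

print(f"whole phrase: {total_bits:.0f} bits; A->B transition: {transition_bits:.1f} bits")
```

    The whole phrase carries about 505 bits, but only one symbol changes between Dave and Dale, so the transition itself carries about 5 bits.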

    Now, Dave and Dale are of course the two substrates: for example, malate and lactate. The existing phrase is the dehydrogenase structure: the two enzymes, as said, “share a fold and catalytic mechanism”. That corresponds to the information in the phrase about the hair, etc.

    But the two enzymes also “possess strict specificity for their substrates”: in our example, that specificity is the name.

    The name is all that changes. And the change is rather simple. It is 5 bits in the language example, it is certainly more in the enzymes example, but not much more.

    The rest remains the same. The bulk of the functional information was already there, and is simply transmitted to the new molecule.

    It’s as simple as that.

    More in next post.

  52. 52
    gpuccio says:

    bill cole, mung, ET, DATCG, OLV:

    Now, what can I say of the argument by DNA_Jock as reported at #39?

    Well, I can only say that, for once, it seems very reasonable to me.

    DNA_Jock is very simply describing the best way that a rather simple (but not completely trivial) transition could have happened according to a neo-darwinian scenario.

    And I agree with him. The duplication + inactivation + RV scenario is certainly the most likely for a neo-darwinian mechanism. I have never really believed in scenarios based on single polyvalent intermediates, even for simple transitions. A duplicated inactivated gene is much more feasible for unconstrained (neutral) variation.

    But the question is: is that scenario credible?

    What I think is that it is probably a little too unlikely to be the best explanation, even at such a low functional information level as 30 bits. If I really had to bet, I would think it is designed. But just as a personal opinion.

    Because this is certainly not a scenario which is so unlikely that we have to reject the non-design explanation. It could be a working explanation.

    As said, we are in a grey zone here.

    That’s why I never use the emergence of new functional variants inside a protein family as examples of design inference. They are definitely a grey zone.

    Axe has written a paper about that, a very reasonable paper. But not yet a definitive answer. I think we need further understanding to give an answer about the evolution inside protein families.

    But, of course, we have thousands of examples where the change in functional information is in the range of 500 bits or much more: for example, practically all protein superfamilies, or the many examples of transitions at the vertebrate level with hundreds or thousands of human conserved information jumps in a single protein.

  53. 53
    OLV says:

    OLV (46):

    Who Needs This Junk, or Genomic Dark Matter

    The questions related to the mysteries of constitutive heterochromatin are far from being resolved.

    The availability of TRs in the genome is characteristic for eukaryotes. So far we understand that: 1) the main portion of TRs is associated with interphase nucleus, in mice – with chromocenters; 2) TRs located in the euchromatin portion of the genome are likely an underlying morphogenetic program; 3) transcription of TRs is necessary for maintaining heterochromatin status of the HChr portion of the genome, but most important are the bursts of TR transcription accompanying normal embryogenesis and other stages of cardinal changes in the cell cycle including cancerogenesis; 4) the main hurdle in investigation of the role of TRs is lack of their satisfactory classification and annotation; up to the present time, TRs represent the “dark matter of the genome”.

  54. 54
    bill cole says:

    gpuccio mung

    What I think is that it is probably a little too unlikely to be the best explanation, even at such a low functional information level as 30 bits. If I really had to bet, I would think it is designed. But just as a personal opinion.

    Because this is certainly not a scenario which is so unlikely that we have to reject the non-design explanation. It could be a working explanation.

    As said, we are in a grey zone here.

    That’s why I never use the emergence of new functional variants inside a protein family as examples of design inference. They are definitely a grey zone.

    Axe has written a paper about that, a very reasonable paper. But not yet a definitive answer. I think we need further understanding to give an answer about the evolution inside protein families.

    But, of course, we have thousands of examples where the change in functional information is in the range of 500 bits or much more: for example, practically all protein superfamilies, or the many examples of transitions at the vertebrate level with hundreds or thousands of human conserved information jumps in a single protein.

    I completely agree with you. Great analysis.

    The point Jock was really addressing was challenging my argument that you could deal with natural selection mathematically simply by including all selectable functions in the numerator of the FI equation. He was claiming that there was no selectable path prior to gene duplication.

    My point back to him was simply that if gene duplication created a selectable path by reducing a selection constraint then a selectable path existed prior to gene duplication.

    The bottom line is there is no reason we cannot create a theoretical mathematical equation for FI that calculates 500 bits such that the design inference is confirmed as there are not enough resources for the trials required.
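    The equation I have in mind reads as FI = -log2(selectable sequences / size of the sequence space). A toy sketch, where both counts are made-up numbers purely for illustration:

```python
import math

def functional_bits(target_count, space_size):
    """FI = -log2( fraction of the space that is functional/selectable )."""
    return -math.log2(target_count / space_size)

# Toy numbers (illustrative only): a 150-AA protein space and an invented,
# generous guess at how many sequences are selectable for *any* relevant function.
space_size = 20 ** 150     # ~10^195 sequences
selectable = 10 ** 40      # made-up count of selectable sequences

fi = functional_bits(selectable, space_size)
print(f"FI ~ {fi:.0f} bits; design threshold used in this thread: 500 bits")
```

    Even with a very generous numerator, a space this size leaves the FI well above 500 bits; the argument then turns entirely on how large the numerator really is.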

  55. 55
    Mung says:

    Thanks gpuccio.

    Are they misrepresenting your argument when they claim that the mere presence of 500 bits of FI is enough for you to infer design?

    It seems from your recent response that your scenario requires a transition that involves introducing more than 500 bits.

    So it’s not enough that some element merely has 500 bits of FI; that’s not sufficient on its own to lead to a design inference. Am I reading you correctly?

    cheers

  56. 56
    ET says:

    Corneel’s nonsensical bald assertion:

    But in this particular example all the required mutations (a gene duplication, a six-residue insertion and a handful of single-residue substitutions) are well within the power of random mutation and natural selection.

    Perhaps if given an infinite amount of time, an infinite population and very little death. Given the age of the earth and the universe, that scenario is out of luck.

  57. 57
    gpuccio says:

    Mung at #55:

    It’s always a transition.

    Of course, in the case of a new protein which has no previous homologue (for example the appearance of a new protein superfamily), the transition is from some unrelated state to the functional state. In that case, the full functional information of the protein has been acquired in the “transition” that generates it for the first time.

    When, instead, new functional information is added to something that already had functional information, we can evaluate the complexity of the transition from a previous functional state to a new functional state.

    That is the case, for example, with the rather simple transition from MDH to LDH.

    But it is also the case with the complex specific engineering of already existing proteins that takes place at the transition to vertebrates, and that I have examined many times.

    There is essentially no difference between the scenario where a new functional protein arises “from scratch” (for example from some unrelated non coding DNA, or some unrelated duplicated gene), and exhibits, say, 800 bits of functional information that did not exist before, and the scenario where an existing functional protein, which has maybe 300 bits of human conserved functional information in pre-vertebrates, acquires 800 bits of new human conserved functional information in cartilaginous fish.

    In both cases, we have a “transition” of 800 bits. In both cases we can safely infer design.

    I hope that clarifies the issue.

    However, these concepts are not mine at all. They were already clearly enunciated in Durston’s fundamental paper, for example.

  58. 58
    Mung says:

    gpuccio,

    Yes, that clarified it. Thanks!

  59. 59
    Mung says:

    gpuccio, could you clear up another matter for me please.

    Say you have a complex system composed of a number of parts, each of which when taken in isolation can perform some degree of some definable function.

    Say further that you can calculate the FI for each of these sub-systems.

    Do you believe that the proper way to determine the FI of the original system as a whole is to deconstruct it into parts, define a function for each of the parts, calculate the FI for each part, then add them all together?

    Thanks

  60. 60
    gpuccio says:

    Mung:

    In general, if a system is irreducibly complex, the probabilities of all its individual parts multiply, so their functional complexities in bits are added.

    So, if a system can only be functional if A, B and C are already present, and each of the three parts has no independent selectable function, then if A has complexity 300, B complexity 500, and C complexity 400, the functional complexity of the whole system will be 1200 bits.

    If, instead, each or some of the parts have an independent, selectable function, the multiplication rule does not apply, because the parts can come into existence and be selected for their final individual function. In this case, if that is true for all three parts, I would say that the functional complexity of the system is more or less equal to the functional complexity of the most complex part, in this case B. Plus, of course, any additional complexity that can be needed to make the three parts cooperate for the new meta-function.

    On the other hand, if for example A has an independent function, but not B and C, and the system is irreducibly complex because the meta-function needs all three parts, I would say that A does not contribute much to the total functional complexity of the system, and the real complexity will be 900 bits (B and C multiplied).

    A really complex system is often irreducibly complex: all parts are needed, and even if some of the functional information in the parts could already be present for independent reasons, there is often a lot of new functional information that has to be added, to all parts or to some of them, for the new meta-function to emerge.
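The combination rules described above can be sketched numerically. This is a toy illustration using the example figures from this comment (A=300, B=500, C=400 bits), not gpuccio's actual method; the extra "cooperation" complexity he mentions is deliberately omitted:

```python
def system_fi(parts):
    """Toy total functional information (FI) for a system, following the
    rules described above. `parts` is a list of (fi_bits, selectable)
    tuples, where `selectable` means the part has an independent
    selectable function.
    """
    non_selectable = [fi for fi, sel in parts if not sel]
    selectable = [fi for fi, sel in parts if sel]
    if not selectable:
        # Irreducibly complex: all parts must arise together, so their
        # probabilities multiply and their bits add.
        return sum(non_selectable)
    if not non_selectable:
        # Every part has a selectable path: roughly the hardest part
        # dominates (cooperation cost ignored in this sketch).
        return max(selectable)
    # Mixed case: selectable parts contribute little; the bits of the
    # non-selectable parts still add.
    return sum(non_selectable)

# The three worked examples from the comment:
print(system_fi([(300, False), (500, False), (400, False)]))  # 1200
print(system_fi([(300, True), (500, True), (400, True)]))     # 500
print(system_fi([(300, True), (500, False), (400, False)]))   # 900
```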

  61. 61
    ET says:

    Discrete. Combinatorial. Object. DCO via “No Free Lunch”.

    The probability of any given DCO is the probability of the origin of each part x the probability of each part being in the right place at the right time x the probability of the proper configuration of the parts.

    The point is Mung’s approach only addresses the origins part of the equation but that would at least give a base figure.
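ET's multiplication of probabilities is equivalent to adding bit costs in log space, which also avoids numerical underflow for probabilities as small as 2^-1230. A sketch with hypothetical localization and configuration costs:

```python
# ET's formula: P(DCO) = P(origin of each part) x P(right place/time)
# x P(proper configuration). Multiplying probabilities is the same as
# adding their bit costs (-log2 of each factor). All figures below are
# hypothetical, for illustration only.
bits_origin = [300, 500, 400]   # bit cost of each part's origin
bits_localization = 10          # hypothetical "right place, right time" cost
bits_configuration = 20         # hypothetical configuration cost

total_bits = sum(bits_origin) + bits_localization + bits_configuration
print(total_bits)  # 1230 bits, i.e. P(DCO) = 2**-1230
```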

  62. 62
    ET says:

    We talk about the odds of particular accumulations of mutations occurring but we seem to forget that those mutations also have to get by the proof-reading and error-correction processes.

    I know why evolutionists want to forget about that part but for us that just adds to the sheer dumb luck argument that exposes the anti-science nature of evolutionism.

  63. 63
  64. 64
  65. 65
  66. 66
  67. 67
    gpuccio says:

    ET (and OLV):

    “The probability of any given DCO is the probability of the origin of each part x the probability of each part being in the right place at the right time x the probability of the proper configuration of the parts.”

    Of course. That’s why I wrote, at #60:

    “Plus, of course, any additional complexity that can be needed to make the three parts cooperate for the new meta-function.”

    That is indeed a big issue. Specific modifications to existing proteins that allow them to interact functionally with a different context are probably the explanation for much of the “re-engineering” of existing proteins that takes place, for example, at the vertebrate transition. The evolutionary conservation of that re-engineering is evidence of its functional specificity.

    The first paper quoted by OLV at #63 is absolutely pertinent! 🙂

  68. 68
    OLV says:

    gpuccio:

    Specific modifications to existing proteins that allow them to interact functionally with a different context are probably the explanation for much of the “re-engineering” of existing proteins that takes place, for example, at the vertebrate transition. The evolutionary conservation of that re-engineering is evidence of its functional specificity.

    Very interesting explanation indeed. Thanks.

  69. 69
    gpuccio says:

    OLV:

    I have read the paper you linked at #65 (the last one) about lncRNAs. Very interesting, and a very recent update.

    So, the scenario that is becoming increasingly clearer is the following:

    a) Cells do transcribe most of their DNA. While coding RNAs are produced in great quantity and leave the nucleus to be used for translation, non coding RNAs are transcribed in relatively smaller amounts, and remain mostly in the nucleus.

    b) As the ENCODE people had imagined right from the beginning, these non coding RNAs are definitely functional. Their roles are being discovered day by day, and we are just at the beginning.

    c) lncRNAs are true and important regulators.

    d) Many of them are associated with chromatin: chromatin-enriched RNAs (cheRNAs)

    e) They have many roles in regulating transcription

    f) They have many roles in post-transcriptional regulation

    g) They have many roles in chromatin organization

    h) They implement their important roles both interacting directly with chromatin and interacting with many other important players, mostly proteins.

    i) They are derived from all parts of the genome: coding genes, non coding genes, and of course also repetitive elements, both SINEs and LINEs.

    IOWs, ENCODE people were completely right. Indeed, the role of regulatory non coding RNAs, of all types, is becoming more impressive than initially thought. By the day.

    That’s certainly part of the hidden “procedures”, coming slowly into the light. A set of new, amazing levels of functional complexity emerging, against the obstinate resistance of good old dogma-prone neo darwinists.

  70. 70
    OLV says:

    gpuccio:
    That’s an excellent summary assessment of the current and future state of this very important topic.
    Thanks.

  71. 71
    DATCG says:

    Excellent comments everyone! Good discussion.

    OLV, nice link on Humpty Dumpty. Had forgotten about that paper.

    Gpuccio, thanks as always for your summary and time 🙂

  72. 72
    DATCG says:

    We really need a JUNK-to-Function Tracker Site! 😉

    And another paper and post at ENV I enjoyed from earlier this year 🙂

    “There’s so much of the genome that we don’t understand, probably like 99 percent of it,” Dinger said. Seeing DNA folded like this in living cells “makes it possible to decode those parts of the genome and understand what they do.” [Emphasis added.]

    Wait, I thought Darwinists had it all figured out 😉

    Goodbye Central Dogma and Junk DNA

    Old-school geneticists considered this kind of DNA as “junk” or “selfish” DNA that perpetuated itself for no purpose, says Science Daily.

    But lead author Yukiko Yamashita and colleagues “were not quite convinced by the idea that this is just genomic junk.”

    For one thing, it is highly conserved, so “If we don’t actively need it, and if not having it would give us an advantage, then evolution probably would have gotten rid of it. But that hasn’t happened.”

    When they took a closer look, they found that cells in fruit flies, mice, humans and probably all vertebrates cannot survive without it. Using a protein named D1 that binds to the satellite DNA, they found it provides vital attachment points for molecular machines that keep chromosomes in the nucleus.

    Without it, DNA would float off into buds with only part of the genome, and the cell would die.

    The similar findings from both fruit fly and mouse cells lead Yamashita and her colleagues to believe that satellite DNA is essential for cellular survival, not just in model organisms, but across species that embed DNA into the nucleus — including humans.

  73. 73
    DATCG says:

    A quick review From 2012, ENCODE findings…

    Within this treasure trove of data, researchers found that more than 80 percent of the human genome has at least one biochemical activity.

    Although it is currently unknown whether all of this DNA contributes to cellular function, the majority can be transcribed into RNA. Furthermore, nearly 20 percent of the genome is associated with DNase hypersensitivity or transcription factor binding, two common features used to identify regulatory regions.

    Both of these measurements are a much higher percentage than the previous estimates that 5-10 percent of the genome was functional.

    Significantly, more than 4 million regions that appeared to be regulatory regions, or “switches,” were identified.

    These switches are important because they can be used in different combinations to control which genes are turned on and off, as well as when, where and how much they are expressed.

    Effectively, this provides precise instructions for determining the characteristics and functions of different cell types in the body. Changes in these regulatory switches, especially those regulating critical biological processes, can thus influence the development of disease.

    The astounding amount of gene-regulatory activity uncovered in the human genome is striking, as more of the genome encodes regulatory instructions than protein, and prompts an assortment of complex questions on how the genome is involved in health and disease.

    As a foundational information resource for biomedical research, the data put forth by the ENCODE Project is openly accessible and available through the ENCODE portal (http://encodeproject.org).

    In addition to the individual papers, results have also been organized along “threads” that explore specific scientific themes (www.nature.com/encode).

    This new approach of incorporating, organizing and presenting data from relevant sections of different papers, in different journals, helps to facilitate better user navigation through the immense amount of data and analyses generated.

    The ENCODE results are already influencing the way scientists are thinking about both new and existing data.

    For example, Thread #12 in the Nature ENCODE site focuses on the impact of functional information in understanding genetic variation within the human genome.

    Genome-wide association studies (GWAS) have previously been used to comb the genome for regions that are associated with specific human diseases or other traits.

    By comparing DNA sequences from hundreds to thousands of people either with or without a given disease, researchers have been able to identify regions containing variants that are associated with disease.

    Interestingly, more than 90 percent of these variants have been found in non-coding regions.

    However, because genetic variants within a given region may be linked to many other variants within the same region, it has been difficult to determine which variants have a causal contribution to increased disease risk.

    But when researchers compared the locations of non-coding functional elements identified by ENCODE with disease-associated genetic variants previously identified by GWAS, they detected a striking correlation between the two:

    genetic variants associated with diseases or other traits were enriched in regulatory switches within the genome.

    This is exciting because it provides an overarching framework for looking at many different diseases (including Alzheimer’s, diabetes, heart disease, and cancer) — and identifying the numerous genetic variants that cause them — beyond the context of DNA that code for proteins.

    Even outside its extraordinary scientific contributions, the structural model of the ENCODE Project is fundamentally changing the way large-scale scientific projects are being conducted.

    Resources such as the ENCODE analysis virtual machines (www.encodeproject.org/ENCODE/analysis.html) provide access to various stages of analysis, including input data sets, methods of analysis and code bundles. ENCODE software tools, data standards, experimental guidelines and quality metrics are all freely available at the ENCODE portal. This allows other researchers to independently assess and reproduce the data and the analyses — with a focus on scientific access, transparency and reproducibility — or to use similar methods to analyze their own data.

    To date, 170 publications from labs that are outside of ENCODE have used ENCODE data in their work on human disease, basic biology, and methods development (see: http://www.encodeproject.org/ENCODE/pubsOther.html). Through the establishment of a basic reference data set, along with accompanying analytical resources, scientists expect that further breakthroughs will be forthcoming in the upcoming years.

    Poor Dan Graur, whatcha gonna do when ENCODE comes for you?

  74. 74
    DATCG says:

    ENCODE at Nature…

    http://www.nature.com/encode/#/threads

    ENCODE project…

    https://www.encodeproject.org

    Updates, Data Releases and News…

    https://www.encodeproject.org/news/

    Candidate Regulatory Elements – Homo Sapiens…

    CRE Homo Sapien

    Integrative Level Annotations:

    https://www.encodeproject.org/data/annotations/

    “SCREEN: Search Candidate cis-Regulatory Elements by ENCODE…”

    http://screen.encodeproject.org

    Variant Annotation

    “Over the past decade, Genome Wide Association Studies (GWAS) have provided insights into how genetic variations contribute to human diseases. However, over 80% of the variants reported by GWAS are in noncoding regions of the genome and the mechanism of how they contribute to disease onset is unknown. By integrating data from the ENCODE project and other public sources, RegulomeDB and HaploReg are two resources developed by ENCODE labs to aid the research community in annotating GWAS variants. FunSeq is another ENCODE resource for annotating both germline and somatic variants, particularly in the noncoding regions of cancer genomes.”

    and onward… they’ve barely scratched the surface. This is critical for disease and aging. The scientists working on ENCODE have done great work.

  75. 75
    DATCG says:

    Thar’s gold in them thar “JUNK” regions. Just ask ENCODE DCC on twitter… Jobs, Jobs, Jobs…

    https://twitter.com/EncodeDCC/status/996859770521841664

    @EncodeDCC May 16 More…
    ENCODE DCC is looking for associate data wranglers! Please share to graduating students and those interested in databases/bioinformatics! https://app.joinhandshake.com/jobs/1613818/share_preview … via @joinhandshake

    Assistant Biocuration Scientist – Stanford School of Medicine

    Less than 2% of the human genome sequence is protein coding, have you ever wondered what is the role of the remaining 98% of the genome?

    uh, no, not Dan Graur or the Darwinists stuck in Neutral.

    Stanford University Department of Genetics has an excellent opportunity for an entry-level Assistant Biocuration Scientist to play a role in the ENCODE project, advancing our understanding of the human genome and determining how genetics influences the development and progression of diseases. The ENCODE project is funded by National Institute of Health and is building a catalog of all the functional elements in the human genome sequence, subsequently making it available to scientists worldwide for the study of human health and disease.

    Shift out of neutral, hit the pedal and Go ENCODE.

  76. 76
    DATCG says:

    and this project…

    We show that PREDICTD data captures enhancer activity at noncoding human accelerated regions. PREDICTD provides reference imputed data and open-source software for investigating new cell types, and demonstrates the utility of tensor decomposition and cloud computing, both promising technologies for bioinformatics.

    PREDICTD PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition

    hmmm, the link, app is not working or timing out, but here’s the link at Twitter. Click on the link at the ENCODE Twitter message. It still may run a bit slow for some reason at NATURE link…

    https://twitter.com/EncodeDCC/status/984131983067377664

    Understanding how the genome is interpreted by varied cell types, in different developmental and environmental contexts, is the key question in biology. With the advent of high-throughput next-generation sequencing technologies, over the past decade, we have witnessed an explosion in the number of assays to characterize the epigenome and interrogate the chromatin state, genome wide. Assays to measure chromatin accessibility (DNase-seq, ATAC-seq, FAIRE-seq), DNA methylation (RRBS, WGBS), histone modification, and transcription factor binding (ChIP-seq) have been leveraged in large projects, such as the Encyclopedia of DNA Elements (ENCODE)1 and the Roadmap Epigenomics Project2 to characterize patterns of biochemical activity across the genome in many different cell types and developmental stages. These projects have produced thousands of genome-wide datasets, and studies leveraging these datasets have provided insight into multiple aspects of genome regulation, including mapping different classes of genomic elements3,4, inferring gene regulatory networks5, and providing insights into possible disease-causing mutations identified in genome-wide association studies (GWAS)2.

  77. 77
    gpuccio says:

    DATCG:

    Thanks for your excellent contributions, as always! 🙂

    Yes, the subject is exciting and really, really in huge development.

    Here is an extremely recent overview, with special attention to the techniques that can be used in studying lncRNAs function:

    Platforms for Investigating LncRNA Functions

    http://journals.sagepub.com/do.....0318780639

    Abstract
    Prior to the sequencing of the human genome, it was presumed that most of the DNA coded for proteins. However, with the advent of next-generation sequencing, it has now been recognized that most complex eukaryotic genomes are in fact transcribed into noncoding RNAs (ncRNAs), including a family of transcripts referred to as long noncoding RNAs (lncRNAs). LncRNAs have been implicated in many biological processes ranging from housekeeping functions such as transcription to more specialized functions such as dosage compensation or genomic imprinting, among others. Interestingly, lncRNAs are not limited to a defined set of functions but can regulate varied activities such as messenger RNA degradation, translation, and protein kinetics or function as RNA decoys or scaffolds. Although still in its infancy, research into the biology of lncRNAs has demonstrated the importance of lncRNAs in development and disease. However, the specific mechanisms through which these lncRNAs act remain poorly defined. Focused research into a small number of these lncRNAs has provided important clues into the heterogeneous nature of this family of ncRNAs. Due to the complex diversity of lncRNA function, in this review, we provide an update on the platforms available for investigators to aid in the identification of lncRNA function.

    This final consideration is really telling:

    Upon the identification of regulatory ncRNAs, it has now become readily apparent that regulation of proteins has been completely underestimated. We are now only beginning to understand that signal transduction is dependent upon, as it appears to us today, an almost incalculable level of regulation within a signaling cascade if not at the individual protein level. This continuous dynamic regulation is necessary for cells to finetune external cellular signaling cues into appropriate transcriptional responses. The identification that loss-of-function mutations within ncRNAs contribute to the genesis and progression of human disorders further highlights their importance.

    Emphasis mine.

    It’s very interesting that presently the best method to target lncRNAs, and therefore to understand their functions, is through oligonucleotide-based strategies. And that’s exactly what was done in the paper about LINE derived RNAs which started this very interesting discussion.

    By the way, papers in Pubmed corresponding to a “lncRNAs” query were just 111 in 2012. Their number has been increasing at a very fast rate, and they are 1583 in 2017, and 1198 already in 2018. And a lot of them are about important and very interesting medical issues!

  78. 78
    gpuccio says:

    DATCG:

    Here is a very interesting database of human annotated lncRNAs.

    It includes at present 120,353 transcripts from 51,382 genes.

    What is really amazing is the complexity of those transcripts.

    Just as an example, look at the FIRRE gene.

    This is the brief summary from Wikipedia:

    Firre (functional intergenic repeating RNA element) is a long non-coding RNA located on chromosome X. It is retained in the nucleus via interaction with the nuclear matrix factor hnRNPU.[4] It mediates trans-chromosomal interactions[4] and anchors the inactive X chromosome to the nucleolus.[5] It plays a role in pluripotency[6] and adipogenesis.[7]

    But things are much more complex.

    The FIRRE region is located on chromosome X, and includes a sequence of about 140,000 bp.

    From that region, 22 different known lncRNA transcripts are derived, numbered from FIRRE 1 to FIRRE 22.

    FIRRE 1, for example, is 2928 bp long. But the interesting thing is that it is derived from 13 exons, dispersed through the whole 140,000 bp region. So, it seems that the whole region is transcribed, and 12 very big introns are spliced before the FIRRE 1 transcript emerges.

    Of course, the 21 remaining FIRRE transcripts are derived from the same region, but through completely different transcriptional histories. The longest, for example, is FIRRE 12, 8415 bp long, made of 13 exons too. The shortest seems to be FIRRE 6, 259 bp long. Still long enough to be defined as a lncRNA, indeed (the conventional limit is at 200 bp).
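The splicing arithmetic behind such transcripts, where a short mature RNA is assembled from exons dispersed across a huge genomic region, can be sketched with hypothetical coordinates (the real FIRRE exon boundaries live in the annotation databases and are not reproduced here):

```python
# Hypothetical exon coordinates (start, end) within a large genomic region.
exons = [(1000, 1200), (12000, 12350), (139000, 139500)]

def mature_length(exons):
    """Length of the spliced transcript: only the exon lengths sum,
    because the introns between them are removed."""
    return sum(end - start for start, end in exons)

def intron_total(exons):
    """Total intronic sequence removed by splicing."""
    ordered = sorted(exons)
    return sum(nxt[0] - prev[1] for prev, nxt in zip(ordered, ordered[1:]))

print(mature_length(exons))  # 1050 bp mature transcript
print(intron_total(exons))   # 137450 bp of introns spliced out
```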

    But are these weird transcripts functional?

    That seems to be the case.

    This is one of the many papers about FIRRE:

    The lncRNA Firre anchors the inactive X chromosome to the nucleolus by binding CTCF and maintains H3K27me3 methylation

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4391730/

    And this is very recent:

    The NF-κB–Responsive Long Noncoding RNA FIRRE Regulates Posttranscriptional Regulation of Inflammatory Gene Expression through Interacting with hnRNPU

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5672816/

    Ever new complexity…

  79. 79
    OLV says:

    gpuccio (77):

    “…papers in Pubmed corresponding to a “lncRNAs” query were just 111 in 2012. Their number has been increasing at a very fast rate, and they are 1583 in 2017, and 1198 already in 2018.”

    Interesting statistics.

    At what point could such an accelerated growth of research papers start to slow down?

  80. 80
    DATCG says:

    OLV @79,

    “At what point could such an accelerated growth of research papers slow down?”

    Great question! 🙂

    Considering this started around 2012 in full force? And
    approximately 90-95% was considered “JUNK” by Darwinists, with decades of little research?

    They’re still discovering new information on how the 1-2% of protein-coding genes function, coordinate and organize!

    🙂 Haha!

    That’s why I think we need a Large JUNK to FUNC tracking Database site 😉

    This is going to go on for decades, and smart scientists who desire to make a huge impact in Disease and Cancer treatments of all kinds are leaping into Epigenetic Regulatory Function Research on larger scales now.

    It’s bigger than any gold rush and we’re only seeing the beginning. The smart companies and scientists are going full throttle on this.

    Because Epigenetics disease research allows individual treatments as well.

    It’s like we’ve just discovered thars gold and now thars a rush on to buy permits. We’re in the very beginning of this process and it will be amazing for treatment of disease, even I would think aging and other factors.

    Edit: And just to add, they’re still designing and implementing systems to discover, monitor, organize and research these new patterns and integrations of “JUNK”! They’re at the phase of determining best how to mine all this treasure.

  81. 81
    DATCG says:

    Gpuccio,

    Thanks for links! 🙂 Yep, I’ve been following lncRNAs for a while now for my own reasons, along with circRNAs and many other Smart “Junk” 😉 on brain functionality.

    Great stuff! Thank you!

    OLV, Gpuccio,

    More than one “mechanism of action…”

    The paper Gpuccio linked @77 is dated June 26, 2018…

    However, to date, only a small percentage of these lncRNAs has been described in the literature, with an even smaller number being attributed to a specific mechanistic function. Furthermore, like proteins, many lncRNAs can employ more than one mechanism of action.

    So, what are scientists finding now, and what will researchers find more of in the future? Note that this has become a coding issue: seeking out patterns and reverse-engineering them.

    My suspicion is, each pattern and placement will be much like any Data Coding requirements. The only way to track and control such complex systems is to intersperse the systems with IDentifiers, Tags, and other elements and patterns to regulate the building blocks of life. This is very much a Reverse Programming exercise now and has been for quite some time. But with the advent of regulatory functions, ENCODE and disease research, the true wealth of knowing how to reverse engineer Designed Code will be a fundamental aspect of research going forward. You’re no longer asking how something accidentally came together. They’re researching how these many Codes in the system controls, monitors, repairs and reacts to disease, stress, environments, food, etc., etc. Including how the many Codes organize and interact in a flow reaction process.

    Maybe we will see a new set of Rules based understanding of regulatory features, changes and reactions for the multiple Codes in the system.

    Cannot mutate this “JUNK” without severe consequences? Of disease, cancer, interrupted development processes?

    It’s amazing to see all this research unfold, and it supports Design. We live in great times of Code discovery and function!

  82. 82
    PaoloV says:

    DATCG:

    Very interesting information that you posted in your comments above. Thanks.

    Your comment at # 72 refers to a very exciting EN article that mentions the U. Michigan scientist Yukiko Yamashita, who said a few interesting things in this interview last year at the CSHL 82nd Symposium.

    Here’s more info on Yukiko Yamashita’s work at the U. of Michigan, Ann Arbor

  83. 83
    Mung says:

    You guys need to stop. I’ve been buying up “junk-DNA” for next to nothing and you guys are going to do nothing but drive up the costs. And there goes my retirement fund.

  84. 84
    OLV says:

    ‘Junk DNA’ therefore appears to participate in fundamental cellular processes across species, a result that opens up several new lines of research.

    A conserved function for pericentromeric satellite DNA

  85. 85
    OLV says:

    the non-coding nature and lack of conservation in repeat sequence among closely related species led to the idea that they are mostly junk DNA, serving no essential function (Walker, 1971; Doolittle and Sapienza, 1980).

    Instead, we propose that satellite DNA is a critical constituent of eukaryotic chromosomes to ensure encapsulation of all chromosomes in interphase nucleus. Our results may also explain why the sequences of pericentromeric satellite DNA are so divergent among closely related species, a contributing factor that led to their dismissal as junk.

    A conserved function for pericentromeric satellite DNA

  86. 86
    OLV says:

    DATCG (80):

    I like your idea on “Large JUNK to FUNC tracking Database site”. How would that look? Would that be a blog?

  87. 87
    DATCG says:

    Mung @83,

    Oooops, let the cat outta the bag! 😉

    from 2012…

    ““Cat herder in chief” of the ENCODE consortium of 400 geneticists from around the world”

    Thar be “hidden treasure” in that thar bag o’Junk…

    https://www.scientificamerican.com/article/hidden-treasures-in-junk-dna/

  88. 88
    DATCG says:

    #82 PaoloV,

    Thanks! I’ll check out the video when I have time. Yep, the article and paper are good JUNK 😉

  89. 89
    DATCG says:

    #84-85 OLV,

    Great! Yep… as it would be if you’re designing code, storage, retrieval, copying and replication processing!

    We do this routinely with all kinds of massive data today and code. We replicate, but tag, id and modify it on the fly. And the structures contain location data as well as modifier data often a few bytes down or maybe in another “domain” if you like.

    Thus the need for “random” access and Tagging of data. But it’s not really random. It only appears that way. Darwinists are still stuck in the past, using antiquated assumptions.

    It’s an architecture of interspersed elements. While this requires Splicing – this is much more advantageous than storing the needed IDs, Tags, Enhancers, Silencers in a whole other “table.” This bypasses inefficient methods of retrieval(when possible – not always) by storing Regulatory ID Elements(RIDErs) in direct contact with Readers and Editors for Direct Access, On the fly processing.

    So Darwinists think, or assume, this is inefficient. Far from it: it’s highly optimized layers upon layers of Code and Data intertwined with each other. Bill Gates of Microsoft fully understood this, as do many coders who have had to work with complex storage requirements, modification and updates, retrieval and efficiency.
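    The storage analogy above (regulatory tags stored inline with the data they govern, rather than in a separate table) can be sketched in code. This is a toy illustration of the commenter’s computing analogy only, with hypothetical names and sequences; it makes no biological claim.

```python
# Toy sketch of the storage analogy above (hypothetical names, not biology):
# regulatory tags stored inline with the payloads they control, versus a
# separate lookup table that must be consulted on every read.

# Inline layout: each element carries its own tag, so a single linear
# scan yields both the payload and its handling instruction.
inline = [
    ("enhancer", "GATTACA"),
    ("payload",  "ATGCCGT"),
    ("silencer", "TTAGGCA"),
]

for tag, seq in inline:
    print(f"{tag}: {seq}")  # one pass, no extra lookup needed

# External layout: payloads in one list, tags keyed by position in a
# separate table; every read costs an additional dictionary lookup.
payloads = ["GATTACA", "ATGCCGT", "TTAGGCA"]
tags = {0: "enhancer", 1: "payload", 2: "silencer"}

for i, seq in enumerate(payloads):
    print(f"{tags[i]}: {seq}")  # same result, but via indirection
```

    The design point of the sketch is simply that the inline layout needs no second lookup structure, which is the “direct contact” advantage described above.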

    This allows efficient manipulation, while conserving the original structures intact as required for preserving data and extending life by replication. It also allows a very convenient and efficient method of updating these elements with surrounding input data from the environment as it is encountered, or as stress and other issues are recognized by the myriad of systems interacting with each other.

    Once we can decipher these codes, the elements and architecture, the findings will be profound for Design Therapy of individual care. It’s already happening on small scales.

    We already know Epigenetics allows real-time response to the environment, which is not Darwinian but a controlled systems response. Decoding the different elements and functions will allow much better response mechanisms to well-known diseases, including variants of individual epigenomes.

    It appears Mung is going to be rich one day 😉

  90. 90
    DATCG says:

    OLV @86,

    Much of the work is being done, in some ways, by current Epigenetic researchers (see some links I posted above).

    But I guess I’m hinting at, or hoping, that Discovery Institute 🙂 might consider accessing those databases, tracking and building a Junk to Func listing as it continues to grow.

    Maybe it can be done in generalized terms too, with overall percentages and mega data.

    For example – ALUs, LINEs, etc., in groupings.

    Add up the percentages and compare them to those specified by Darwinists who claimed it was all JUNK, or who now reduce it to “at least” 75%, in the case of Dan Graur.

    I think it would be an interesting project for DI to do.

    My thoughts are that once you find that a few “JUNK” regions or groupings like ALUs have function, most likely much if not all of them may have function.

    But Darwinists assume the direct opposite, or, as they say, “just because you find function in a few ALUs does not translate to ALL of them having function.”

    That’s an assumption on their part. But so is mine, if Designed. Which assumption is correct, time will tell. And thus the idea to track it over time and see whether the functional fraction of former “JUNK” DNA regions surpasses the ceiling implied by Graur’s 75%-junk threshold.

    If it does, that wall falls for him and others staking out a very strong claim. Though I suspect they might retreat and claim another threshold. Thus again, another good reason to track the Junk to Func progress 🙂
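    A minimal sketch of what such a tracker could compute. The family names and percentages below are placeholders for illustration only, not real annotation data: sum the genome share of repeat families with reported function and compare the running total against the 25% functional ceiling implied by Graur’s “at least 75% junk” claim.

```python
# Hypothetical "Junk to Func" tally (placeholder percentages, not data).
families = {
    "LINE1": 17.0,   # placeholder % of genome for this family
    "Alu":   10.0,   # placeholder
    "SVA":    0.2,   # placeholder
}
reported_functional = {"LINE1"}  # families the tracker marks as functional

# Genome share of families reported functional so far.
functional_pct = sum(pct for fam, pct in families.items()
                     if fam in reported_functional)

GRAUR_CEILING = 25.0  # max functional % if "at least 75% junk" holds

print(f"functional so far: {functional_pct:.1f}%")
print("Graur bound exceeded" if functional_pct > GRAUR_CEILING
      else "still under Graur's ceiling")
```

    As more families were marked functional over time, the running total would show when (or whether) the 25% ceiling is crossed.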

    Edit: I do want to be careful in my statements. I’m not stating that all of the formerly declared regions of “JUNK” DNA will have function. Surely there may be fault-tolerance built in for duplicates, and discarded remnants that are eventually released after non-use. But there’s a full system of monitoring and degradation built in to remove and discard unwanted data in the genome.

    I don’t know how much is functional, but I suspect it’s higher than Dan Graur’s threshold of 75% junk for eukaryotes, especially humans.

    And another important caveat: I consider introns (some already found functional) another argument for Design, in their use by the Spliceosome.

    Darwinists wrote these off long ago. But for a designed system of storage and retrieval, from a coder’s point of view, it makes all the sense in the world. I recognize these units as smaller elements and tools of the broader Regulatory System. They may seem like “junk”, or repeating patterns easily tossed aside. But coders often use repeating elements and data patterns for a variety of targeted data retrieval, modifications and identifiers in exchange for pre-programmed responses. The introns are being read. Just because they’re discarded does not mean they’re non-functional. They include formal elements of utilization.

    This is like a built-in table of functional cues that reduces the need for hard-coding.
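    The “table of cues instead of hard-coding” idea is a standard programming pattern (data-driven dispatch). A minimal sketch, with hypothetical cue and response names chosen purely for illustration:

```python
# Data-driven dispatch: responses looked up from a table of cues, so new
# cue -> response pairs can be added without touching the control logic.
# All names here are hypothetical illustrations.
cue_table = {
    "heat_stress": "alt_splice_variant_A",
    "cold_stress": "alt_splice_variant_B",
}

def respond(cue: str) -> str:
    # Unknown cues fall back to a default rather than failing.
    return cue_table.get(cue, "default_transcript")

print(respond("heat_stress"))  # alt_splice_variant_A
print(respond("uv_stress"))    # default_transcript
```

    Adding a new behavior means adding a row to the table, not rewriting branching logic, which is the sense in which a cue table “reduces the need for hard-coding.”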

  91. 91
    gpuccio says:

    DATC:

    Again, excellent points at #89 and #90! 🙂

    A few personal thoughts:

    a) Conservation. In general, non coding RNAs are not very conserved. However, the degree of conservation can vary a lot between different forms. Satellite DNA, for example, is more conserved. lncRNAs are often scarcely conserved. Some are conserved in mammals, others are not. Some that are conserved in mammals are not conserved in other vertebrates. And so on. This amazing variety of conservation can mean many different things, as discussed in the following points.

    b) Sequence-structure-function relationship. We still understand very little about this, but it seems rather clear that the relationship between sequence, structure and function is very different in functional RNAs than it is for proteins. Structure seems to be fundamental for function, but it probably is less strictly related to sequence. That can allow for much greater flexibility of the sequence in functional RNAs, as it happens for proteins with low functional sequence specificity.

    c) Possible restricted specificity of function. If what we have been discussing here is even partially true, it seems very likely that lncRNAs and other forms of non coding RNAs are extremely important tools of transcription regulation. That would make them central to the procedures that define the specific nature of species, or of cell types. The procedures, or at least part of them. That means, of course, that much of their “divergence” from species to species could well be functional divergence: the kind of difference that is there to do different things.

    d) Junk of junk? I absolutely agree with what you say about introns. But then what should we say of the “introns” of “junk” DNA?

    As I have discussed at #78, giving the example of FIRRE, many lncRNAs are polyexonic. For example, FIRRE 1 is made from 13 different exons.

    Moreover, as it happens for proteins, introns are much bigger than exons in lncRNAs too. In the case of FIRRE 1, the 13 exons correspond to “only” 2928 bp from a region of about 140,000 bp. The rest, of course, goes into the 12 introns.

    But, because of alternative splicing, at least 22 different transcripts are derived from the same region, with different exon-intron diversification.
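    For concreteness, the FIRRE arithmetic quoted above (13 exons totalling 2,928 bp out of a ~140,000 bp region) works out as follows:

```python
# Quick arithmetic on the FIRRE figures quoted above.
exon_bp = 2928        # total bp across the 13 exons
region_bp = 140_000   # approximate size of the whole region
intron_bp = region_bp - exon_bp

exon_fraction = exon_bp / region_bp
print(f"exonic fraction: {exon_fraction:.1%}")  # ~2.1% of the region
print(f"intronic bp: {intron_bp}")              # ~137,000 bp in 12 introns
```

    So roughly 98% of the region goes into the 12 introns, which is the point at issue in asking whether they are “the junk of junk” or a further regulatory layer.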

    So, what about these big introns in non coding genes? Are they the junk of junk?

    Or, as you will probably agree, are they even more sophisticated levels of functional regulation?

    e) Huge functional perspectives. As we have seen, lncRNAs are transcribed (and spliced) from vast regions of non coding DNA, often intergenic, repetitive or not, but also from coding sequences, often antisense, introns, or some mix of that all.

    As each lncRNA is, in the end, a specific selection of partial sequences (exons), reconstructed in a specific way by alternative forms of splicing, and as the function of those lncRNAs is probably connected to the final structure, more than to the strict sequence, it seems rather obvious that there are almost infinite functional potentialities. IOWs, a designer can “write” all sorts of functional RNAs, all sorts of regulatory tools, from the genome and its huge non coding territories, if he can control the way those partial sequences are transcribed, and above all spliced. IOWs, the whole genome, and above all its vast non coding spaces, are a potential repository for huge, almost infinite, regulatory chances, in the hands of an intelligent designer who can tweak their transcription and splicing.

    f) Finally, we must not forget that non coding DNA, and especially its transposable component, probably has an important role not only in regulating the individual, but in shaping the evolution of species, serving as a powerful design tool in the creation of new genes, new proteins, new regulatory networks and so on, as suggested by an abundant amount of literature. The transposonic origin of many new protein genes, for example, has been proved in many cases, and transposons are now one of the best scenarios for the interpretation of evolutionary history. And, in the perspective of us IDists, of its designed origin. 🙂

  92. 92
    DATCG says:

    Gpuccio @91, thanks for the review…

    a) Yep, and as regulatory elements this does not seem too surprising, to me at least. Which leads us to efficiency concepts that might be reviewed later, the more we understand developmental aspects and species-specific mechanisms, and how each interfaces or reacts for groupings and individual organisms within their environments.

    b) Agreed, but are you tying this in to flexible folding, or what is usually tagged Intrinsically Disordered Proteins?

    c) Yep, yep and yep 🙂 It’s really cool, and it’s really all about the programming: flexible programming made available by highly Modifiable Data Elements and what I’d call Environmental Pre-programmed Response Programs.

    d) the Junk of Junk … more treasure?!?

    “As I have discussed at #78, giving the example of FIRRE, many lncRNAs are polyexonic. For example, FIRRE 1 is made from 13 different exons”

    And I absolutely loved this, as I’m guessing you know!? It reminds me of our discussions of the Spliceosome and intronic functionality, dual functions, etc. Yes, I agree, there’s more treasure to find, I think, in functional outcomes for these introns. Essentially what we have is modular coding: the ability to absorb incoming messages like stress-related events or environmental cues, in which certain aspect ratios or triggers may drive alternative splicing outcomes.

    Again, this appears to be a Direct Access Method of Alternative Data Retrieval Dependency on Input or Program Recognition.

    It’s just too cool!

    e) hehe! yep! 🙂 OK, so what we’re discovering are incredible factors of a Direct Access Processing Tree driven by surrounding physical elements, internal elements and feature elements that can be updated, changed and removed on the fly. And that change from individual to individual down the line, or across groupings.

    Now, there are Critical Cores for survival, and then there are Features, the Uniqueness of different species, outcomes and “beauty” or “beast”, let’s say. It’s quite staggering and awe-inspiring to learn.

    Splicing and Introns/Exons are an actual must here, and a huge advantage for multiple outcomes. Darwinists have, I think, made judgments in error that this amount of “JUNK” is bad design, prior to even knowing how it is utilized, called and functions.

    The process of life must do all of this physically while maintaining incredible accuracy, energy input, error-checking and removal of errors efficiently, rapidly and within narrow time frames or die off.

    It must also have pre-built Recognition factors, or none of this works. Location and Addressing.

  93. 93
    DATCG says:

    Gpuccio @91 continued…

    f) Finally, we must not forget that non coding DNA, and especially its transposable component, have probably an important role not only in regulating the individual, but in shaping the evolution of species,

    I agree up to this point above. I’m in an observation mode for the rest of your conclusion…

    “serving as powerful design tools in the creation of new genes, new proteins, new regulatory networks and so on, as suggested by an abundant amount of literature.”

    Much literature suggests this. What I’m unsure of is with regard to macro events.

    The transposonic origin of many new protein genes, for example, has been proved in many cases, and transposons are now one of the best scenarios for the interpretation of evolutionary history. And, in the perspective of us IDists, of its designed origin. 😉

    Can you please share a few research examples? I know there are many, but which papers do you recommend for eukaryote macro-evolutionary transitions? I’ll read and review as I have time. I’ve read some general papers on evolution, transposons and species.

    I’ve read material on retrotransposons, fascinating stuff, and on their importance for the human genome. Again, they used to be thought of as “JUNK”.

    Thanks always for your contributions here and taking time for explanations. Really enjoy your input and detailed responses.

  94. 94
    DATCG says:

    Forgot to add this list of databases for lncRNA from Wiki…

    https://en.wikipedia.org/wiki/List_of_long_non-coding_RNA_databases

  95. 95
    DATCG says:

    As a follow up on Introns, came across this article tonight and find it fascinating. Not had time to read it all. From 2014…

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3989599/

    Requirement for highly efficient pre-mRNA splicing during Drosophila early embryonic development
    Leonardo Gastón Guilgur,1,2,3 Pedro Prudêncio,1,2,3 Daniel Sobral,1 Denisa Liszekova,1 André Rosa,1 and Rui Gonçalo Martinho1,2,3,*
    Elisa Izaurralde, Reviewing editor
    Elisa Izaurralde, Max Planck Institute Development Biology, Germany;

    What they discovered is intron-less gene usage during early embryonic development, since splicing takes too long.

    Consistently, approximately 70% of early zygotic genes are small in size and intronless (De Renzis et al., 2007). As only 20% of Drosophila genes are intronless, it has been proposed that small intronless genes have an important selective advantage for transcription during the syncytial blastoderm formation (De Renzis et al., 2007).
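    Taking the De Renzis et al. figures quoted above at face value (~70% of early zygotic genes intronless versus ~20% of Drosophila genes overall), the over-representation works out to roughly 3.5-fold:

```python
# Rough enrichment arithmetic on the quoted De Renzis et al. figures.
early_zygotic_intronless = 0.70  # fraction of early zygotic genes, intronless
genome_wide_intronless = 0.20    # fraction of all Drosophila genes, intronless

enrichment = early_zygotic_intronless / genome_wide_intronless
print(f"intronless genes are ~{enrichment:.1f}x over-represented "
      "among early zygotic genes")
```

    That enrichment is what motivates the paper’s suggestion of a selective advantage for small intronless genes during syncytial blastoderm formation.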

    What I’m thinking is that an inner core is protected (not much splicing, if any), and an outer core is later elevated and expressed (with splicing), along with the historic input of genes from the mother and father donors. This does not consider the many epigenetic factors during pregnancy that can improve upon, or be deleterious to, the offspring.

    This is way over generalized on my part.

    And I’ve not verified whether this carries across to the human genome. Well, I found a database with a comparison…

    http://www.sinex.cl/statistics

    I find this fascinating, and wonder if this limits major macro variation to some extent, but allows Edge-of-the-Genome variation.

  96. 96
    gpuccio says:

    DATCG:

    Here are a few recent papers:

    Transposons: a blessing curse.

    https://www.sciencedirect.com/science/article/pii/S1369526617301577?via%3Dihub

    The genomes of most plant species are dominated by transposable elements (TEs). Once considered as ‘junk DNA’, TEs are now known to have a major role in driving genome evolution. Over the last decade, it has become apparent that some stress conditions and other environmental stimuli can drive bursts of activity of certain TE families and consequently new TE insertions. These can give rise to altered gene expression patterns and phenotypes, with new TE insertions sometimes causing flanking genes to become transcriptionally responsive to the same stress conditions that activated the TE in the first place. Such connections between TE-mediated increases in diversity and an accelerated rate of genome evolution provide powerful mechanisms for plants to adapt more rapidly to new environmental conditions. This review will focus on environmentally induced transposition, the mechanisms by which it alters gene expression, and the consequences for plant genome evolution and breeding.

    Transposable elements shape the human proteome landscape via formation of cis-acting upstream open reading frames.

    https://onlinelibrary.wiley.com/doi/abs/10.1111/gtc.12567

    Abstract
    Transposons are major drivers of mammalian genome evolution. To obtain new insights into the contribution of transposons to the regulation of protein translation, we here examined how transposons affected the genesis and function of upstream open reading frames (uORFs), which serve as cis-acting elements to regulate translation from annotated ORFs (anORFs) located downstream of the uORFs in eukaryotic mRNAs. Among 39,786 human uORFs, 3,992 had ATG trinucleotides of a transposon origin, termed “transposon-derived upstream ATGs” or TuATGs. Luciferase reporter assays suggested that many TuATGs modulate translation from anORFs. Comparisons with transposon consensus sequences revealed that most TuATGs were generated by nucleotide substitutions in non-ATG trinucleotides of integrated transposons. Among these non-ATG trinucleotides, GTG and ACG were converted into TuATGs more frequently, indicating a CpG methylation-mediated process of TuATG formation. Interestingly, it is likely that this process accelerated human-specific upstream ATG formation within transposon sequences in 5′ untranslated regions after divergence between human and nonhuman primates. Methylation-mediated TuATG formation seems to be ongoing in the modern human population and could alter the expression of disease-related proteins. This study shows that transposons have potentially been shaping the human proteome landscape via cis-acting uORF creation.

    Genetic exchange in eukaryotes through horizontal transfer: connected by the mobilome.

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5791352/

    Abstract

    Background
    All living species contain genetic information that was once shared by their common ancestor. DNA is being inherited through generations by vertical transmission (VT) from parents to offspring and from ancestor to descendant species. This process was considered the sole pathway by which biological entities exchange inheritable information. However, Horizontal Transfer (HT), the exchange of genetic information by other means than parents to offspring, was discovered in prokaryotes along with strong evidence showing that it is a very important process by which prokaryotes acquire new genes.

    Main body
    For some time now, it has been a scientific consensus that HT events were rare and non-relevant for evolution of eukaryotic species, but there is growing evidence supporting that HT is an important and frequent phenomenon in eukaryotes as well.

    Conclusion
    Here, we will discuss the latest findings regarding HT among eukaryotes, mainly HT of transposons (HTT), establishing HTT once and for all as an important phenomenon that should be taken into consideration to fully understand eukaryotes genome evolution. In addition, we will discuss the latest development methods to detect such events in a broader scale and highlight the new approaches which should be pursued by researchers to fill the knowledge gaps regarding HTT among eukaryotes.

    This one is especially interesting:

    Protein coding genes as hosts for noncoding RNA expression.

    https://www.sciencedirect.com/science/article/pii/S1084952117300496?via%3Dihub

    Abstract

    With the emergence of high-throughput sequence characterization methods and the subsequent improvements in gene annotations, it is becoming increasingly clear that a large proportion of eukaryotic protein-coding genes (as many as 50% in human) serve as host genes for non-coding RNA genes. Amongst the most extensively characterized embedded non-coding RNA genes, small nucleolar RNAs and microRNAs represent abundant families. Encoded individually or clustered, in sense or antisense orientation with respect to their host and independently expressed or dependent on host expression, the genomic characteristics of embedded genes determine their biogenesis and the extent of their relationship with their host gene. Not only can host genes and the embedded genes they harbour be co-regulated and mutually modulate each other, many are functionally coupled playing a role in the same cellular pathways. And while host-non-coding RNA relationships can be highly conserved, mechanisms have been identified, and in particular an association with transposable elements, allowing the appearance of copies of non-coding genes nested in host genes, or the migration of embedded genes from one host gene to another. The study of embedded non-coding genes and their relationship with their host genes increases the complexity of cellular networks and provides important new regulatory links that are essential to properly understand cell function.

  97. 97
    DATCG says:

    Gpuccio,

    Thanks!

    So is it your opinion TEs and many of the regulatory elements evolved over time with respect to surrounding environments, organisms as well?

    Part of my issue is the usual chicken/egg scenario. What evolved first or did they progress all in conjunction?

    Did regulatory elements evolve in transition with other surrounding regulatory elements as environment grew and varied around them? And how does this all relate to the Cambrian Explosion of information?

    My other question is, why is it limited to a One-Time event?
    Surely if we had an event in the past of information explosion, then why do we see stasis now? Would we not find some evidence of macro-generative changes taking place around our globe? But what we find everywhere is equilibrium, balance, and the tips of branches extended as far as they can go. Just some thoughts.

    The organisms, bacteria, plants, eukaryotes, etc., seem very difficult to explain as symbiosis w/o pre-existing architecture. Bio-Codes built-in that can accept the input data from its surroundings. Not sure how you see this.

    Even then the complexity or ability to change and achieve a progression of increasing functional enhancements is quite a marvel of technology. I’m having a hard time wrapping my head around Designed macro-evolution because it seems like there are huge issues unless the functional relationships are all Prescribed. This is why I tend to think we have multiple bases of life, or bushes of life if you will, not a single Tree of Life.

    I realize these last few statements may sound weird, but in order, let’s say, for an organism to respond to surrounding organisms, does it not need original regulatory features to react, digest and respond?

    I think it’s one thing for insertions of horizontal gene transfer, yet quite another for the host to recognize and develop the regulatory actions to utilize the new informational units, is it not? Especially if there’s some form of transition allocated to the human genome for example, or mammals in general.

    At least, this is where I scratch my head on some of the issues and resulting claims transferred to Eukaryotes as a possible solution.

    I tend to think there’s so much more we still do not understand in the Codes of Life.

    Thanks again, will enjoy reading these! 🙂

  98. 98
    gpuccio says:

    DATCG:

    “So is it your opinion TEs and many of the regulatory elements evolved over time with respect to surrounding environments, organisms as well?”

    Yes. By design, of course.

    “Part of my issue is the usual chicken/egg scenario. What evolved first or did they progress all in conjunction?”

    Probably more or less all in conjunction. That’s how design usually works.

    “Did regulatory elements evolve in transition with other surrounding regulatory elements as environment grew and varied around them? And how does this all relate to the Cambrian Explosion of information?”

    I don’t think environment was the only factor, probably not even the most important.

    I think what we observe is the gradual development of a designed plan, of which environment is a part.

    The Cambrian explosion is a critical step where the body plans of the different phyla of metazoa are generated. By design, of course.

    My idea is that transposons are an important tool in that design plan and design process. Because we observe a transposon signature in many of the functional novelties that characterize species.

    “My other question is, why is it limited to a One-Time event?
    Surely if we had an event in the past of information explosion, then why do we see stasis now? Would we not find some evidence of macro-generative changes taking place around our globe? But what we find everywhere is equilibrium, balance, and the tips of branches extended as far as they can go. Just some thoughts.”

    But, of course, stasis is prevalent throughout natural history. Gould understood that very well.

    We have the (probable) very long stasis between prokaryotes and eukaryotes (even if we don’t really know when and how eukaryotes appeared).

    We have the quite certain stasis between single celled eukaryotes and metazoa (either the Ediacaran or the Cambrian explosion).

    Vertebrates emerge rather quickly from deuterostomia about 100 million years after the Cambrian explosion.

    And so on.

    It seems that design is rather sudden in natural history, often confined to specific boosts.

    I have no idea if design is implemented gradually, or in rather sudden events. Many facts are in favor of the second option, but again only facts must guide us.

    If design happens rather suddenly, it could happen in one million years, or one thousand, or one year. Maybe one day, or one second. Everything is possible, and again only facts must guide our understanding.

    For the moment, I like to think that design happens in rather sudden critical events, like the Cambrian explosion, but that it takes some time anyway.

    It is also likely that minor events of design happen more gradually and more constantly.

    And, of course, it is possible that minor minor events of adaptation, either random or according to pre-designed algorithms, contribute to the whole process.

    In all cases, our window of direct observation (hundreds of years) is certainly too small to give us any reasonable probability to witness major design events. We are probably only observing stasis.

    Luckily, however, we can well observe things indirectly.

    “The organisms, bacteria, plants, eukaryotes, etc., seem very difficult to explain as symbiosis w/o pre-existing architecture. Bio-Codes built-in that can accept the input data from its surroundings. Not sure how you see this.”

    I don’t think symbiosis has a major role in all that. Maybe only occasionally (mitochondria and plastids are the best examples).

    “Even then the complexity or ability to change and achieve a progression of increasing functional enhancements is quite a marvel of technology.”

    It is.

    “I’m having a hard time wrapping my head around Designed macro-evolution because it seems like there are huge issues unless the functional relationships are all Prescribed.”

    Why shouldn’t they be?

    “This is why I tend to think we have multiple bases of life, or bushes of life if you will, not a single Tree of Life.”

    It’s possible, but not really necessary.

    Let’s take the Cambrian explosion, for example.

    The phyla are certainly different and complex designs. Each of them individually programmed.

    But we have good reasons to believe, from facts, that all of them use a shared bulk of functionality which had already been designed in single celled eukaryota.

    IOWs, both chordata and sponges, just to be clear, are built using similar eukaryotic cells. Even if the plan is completely different.

    So, we have two possibilities:

    a) An original common ancestor of metazoa originated from single celled eukaryotes, and was then quickly differentiated (by design) into the different phyla.

    Or:

    b) Each phylum emerged by individual design from single celled eukaryotes.

    OK, both scenarios are possible, and probably at present I would favor b), because I am not sure we have convincing evidence of a common ancestor of metazoa.

    But my point is: in both cases we have a tree, and not only “bushes”. Even if there was not a common ancestor of metazoa, single celled eukaryotes are certainly a common ancestor.

    There is a lot we don’t understand. For example, I am really fascinated by the Ediacaran explosion, and I would really like to know what those beings were, and any detail about their biological information.

    Who knows, maybe in the future?

    “I realize these last few statements may sound weird, but in order, let’s say, for an organism to respond to surrounding organisms, does it not need original regulatory features to react, digest and respond?”

    Of course. They are certainly part of the design. Even admitting a small role for adaptation, as I have said before.

    “I think it’s one thing for insertions of horizontal gene transfer, yet quite another for the host to recognize and develop the regulatory actions to utilize the new informational units, is it not? Especially if there’s some form of transition allocated to the human genome for example, or mammals in general.”

    Of course. When I say that transposons (or HGT) are tools of design, I mean exactly that: they are used by the designer in a greater context.

    “I tend to think there’s so much more we still do not understand in the Codes of Life.”

    And I absolutely agree! 🙂

  99. 99
    DATCG says:

    Gpuccio,

    As usual, your response is detailed 🙂 Thank you for your time and thoughts.

    I have been, and am for now, in an “I don’t know” or “I need more evidence” position. I’m not ruling anything out, but remain unconvinced so far by what I’ve read in the past.
    I’ve been in a holding pattern for quite a while as I continue to parse through different papers and information.

    I will read the papers you recommended for sure!

    I agree with many of your comments on stasis, Gould’s insights, etc. I saw him as being honest with the historical record and the data of starts and stops, explosions of information, even if he was not convinced of the mechanisms.

    Certainly we agree on programmed design, function, data and sharing. Design can make it all possible. I don’t doubt that it could be done, just that I’m uncertain it was done that way: single common ancestor.

    I can’t see it yet in my mind equating to macro-events over time from a single Tree of Life.

    I know that mine is not a popular position admitting “I’m uncertain” of macro-evolution from a single TOL.

    I’m not concerned about time, except in terms of unguided events. I see no way unguided, blind mutations have the time to produce the diversity we see. Which I think you and I agree on?

    Surely Design can “boost” as you say the genomes, the speed and unfolding processing as it likes as fast or at whatever pace fits the surrounding environments. And of course I agree on allowing Facts guide us.

    Two explosions 😉 Ediacaran and Cambrian. Darwinists are in a twist; whatever will they do?

    But because macro events of the past are unobserved and are inferred, I wait, read and digest information as discovered, discussed and debated by all sides.

    The ancestry can be woven in many different ways depending upon the researchers and their methods (see Gunter Bechly below). So, I enjoy reading and learning, but do wonder how we can ever untangle it all and be certain of so many assumptions about the past, even with the insights of genetics in the 21st century, as the experts argue, change and move different pieces of the puzzle around, and often disagree. A hundred million years gone, whooosh, or a fossil moved backward, forward and sideways in time as each expert weighs in. It can all seem a lot like throwing darts blindly at times.

    Also, an aside, or throwing another wrench into the subject 😉

    I tend to agree with Dr. Sanford at least on a deleterious genome, epigenome, DNA perspective that we’re running down over time with more mutations, not upward bound. As a general overall big picture.

    I try to keep abreast of Third Way researchers like Shapiro, Koonin, Noble, etc. Koonin especially, as he has bucked the trends at times in the past, and I found him refreshing in some of his conversations on Darwin’s TOL, when others were not watching too closely. He was quite excited in several discussions with other Darwinists on the subject.

    Here’s a paper, rather dated from 2010, but I enjoyed reading it at the time.

    https://www.nature.com/scitable/topicpage/the-two-empires-and-three-domains-of-14432998

    And now that Gunter Bechly was added to the Discovery Team, enjoying reading his articles from time to time…

    https://evolutionnews.org/2018/06/rafting-stormy-waters-when-biogeography-contradicts-common-ancestry/

    So, I remain watchful and will continue to learn! 🙂

  100. 100
    DATCG says:

    And forgot this one, a paper from 2017 with insights and ideas on “reshaping Darwin’s Tree of Life.” A bit of hyperbole, but an interesting way to look at it, I think, even from an overall Design perspective. It is downloadable in PDF form at the link below…

    https://www.researchgate.net/publication/317495996_Reshaping_Darwin's_Tree_Impact_of_the_Symbiome

  101. 101
    DATCG says:

    Oh and speaking of Koonin, I guess a bit off subject, but found this quote at his page on Third Way…

    Gone, gone… Hasta la Vista, Baby. Modern Synthesis is gone…

    “The summary of the state of affairs on the 150th anniversary of the Origin is somewhat shocking: in the post-genomic era, all major tenets of the Modern Synthesis are, if not outright overturned, replaced by a new and incomparably more complex vision of the key aspects of evolution. So, not to mince words, the Modern Synthesis is gone. “

    (The Origin at 150: is a new evolutionary synthesis in sight? Trends Genet. Nov 2009; 25(11): 473–475)

    Poooof!!! 😉

    As I continue to see statements like this, it gives me pause: cause to step back, watch, and observe as more data and information come in. If such major constructs have vanished, then what of many other major assumptions?

    Anyways, stunning admissions! And welcome by Koonin.

  102. 102
    gpuccio says:

    DATCG:

    “I’m not concerned about time, except in terms of unguided events. I see no way unguided, blind mutations have the time to produce the diversity we see, which I think you and I agree on?”

    Absolutely! 🙂

    “I know that mine is not a popular position admitting “I’m uncertain” of macro-evolution from a single TOL.”

    I am uncertain of a lot of things. Not of design.

    I have no idea if there is one TOL, or many. But at present I cannot accept that there are no TOL at all. Too many facts are against that idea.

    For example, I cannot accept, with the facts we know, that metazoa were “created” from scratch, and not using existing single celled eukaryotes. And so on.

    I do believe that prokaryotes have been used to design mitochondria and plastids. The rest? I don’t know. But it is true that a lot of information in eukaryotes seems to derive from prokaryotes too.

    I don’t know why so many people here in the ID field seem to have a need to deny any form of descent. Maybe it’s only a religious position. However, I cannot agree. Descent is there, and it can be observed. Of course, it does not explain functional novelties. But it explains conserved functionalities, and the random degradations that can be observed in them.

    “I tend to agree with Dr. Sanford at least on a deleterious genome, epigenome, DNA perspective that we’re running down over time with more mutations, not upward bound. As a general overall big picture.”

    That’s true. If it were not for design, functions would simply go down, or at most be conserved by some protection procedures. There is no doubt that unguided variation can only degrade function.

    But that is indeed the strongest argument for descent. We see gradual, time-dependent degradation in functional (or non-functional) structures that are passed on. Neutral variation is everywhere to be seen.

    I absolutely agree that there are a lot of difficulties with TOL, and Bechly is perfectly right to outline some of them. But that there are difficulties and contradictions does not mean that the idea is completely wrong. That happens all the time in science.

    The only thing that I am completely sure of is the fundamental role of design in biology: the evidence for that is so overwhelming that it is really astounding for me that a lot of intelligent people can deny it.

  103. 103
    OLV says:

    gpuccio and DATCG have practically taken over this discussion, at a pace that makes it hard to keep up with the information avalanche posted in their interesting comments.
    For example, the referenced papers seem very attractive to anyone who seriously tries to understand what’s going on in biology research these days.
    Thanks!

  104. 104
    OLV says:

    DATCG (101):

    you referenced a very interesting paper by E. Koonin that is almost a decade old.

    The 200th anniversary of Charles Darwin and the 150th jubilee of the “On the Origin of Species” could prompt a new look at evolutionary biology. The 1959 Origin centennial was marked by the consolidation of the modern synthesis. The edifice of the modern synthesis has crumbled, apparently, beyond repair. The hallmark of the Darwinian discourse of 2009 is the plurality of evolutionary processes and patterns. Nevertheless, glimpses of a new synthesis might be discernible in emerging universals of evolution.

    The biological universe seen through the lens of genomics is a far cry from the orderly, rather simple picture envisioned by Darwin and the creators of the Modern Synthesis.

    The Origin at 150: is a new evolutionary synthesis in sight?

    Koonin 2009 (full text) PDF

  105. 105
    DATCG says:

    OLV @103,

    I’m but a simple coder/debugger trying to follow the deciphering of the ultimate code(s) of “Life, the Universe and Everything” 😉 and Gpuccio to me is a great teacher, along with many others here at UD, past and present. I’m simply learning as I go through this “avalanche,” as you term it, of information overflow 😉

    Gpuccio is always patient, kind even to those on the Darwinist side who are often abusive to him, and has always been gracious to me as I stumble along the way. I’ve learned a lot from him.

    @104, Koonin: I always enjoy his work, because he at least openly admitted some time ago the deficiencies of neo-Darwinism and the problems at the base of Darwin’s TOL. I wish I could find a paper he once wrote (or maybe he was a reviewer) that at the time recognized the significance of the paper’s findings for Darwin’s TOL. He was a bit berated for saying so by the actual paper’s authors. I think Doolittle or Baptiste was one of the authors, with an online review and discussion. I have not been able to find it.

  106. 106
    DATCG says:

    Gpuccio @109,

    Thank you for your patience and outlining different thoughts to mine.

    “I am uncertain of a lot of things. Not of design.”

    We agree 🙂 Absolutely! I embrace uncertainty at times, maybe too much. But after being too certain in the past, I am attempting to learn a bit of patience.

    “I have no idea if there is one TOL, or many. But at present I cannot accept that there are no TOL at all. Too many facts are against that idea.”

    I’m not against TOL(s), just uncertain as to how many or how they unfolded over time.

    “For example, I cannot accept, with the facts we know, that metazoa were “created” from scratch, and not using existing single celled eukaryotes. And so on.”

    I’m guessing nothing is from “scratch,” per se, from a Designer’s point of view?

    I can see how seeds are easily spread all over the earth, and plants of many varieties can vary over time. In that area I don’t struggle, nor with others like bacteria and viruses, as much as I do with metazoa.

    “I do believe that prokaryotes have been used to design mitochondria and plastids. The rest? I don’t know. But it is true that a lot of information in eukaryotes seems to derive from prokaryotes too.”

    This is where I am in the “I don’t know” position and mainly I need to read more before I’m convinced.

    “I don’t know why so many people here in the ID field seem to have a need to deny any form of descent. Maybe it’s only a religious position.”

    Maybe, but I was not a believer or religious in the past, and cannot speak for others. I don’t think Creationists have yet put forth strong enough arguments for their case. Though I do follow their ideas and discussions, as I do with Darwinists, IDists, etc.

    “However, I cannot agree. Descent is there, and it can be observed.”

    OK, but I’m thinking of long periods of time, with major evolutionary changes by many mechanisms that cannot be observed today or over the modern era, and must be inferred by evolutionists from the fossil record and genetics.

    For example, quadruped to whale, leaving out the initial complex stages from ocean to land as a mammal. While I’ve reviewed different works, papers, research, and opinions, I’m not convinced yet of the sequences of events and/or the complex informational exchanges that took place over time. Time is not an answer; it’s a block (unfortunately, without adequate information) to our full understanding of the events that took place over millions of years, in my opinion. So we can infer, as evolutionists do, that certain major steps, novel functions, and events occur. But I do not see them as fact, not so far. I do think Design can obviously make a case for such guided, complex informational transitions, whereas unguided, blind events do not and cannot.

    Maybe as technology progresses I can see a path forward in unfolding guided events.

    So, I’m not fully against it, but need more info, more evidence. Until then, I remain in the “I don’t know” category.

    “Of course, it does not explain functional novelties. But it explains conserved functionalities, and the random degradations that can be observed in them.”

    Oh, this is really good. Thanks, Gpuccio. Yes, I agree on “does not explain functional novelties”; that is my sticking point. And I do agree about “conserved functionalities” and even “random degradations… observed in them.”

    “That’s true. If it were not for design, functions would simply go down, or at most be conserved by some protection procedures. There is no doubt that unguided variation can only degrade function.”

    Another area of agreement 🙂

    “But that is indeed the strongest argument for descent. We see gradual, time-dependent degradation in functional (or non-functional) structures that are passed on. Neutral variation is everywhere to be seen.”

    And I agree with all you said, with the exception of the first line. Again, I simply don’t know, or maybe do not comprehend(?), in terms of larger macro events over time (see the whale example above).

    “I absolutely agree that there are a lot of difficulties with TOL, and Bechly is perfectly right to outline some of them. But that there are difficulties and contradictions does not mean that the idea is completely wrong. That happens all the time in science.”

    I agree with your statement, in that corrections are made all the time in science, and yes, this does not rule out guided, or prescribed, evolution by Design. I just admire Bechly’s work in holding others accountable, much like Koonin’s, even though Koonin is an avowed Third Way evolutionist.

    “The only thing that I am completely sure of is the fundamental role of design in biology: the evidence for that is so overwhelming that it is really astounding for me that a lot of intelligent people can deny it.”

    We are in complete agreement Gpuccio 🙂 And I’m not ruling out anything else at this point, but simply remaining in “neutral” on macro events 😉

    Thanks! Oh… and a follow-up on introns in the next post!

  107. 107
    DATCG says:

    OK, so this is from Evolution News again,

    Encryption System Found in Genes – Psssst, Introns Involved!
    😉

    https://evolutionnews.org/2018/07/encryption-system-found-in-genes/

    Introns fascinated me long ago. So here’s more news from our friends. A copy of some of the relevant info…

    RNA is composed of four bases (abbreviated A, U, G and C), thereby disseminating its message with a fairly simple code. In recent years, research has shown an unprecedented impact of RNA modifications at all steps of the maturation process. More than a hundred RNA modifications have been identified with roles in both inhibiting and facilitating binding to proteins, DNA and other RNA molecules. This encryption by RNA modification is a way to prevent the message of the RNA in being read by the wrong recipients.

    The research-team has focused on the RNA-modification m6A and shown that RNA can be labeled with this modification while being copied from DNA…. The results demonstrate that an m6A positioned at an exon next to an intron increases the RNA maturation process, while m6A within the introns slows down the maturation of RNA (Figure 2).

    And the reason I’m so excited about this article follows, as I’ve haphazardly tried to explain before in comparisons of storage media, data retrieval and modifications…

    It has long been a mystery why genes code for stretches called introns that are translated but then cut out afterwards. Why are they there? Here’s where the findings get really interesting. Introns appear to help scramble the message, fulfilling the encryption role, but they do something else: they regulate how the exons will be assembled.

    Yes! 🙂 This is precisely what I’ve been excited about and hoping to find. I mentioned similar thoughts in the Spliceosome and maybe the Ubiquitin posts you made, Gpuccio.

    That introns might be regulators of data, though I’m not sure I articulated it well. This is amazing design evidence! As a heuristic, this is what I’ve often said: if it’s designed, then reverse engineering with Design in mind may uncover more intron function.

    We look at data storage a bit differently, given our experiences. We see it as a solid-state function embedded in hard drives, for example, with location information that can be directly accessed to reach more data in different tracks (think: domains). That eventually pulls all the data together into one long string (or associated strings) to be modified, or sent as-is to other functions within a program or sub-programs, which are eventually tied to events, creations, and modifications of outcomes, whether as documents or any kind of output today, from media to 3D printing.

    So, think of introns as pre-designed storage-tracking modules that link together, modulate, and refine exons 🙂
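    This storage analogy could be sketched, purely as a coder’s toy and not as a biological model; every name, flag, and behavior below is invented for the illustration only:

```python
# Toy illustration of the analogy: introns as "metadata records" that
# control how exon "data blocks" are linked into a final string.
# All names are invented for the analogy; this is not a biological model.

def assemble_transcript(segments):
    """Assemble exon payloads, letting each intron 'record' decide
    whether the following exon is included (a crude stand-in for
    alternative splicing)."""
    output = []
    include_next = True
    for kind, payload in segments:
        if kind == "exon":
            if include_next:
                output.append(payload)
            include_next = True  # default: keep the next exon
        elif kind == "intron":
            # the intron's 'metadata' flag regulates the next exon
            include_next = payload.get("keep_next_exon", True)
    return "-".join(output)

gene = [
    ("exon", "E1"),
    ("intron", {"keep_next_exon": True}),
    ("exon", "E2"),
    ("intron", {"keep_next_exon": False}),  # this intron skips E3
    ("exon", "E3"),
    ("exon", "E4"),
]

print(assemble_transcript(gene))  # E1-E2-E4
```

    Here the intron “records” play the role of the hard-drive tracking data: they carry no payload themselves, but decide how the exon “blocks” are linked into the final string.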

    continuing…

    “The paper in Cell Reports explains how it works:”

    Here, we provide a time-resolved high-resolution assessment of m6A on nascent RNA transcripts and unveil its importance for the control of RNA splicing kinetics. We find that early co-transcriptional m6A deposition near splice junctions promotes fast splicing, while m6A modifications in introns are associated with long, slowly processed introns and alternative splicing events. In conclusion, we show that early m6A deposition specifies the fate of transcripts regarding splicing kinetics and alternative splicing.

    Too cool! 🙂 This is beeeeyeutiful… and includes If-then thinking…

    Now, according to the scientists at Aarhus, the specific position of the m6a mark appears highly relevant not only to the type of messenger RNA produced — and thus the protein to be translated — but also to the rate of production. If the m6a mark is placed near a splice junction, the constitutive transcript is produced quickly (i.e., exons are arranged in the order they were transcribed). If the mark is placed on an intron, it slows down the splicing, and might produce a completely different transcript with a different protein resulting. Is this a method to achieve cell-specific regulation?

    Our January 2017 article spoke of the m6a process as a kind of “if-then” algorithm: i.e., if this gene is found in a muscle cell, transcribe it this way; if found in a nerve cell, transcribe it another way, and so on. For this to work, the gene must embed the key in its introns, and the associated m6a marker must know the key to arrange the transcript accordingly. The researchers found that 57 percent of the markers were found on introns, and another 9 percent are on untranslated regions. Only 22 percent were found in coding regions.
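    The “if-then” algorithm the quote describes could be caricatured in code. This is only a toy restatement of the quoted claims; the position labels and return values are my own hypothetical shorthand:

```python
# Toy restatement of the quoted m6A "if-then" logic; not a real model.
# Labels and outputs are invented shorthand for the behaviours the
# article describes.

def splicing_outcome(m6a_position):
    """Map a hypothetical m6A mark position to the splicing behaviour
    described in the quoted article."""
    if m6a_position == "near_splice_junction":
        # constitutive transcript, produced quickly
        return ("fast", "constitutive transcript")
    if m6a_position == "intron":
        # slow splicing, possibly a different transcript and protein
        return ("slow", "possibly alternative transcript")
    return ("default", "constitutive transcript")

print(splicing_outcome("near_splice_junction"))  # ('fast', 'constitutive transcript')
print(splicing_outcome("intron"))                # ('slow', 'possibly alternative transcript')
```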

    Now, Gpuccio, back to our earlier discussions on macro-evolution. I’m more than willing to contemplate that splicing could be a key, along with introns as control factors and key-encryption variation for unlocking unfolding events. That seems feasible. My problem(s) with macro events in Metazoa have always been the mechanisms required for such unfolding events.

    Sorry for “jumping” way off topic 🙂

  108. 108
    DATCG says:

    And here’s the previous article mentioned, from January 2017, which offers some background and which I somehow missed. Awesome! 🙂

    https://evolutionnews.org/2017/01/cornell_researc/

    What we see here is another Signature in the Cell. Intelligent design advocates are not surprised to find codes and switches in irreducibly complex systems. In fact, we expect that this finding will stimulate the discovery of additional codes, such as those that decide which mRNA transcripts should be treated as more important than others.

    Darwinian evolution, by contrast, has a big challenge in explaining how multiple players mutated together by chance to hit upon a language convention. What do unguided, blind processes know about codes? What do they understand about information? In short, nothing.

    Indeed, a blind, unguided series of Lemony Snicket events does not have a clue.

  109. 109
    DATCG says:

    The paper on introns and splicing processing speeds in alternative splicing…

    https://www.cell.com/cell-reports/fulltext/S2211-1247(18)30858-1

    GPuccio, cross-posting at your original Spliceosome post…

    https://uncommondescent.com/intelligent-design/the-spliceosome-a-molecular-machine-that-defies-any-non-design-explanation/

  110. 110
    gpuccio says:

    DATCG:

    Again, excellent points.

    I think we agree on almost everything! 🙂

    Regarding the general problem of derivation of metazoa, there are only three things that I am rather sure of:

    a) It happens (new phyla, species and so on do appear at discrete times).

    b) It happens by design (this is the most certain thing of all).

    c) In some way, existing information is physically re-used in the process, and it carries with itself the inherent degradation that has already taken place in the past (that’s what I mean by “descent”).

    All the rest is completely open, because the facts we have are still too limited. I too hope that the future will help us understand better.

  111. 111
    gpuccio says:

    DATCG:

    Very interesting paper about m6A.

    Now, the process is only one of more than a hundred similar modifications. From the paper:

    The RNA nucleotide code is supplemented by more than a hundred chemical modifications, greatly extending the functionality and information content of RNA (Fu et al., 2014, Harcourt et al., 2017).

    m6A is implemented by a number of important proteins:

    METTL3 580 AAs in humans

    METTL14 456 AAs in humans

    WTAP 396 AAs in humans

    Protein virilizer homolog 1812 AAs in humans

    Two demethylases: FTO (505 AAs in humans) and ALKB5 (394 AAs in humans)

    The most specific component is probably the Protein virilizer homolog, a 1812 AA long protein. From Uniprot:

    Associated component of the WMM complex, a complex that mediates N6-methyladenosine (m6A) methylation of RNAs, a modification that plays a role in the efficiency of mRNA splicing and RNA processing (PubMed:24981863, PubMed:29507755). Acts as a key regulator of m6A methylation by promoting m6A methylation of mRNAs in the 3′-UTR near the stop codon: recruits the catalytic core components METTL3 and METTL14, thereby guiding m6A methylation at specific sites (PubMed:29507755)

    Again, the problem of control levels arises: if, as it seems, m6A contributes significantly to regulate transcription differently in different contexts, using introns as a parallel level of information, what tells the system of proteins that implement m6A what and when and how to methylate?

    It’s a continuous generation of new meta-levels: fascinating, and probably a little bit frustrating! 🙂

  112. 112
    gpuccio says:

    DATCG:

    By the way, Protein virilizer homolog is also the protein in the group that undergoes the biggest information jump at the vertebrate transition:

    0.7621413 baa

    1381 bits

  113. 113
    DATCG says:

    Thanks Gpuccio for your summation @110.

    And your always magnanimous attitude in an area I’m still learning and will for life. 🙂

  114. 114
    DATCG says:

    Gpuccio @111-112,

    Ooooo… thanks for that info!

    By the way, about your post on utilizing BLAST: I used it some time ago, but I have been unable to locate it again.

    Maybe, if it’s OK with UD, we should post your BLAST instructions, at least as a comment under the Resources link? Or add them to the current list.

    I’d saved the links to your initial BLAST posting(s) in the past, but I’ve moved around since and lost them.

  115. 115
    DATCG says:

    Gpuccio @111,

    “Again, the problem of control levels arises: if, as it seems, m6A contributes significantly to regulate transcription differently in different contexts, using introns as a parallel level of information, what tells the system of proteins that implement m6A what and when and how to methylate?”

    *Smiles* Great insight and question! Might I speculate a bit, for the “what and when”: the heretofore-named “JUNK” elements and designed feedback loops? Dependent upon internal and/or external signal processing, whereby decision trees and “if-then” type logic “funnel” information (for lack of better words) to appropriate “levels” of reactionary code interfaces? For lack of a better term, a checking mechanism and/or a cascade of pre-determined sets and switches.

  116. 116
  117. 117
    OLV says:

    gpuccio and DATCG:
    do you guys understand this cross-talk topic?
    are these cross-talks between a bunch of proteins present in both prokaryotes and eukaryotes?
    basically, at what point did this appear in biological cells?
    it’s mind-boggling either way I look at it.
    do any of these proteins show a significant functional information jump at some point?
    Thanks.

    Kinase and Phosphatase Cross-Talk at the Kinetochore

    it is perhaps not surprising to learn that they are, quite literally, entangled in cross-talk.

  118. 118
    gpuccio says:

    OLV:

    “Cross talk” seems to mean that there are complex, ordered interactions between those proteins.

    I have looked into some of them.

    Aurora B: 344 AAs in humans.

    Best hit in prokaryotes: 145 bits
    Best hit in single celled eukaryotes: 401 bits
    Well conserved in all metazoa.
    No significant information jump in vertebrates.
    This seems to be essentially a new eukaryotic protein, sharing low homology with different prokaryotic serine/threonine kinases.

    MPS1: 857 AAs in humans.

    The human sequence has very limited homology in pre-vertebrates, and undergoes a gradual information jump in vertebrates, from cartilaginous fish to reptiles.

    BUB1: 1085 AAs in humans

    The human sequence has very limited homology in pre-vertebrates, and undergoes two major jumps, one (smaller) in cartilaginous fish, the second and bigger one in mammals.

    BUBR1: 1050 AAs in humans:

    Similar to BUB1.

    PLK1: 603 AAs in humans

    Similar to Aurora B:

    Information jump from prokaryotes (132 bits) to single celled eukaryotes (491 bits). Well conserved in metazoa, no big jumps there.

    As you can see, there are two different patterns here.

    Some proteins are essentially engineered in single celled eukaryotes, and their evolutionary history is rather smooth after that, in Metazoa (Aurora B, PLK1). These are essentially new proteins in eukaryotes, even if some low homology can be detected with prokaryotic serine/threonine kinases.

    Another group of proteins, instead, is essentially engineered in its human-conserved sequence in vertebrates, but gradually, with discrete jumps in cartilaginous fish, bony fish, amphibians and reptiles, and then undergoes a final jump in mammals (BUB1, BUBR1).

    These different evolutionary behaviours could certainly point to different functional specificities.
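    The two patterns outlined above could be caricatured in code, using the bit scores quoted in this comment. The 200-bit threshold and the labels are my own illustrative choices, not gpuccio’s actual procedure:

```python
# Caricature of the two evolutionary patterns described above. The 200-bit
# threshold is arbitrary, chosen only to separate the quoted examples;
# this is not a real bioinformatics method.

def classify(prok_bits, euk_bits, big_vertebrate_jump):
    """Label a protein by where its information appears to be 'engineered'."""
    if not big_vertebrate_jump and euk_bits - prok_bits > 200:
        return "engineered in single-celled eukaryotes"
    if big_vertebrate_jump:
        return "engineered gradually in vertebrates"
    return "pattern unclear"

proteins = {
    # name: (prokaryote best hit, single-celled eukaryote best hit,
    #        big jump in vertebrates?) -- bit scores quoted above
    "Aurora B": (145, 401, False),
    "PLK1":     (132, 491, False),
    "BUB1":     (0,   0,   True),  # very limited pre-vertebrate homology
}

for name, args in proteins.items():
    print(f"{name}: {classify(*args)}")
```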

  119. 119
    OLV says:

    gpuccio:

    Thank you for the clarifying explanation.

  120. 120
    DATCG says:

    Gpuccio @116,

    Thank you! 🙂

  121. 121
    DATCG says:

    OLV @117,

    Interesting! That will take me some time to get through 🙂 And I’ve yet to read the papers Gpuccio recommended earlier.

    Not sure if you were interested in general definitions or in examples and diagrams. I enjoy researching NCBI when I have the time, but when in a hurry I use Wiki.

    Signaling cascades are cool, included in links below.

    Quick crosstalk link…
    https://en.wikipedia.org/wiki/Crosstalk_(biology)

    I may be overstating the case, but I have long thought signal cascades are one more good case for Design: a finely tuned communications system that cannot be allowed to form blindly, without monitoring and error correction. If a signal is wrong, late, or unrecognized, or a receptor mutates and cannot function, it’s over, or it leads to disease.

    Signaling is critical to survival.

    Youtube is also a great tool for video 3D type reviews.

    The paper you linked is full of Design criteria in just the first few paragraphs that must meet stringent criteria.

    Given that so many kinases and phosphatases converge onto two key mitotic processes, it is perhaps not surprising to learn that they are, quite literally, entangled in cross-talk.

    Inhibition of any one of these enzymes produces secondary effects on all the others, which results in a complicated picture that is very difficult to interpret.

    When they tell me these are “entangled,” to me it indicates the complexity is so tightly controlled that a mutation in the wrong place can degrade the overall system’s performance.

    I see you’ve commented on Gpuccio’s Ubiquitin post. There’s an interesting reference in one of Dionisio’s comments on the Kinetochore and Ubiquitin, at comment #22 as well.

    Kinetochore and fine-tuning

    So, all these disparate systems must coordinate together, signal and make orderly processing together in tight control windows of time. Or kaput, tagged for recycling, maybe modification if fortunate.

    Amazing! 🙂

  122. 122
    gpuccio says:

    DATCG and OLV:

    This seems really interesting:

    Analysis of orthologous lncRNAs in humans and mice and their species-specific epigenetic target genes

    https://www.ncbi.nlm.nih.gov/pubmed/29997097

    Abstract

    OBJECTIVE:
    To identify orthologous lncRNAs in human and mice and the species specificity of their epigenetic regulatory functions.

    METHODS:
    The human/mouse whole-genome pairwise alignment (hg19/mm10, genome.UCSC.edu) was used to identify the orthologues in 13 562 and 10 481 GENCODE-annotated human and mouse lncRNAs. The Infernal program was used to search the orthologous sequences of all the exons of the 13562 human lncRNAs in mouse genome (mm10) to identify the highly conserved orthologues in mice. LongTarget program was used to predict the DNA binding sites of the orthologous lncRNAs in their local genomic regions. Gene Ontology analysis was carried out to examine the functions of genes.

    RESULTS:
    Only 158 orthologous lncRNAs were identified in humans and mice, and many of these orthologues had species-specific DNA binding sites and epigenetic target genes. Some of the epigenetic target genes executed important functions in determining human and mouse phenotypes.

    CONCLUSIONS:
    Only a few human and mouse lncRNAs are orthologues, and most of lncRNAs are species-specific. The orthologous lncRNAs have species-specific epigenetic target genes, and species-specific epigenetic regulation greatly contributes to the differences between humans and mice.

    Article in Chinese. Emphasis mine.

  123. 123
    OLV says:

    DATCG (121):

    “all these disparate systems must coordinate together, signal and make orderly processing together in tight control windows of time. Or kaput, tagged for recycling, maybe modification if fortunate.
    Amazing! “

    Amazing seems like an understatement in this case.

    In that thread, which seems utterly relevant, gpuccio, Dionisio and you posted so many references to interesting papers that it’s very difficult to keep track of them. It’s practically insane. Please don’t take it wrong, no offense intended. You’re all doing a magnificent job. Keep it up! Thanks.

  124. 124
    OLV says:

    gpuccio (122):

    That article is very interesting indeed. Thanks.

    “species-specific epigenetic regulation greatly contributes to the differences between humans and mice.”

    How do scientists explain the appearance of such “species-specific epigenetic regulation” through Darwinian processes?

  125. 125
    gpuccio says:

    OLV:

    “How do scientists explain the appearance of such “species-specific epigenetic regulation” through Darwinian processes?”

    They don’t.

  126. 126
    gpuccio says:

    DATCG, OLV:

    Let’s go back to transposons. This is about plants:

    Transposon-Derived Non-coding RNAs and Their Function in Plants.

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5943564/

    Abstract:

    Transposable elements (TEs) are often regarded as harmful genomic factors and indeed they are strongly suppressed by the epigenetic silencing mechanisms. On the other hand, the mobilization of TEs brings about variability of genome and transcriptome which are essential in the survival and evolution of the host species. The vast majority of such controlling TEs influence the neighboring genes in cis by either promoting or repressing the transcriptional activities. Although TEs are highly repetitive in the genomes and transcribed in specific stress conditions or developmental stages, the trans-acting regulatory roles of TE-derived RNAs have been rarely studied. It was only recently that TEs were investigated for their regulatory roles as a form of RNA. Particularly in plants, TEs are ample source of small RNAs such as small interfering (si) RNAs and micro (mi) RNAs. Those TE-derived small RNAs have potentials to affect non-TE transcripts by sequence complementarity, thereby generating novel gene regulatory networks including stress resistance and hybridization barrier. Apart from the small RNAs, a number of long non-coding RNAs (lncRNAs) are originated from TEs in plants. For example, a retrotransposon-derived lncRNA expressed in rice root acts as a decoy RNA or miRNA target mimic which negatively controls miRNA171. The post-transcriptional suppression of miRNA171 in roots ensures the stabilization of the target transcripts encoding SCARECROW-LIKE transcription factors, the key regulators of root development. In this review article, the recent discoveries of the regulatory roles of TE-derived RNAs in plants will be highlighted.

    Emphasis mine.

  127. 127
    gpuccio says:

    DATCG, OLV:

    This is a little bit older (2015), but interesting:

    Transposable elements at the center of the crossroads between embryogenesis, embryonic stem cells, reprogramming, and long non-coding RNAs.

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4624819/

    Abstract:

    Transposable elements (TEs) are mobile genomic sequences of DNA capable of autonomous and non-autonomous duplication. TEs have been highly successful, and nearly half of the human genome now consists of various families of TEs. Originally thought to be non-functional, these elements have been co-opted by animal genomes to perform a variety of physiological functions ranging from TE-derived proteins acting directly in normal biological functions, to innovations in transcription factor logic and influence on epigenetic control of gene expression. During embryonic development, when the genome is epigenetically reprogrammed and DNA-demethylated, TEs are released from repression and show embryonic stage-specific expression, and in human and mouse embryos, intact TE-derived endogenous viral particles can even be detected. A similar process occurs during the reprogramming of somatic cells to pluripotent cells: When the somatic DNA is demethylated, TEs are released from repression. In embryonic stem cells (ESCs), where DNA is hypomethylated, an elaborate system of epigenetic control is employed to suppress TEs, a system that often overlaps with normal epigenetic control of ESC gene expression. Finally, many long non-coding RNAs (lncRNAs) involved in normal ESC function and those assisting or impairing reprogramming contain multiple TEs in their RNA. These TEs may act as regulatory units to recruit RNA-binding proteins and epigenetic modifiers. This review covers how TEs are interlinked with the epigenetic machinery and lncRNAs, and how these links influence each other to modulate aspects of ESCs, embryogenesis, and somatic cell reprogramming.

  128.
    gpuccio says:

    DATCG, OLV:

    What about this? (2018):

    Identification of Transposable Elements Contributing to Tissue-Specific Expression of Long Non-Coding RNAs

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5793176/

    Abstract:

    It has been recently suggested that transposable elements (TEs) are re-used as functional elements of long non-coding RNAs (lncRNAs). This is supported by some examples such as the human endogenous retrovirus subfamily H (HERVH) elements contained within lncRNAs and expressed specifically in human embryonic stem cells (hESCs), as required to maintain hESC identity. There are at least two unanswered questions about all lncRNAs. How many TEs are re-used within lncRNAs? Are there any other TEs that affect tissue specificity of lncRNA expression? To answer these questions, we comprehensively identify TEs that are significantly related to tissue-specific expression levels of lncRNAs. We downloaded lncRNA expression data corresponding to normal human tissue from the Expression Atlas and transformed the data into tissue specificity estimates. Then, Fisher’s exact tests were performed to verify whether the presence or absence of TE-derived sequences influences the tissue specificity of lncRNA expression. Many TE–tissue pairs associated with tissue-specific expression of lncRNAs were detected, indicating that multiple TE families can be re-used as functional domains or regulatory sequences of lncRNAs. In particular, we found that the antisense promoter region of L1PA2, a LINE-1 subfamily, appears to act as a promoter for lncRNAs with placenta-specific expression.
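    For readers curious what the Fisher’s exact test in this abstract actually computes: for each TE family–tissue pair, it asks whether lncRNAs carrying that TE are over-represented among the tissue-specific lncRNAs. Below is a minimal one-sided version using only the Python standard library; all the counts in the example are entirely hypothetical, for illustration only.

```python
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher's exact test (enrichment) on a 2x2 table:
             tissue-specific   not specific
    TE+            a                b
    TE-            c                d
    Returns P(observing >= a TE+ tissue-specific lncRNAs) under the
    hypergeometric null of no association."""
    n = a + b + c + d
    row1 = a + b          # lncRNAs containing the TE family
    col1 = a + c          # tissue-specific lncRNAs
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)
    return p

# Hypothetical counts for one TE family-tissue pair (illustration only):
p = fisher_exact_one_sided(20, 80, 50, 850)
print(p)
```

    With these made-up counts the TE+ group is clearly enriched (20 observed tissue-specific versus about 7 expected), so the p-value comes out very small.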

    Emphasis mine.

  129.
    PeterA says:

    OLV:

    How do scientists explain the appearance of such “species-specific epigenetic regulation” through Darwinian processes?

    gpuccio:

    They don’t.

    Well, that’s a strong affirmation that can be easily falsified. Actually, could it be that the paper written in Chinese contains that explanation?
    Do any of the readers here know Chinese well enough to tell us?

  130.
    OLV says:

    Not sure if this is off topic here:

    https://www.tandfonline.com/doi/full/10.1080/03008207.2017.1412432

    Regulation of osteogenesis by long noncoding RNAs: An epigenetic mechanism contributing to bone formation
    Long noncoding RNAs (lncRNAs) have recently emerged as novel regulators of lineage commitment, differentiation, development, viability, and disease progression. Few studies have examined their role in osteogenesis; however, given their critical and wide-ranging roles in other tissues, lncRNAs are most likely vital regulators of osteogenesis. In this study, we extensively characterized lncRNA expression in mesenchymal cells during commitment and differentiation to the osteoblast lineage using a whole transcriptome sequencing approach (RNA-Seq). Using mouse primary mesenchymal stromal cells (mMSC), we identified 1438 annotated lncRNAs expressed during MSC differentiation, 462 of which are differentially expressed. We performed guilt-by-association analysis using lncRNA and mRNA expression profiles to identify lncRNAs influencing MSC commitment and differentiation. These findings open novel dimensions for exploring lncRNAs in regulating normal bone formation and in skeletal disorders.
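    The “guilt-by-association” analysis mentioned in this abstract typically means correlating a lncRNA’s expression profile across samples with the profiles of protein-coding genes of known function. A toy sketch follows: the gene names are real osteogenesis/pluripotency markers (Runx2, Sox2), but every expression value and the correlation threshold are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation between two equal-length expression profiles
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def guilt_by_association(lnc_profile, mrna_profiles, threshold=0.9):
    """Return known genes whose expression across differentiation
    time points correlates strongly with the lncRNA's profile."""
    return sorted(g for g, prof in mrna_profiles.items()
                  if pearson(lnc_profile, prof) >= threshold)

# Hypothetical expression values across four differentiation time points:
osteo_markers = {
    "Runx2": [1.0, 2.1, 4.0, 7.9],   # rises with osteoblast commitment
    "Sox2":  [8.0, 4.2, 2.0, 1.1],   # falls as pluripotency is lost
}
lnc_x = [0.9, 2.0, 4.2, 8.1]         # invented lncRNA profile
print(guilt_by_association(lnc_x, osteo_markers))
```

    The invented lncRNA tracks the osteoblast marker and anti-correlates with the pluripotency marker, so it would be flagged as a candidate osteogenesis regulator.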

  131.
    OLV says:

    PeterA:

    Do you really think that paper has the explanation?
    Really?

  132.
    gpuccio says:

    OLV at #130:

    Of course it is pertinent.

    The functions of lncRNAs are just starting to be discovered. The simple truth is that we already know of about 15,000 such structures in humans, and we know almost nothing about what they do and how they do it.

    At present, most of the new information is coming from studies about specific human diseases, especially cancer. That’s what usually happens: the behaviour of specific molecules in cancer is often the way to understand what they really do in normal conditions.

    But again, the task is not easy: lncRNAs, and other forms of non coding RNAs, are most likely involved in important regulatory functions, often species-specific. Moreover, there are many potential ways they can work, by interacting with chromatin, with other RNAs, with proteins, and so on.

    It’s a wonderful new field, and I am looking forward to important advances.

  133.
    gpuccio says:

    OLV at #131:

    I think PeterA is probably kidding. 🙂

    Of course, if that paper made any serious attempt at explaining those things from a Darwinian point of view, that would be mentioned in the abstract.

    However, starting to study Chinese might not be a bad idea: the Chinese really seem to be leaders in this field (and probably in many others). 🙂

  134.
    PeterA says:

    gpuccio:

    What you wrote makes sense and answers my posted questions satisfactorily. Thank you.

    Still, I think your short answer “they don’t” is easily falsifiable. However, it may never get falsified, because it’s true. To falsify it, all one would have to do is point to a valid paper containing the Darwinian explanation.

  135.
    gpuccio says:

    PeterA:

    I knew you were a fan of Popper! 🙂

  136.
    gpuccio says:

    DATCG, OLV, PeterA:

    In primates:

    Conserved expression of transposon-derived non-coding transcripts in primate stem cells

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5331655/

    Abstract
    BACKGROUND:
    A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, it has been shown that various long non-coding RNAs (lncRNAs), which come from the human endogenous retrovirus subfamily H (HERVH), are not only expressed but required for pluripotency in human embryonic stem cells (hESCs).

    RESULTS:
    To identify additional TE-derived functional non-coding transcripts, we generated RNA-seq data from induced pluripotent stem cells (iPSCs) of four primate species (human, chimpanzee, gorilla, and rhesus) and searched for transcripts whose expression was conserved. We observed that about 30% of TE instances expressed in human iPSCs had orthologous TE instances that were also expressed in chimpanzee and gorilla. Notably, our analysis revealed a number of repeat families with highly conserved expression profiles including HERVH but also MER53, which is known to be the source of a placental-specific family of microRNAs (miRNAs). We also identified a number of repeat families from all classes of TEs, including MLT1-type and Tigger families, that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved.

    CONCLUSIONS:
    Together, these results describe TE families and TE-derived lncRNAs whose conserved expression patterns can be used to identify what are likely functional TE-derived non-coding transcripts in primate iPSCs.
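    The “about 30%” figure in the Results is essentially the size of an intersection of per-species sets of expressed orthologous TE instances, divided by the size of the human set. A toy illustration with invented instance IDs (orthologous instances share an ID here):

```python
def conserved_fraction(expressed_by_species, reference="human",
                       others=("chimpanzee", "gorilla")):
    """Fraction of the reference species' expressed TE instances whose
    orthologs are also expressed in all of the other species."""
    ref = expressed_by_species[reference]
    conserved = ref.intersection(*(expressed_by_species[s] for s in others))
    return len(conserved) / len(ref)

# Hypothetical instance IDs, for illustration only:
expressed = {
    "human":      {"HERVH_1", "HERVH_2", "MER53_1", "L1_7", "AluY_3"},
    "chimpanzee": {"HERVH_1", "MER53_1", "L1_2"},
    "gorilla":    {"HERVH_1", "MER53_1", "AluY_3"},
}
print(conserved_fraction(expressed))  # 2 of 5 human instances conserved
```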

  137.
    DATCG says:

    Gpuccio @122,

    Interesting! A quick search on the paper shows three authors.
    One, Yang-Yang Jiang, is from the Institute of Process Engineering. That’s all the info I could find on that one particular author.

    Assumption – Jiang is an engineer. Maybe a general process engineer. Or, maybe specialized with a background in chemical and/or biotechnical expertise, or industrial processing and software design of processes.

    Why does research in this field require a “Process Engineer” if this is all unguided, blind evolution? Because they’re recognizing the need to reverse engineer an “unguided process”?

    I’m aware many universities have expanded their programs to include engineering hard sciences in the field of molecular biology and genetics. That’s great. It’s the only way they will eventually reverse engineer the code(s) of life.

    I wonder if the Chinese are taking a more practical approach to reverse engineering, since they’re not tied down by the West’s past of Darwinian worship?

    “… required” is key…

    A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, it has been shown that various long non-coding RNAs (lncRNAs), which come from the human endogenous retrovirus subfamily H (HERVH), are not only expressed but required for pluripotency in human embryonic stem cells (hESCs).

    Hmmmm, remember that evolutionists in the past thought these transposable elements to be ancestral vestiges of viruses?

    If by design however, they’d be modular components, recognized by the system code(s).

    Would an unguided, uncoded, blind “unprocess” be able to recognize a modular component insert? How?

    By deception as a virus? Hmmmm… I wonder if extrapolation of beliefs and assumptions on bacteria leads to possible misconceptions farther up a long series and chain of events?

    There always seems to be a lot of contortion in Darwinism and twisting to fit in preconceived assumptions.

    Again, just speculation. But if by Design, modularity is key for component processing, easily recognizable inserts adjust and radiate through the different phyla. So, it’s not a virus, but a Designed system of rapid radiation of information. This might explain such events as the Cambrian.

  138.
    DATCG says:

    Here’s another recent paper on “noncoding regulatory elements” that contribute to new phenotypes and “Accelerated Evolution.”

    PDF document at Cell.com…

    Accelerated Evolution in Distinctive Species Reveals Candidate Elements for Clinically Relevant Traits, Including Mutation and Cancer Resistance

    Note in the paper it mentions E3 Ligases. Will cross-post in your Ubiquitin article, Gpuccio.

    Summary:

    The identity of most functional elements in the mammalian genome and the phenotypes they impact are unclear.

    Here, we perform a genome-wide comparative analysis of patterns of accelerated evolution in species with highly distinctive traits to discover candidate functional elements for clinically important phenotypes. We identify accelerated regions (ARs) in the elephant, hibernating bat, orca, dolphin, naked mole rat, and thirteen-lined ground squirrel lineages in mammalian conserved regions, uncovering 33,000 elements that bind hundreds of different regulatory proteins in humans and mice.

    ARs in the elephant, the largest land mammal, are uniquely enriched near elephant DNA damage response genes. The genomic hotspot for elephant ARs is the E3 ligase subunit of the Fanconi anemia complex, a master regulator of DNA repair.

    Additionally, ARs in the six species are associated with specific human clinical phenotypes that have apparent concordance with overt traits in each species.

    Interesting! Ubiquitin is not mentioned, but is in play at work with E3 Ligase. Keeping in mind that “elements” are information packets that can be read for processing. Continuing on…

    New phenotypes frequently arise due to evolutionary changes to “noncoding” regulatory elements rather than protein-coding changes (Carroll, 2008; Wray, 2007). Although much of the genome is biochemically active (ENCODE Project Consortium, 2012), identifying functional elements for particular traits is challenging, and the best approaches are debated (Kellis et al., 2014). One approach is to focus on conserved genomic regions. Indeed, species-specific changes to conserved noncoding elements are linked to some major phenotypic effects, such as the loss of limbs in the snake (Kaltcheva and Lewandoski, 2016; Kvon et al., 2016) and the loss of penile spines in humans (McLean et al., 2011). Conserved elements exhibiting accelerated evolution in a particular species may have roles in shaping the traits of that species (Bird et al., 2007; Boyd et al., 2015; Capra et al., 2013; Hubisz et al., 2011; Kim and Pritchard, 2007; Lindblad-Toh et al., 2011; Pollard et al., 2006a, 2006b, 2010; Prabhakar et al., 2006).

    Accelerated regions (ARs) are best known from studies of human ARs and are conserved elements with significantly increased nucleotide substitution rates due to the effects of positive selection, relaxed purifying selection, or GC-biased gene conversion in a particular lineage (Hubisz and Pollard, 2014; Kostka et al., 2012; Pollard et al., 2010). For example, one human AR is an enhancer with putative roles in the evolution of the human thumb (Prabhakar et al., 2008). Despite these advances, the identity and roles of most functional elements in the mammalian genome remain unclear.

    “noncoding” above, emphasis mine

    And following discussion…

    Discussion

    Our study tested whether a comparative analysis of ARs in species with distinctive traits facilitates the discovery of candidate functional elements for the overt and clinically relevant traits exhibited by these species. From elephant, Hib bat, orca, dolphin, mole rat, and squirrel ARs, we identified a set of 33,283 candidate elements (7% of mammalian conserved regions tested). Multiple lines of evidence support the functionality of these elements, including selective constraint (conservation) from wallaby to human, regulatory protein binding in humans and mice, and evidence for accelerated evolution in specific lineages.

  139.
    DATCG says:

    So going off topic a bit, but I consider TEs modular components and, essentially, the genetic code a modular system of component parts. So from TEs to larger molecules, modularity is key…

    During a quick search, as I was posting #138, I found this gem from 2015 on the E3 Ligase and Fanconi Anemia (FA) Core Complex. I’ll cross-post it as well on the Ubiquitin post.

    Modularized Functions of the Fanconi Anemia Core Complex

    Summary

    The Fanconi anemia (FA) core complex provides the essential E3 ligase function for spatially defined FANCD2 ubiquitination and FA pathway activation.

    Of the seven FA gene products forming the core complex, FANCL possesses a RING domain with demonstrated E3 ligase activity. The other six components do not have clearly defined roles.

    Through epistasis analyses, we identify three functional modules in the FA core complex: a catalytic module consisting of FANCL, FANCB, and FAAP100 is absolutely required for the E3 ligase function, and the FANCA-FANCG-FAAP20 and the FANCC-FANCE-FANCF modules provide nonredundant and ancillary functions that help the catalytic module bind chromatin or sites of DNA damage.

    Disruption of the catalytic module causes complete loss of the core complex function, whereas loss of any ancillary module component does not. Our work reveals the roles of several FA gene products with previously undefined functions and a modularized assembly of the FA core complex.

    Discussion

    Through creation of isogenic single and double mutants of the FA genes, we came to the unexpected finding that not all the components of the core complex contribute equally to cellular resistance against DNA damage. This observation deviates from a general paradigm that losing any one of the core FA proteins leads to complete elimination of FANCD2 activation and disintegration of the core complex. Instead, our results suggest that different functional modules exist in the core and that the overall integrity of the core complex is sustained when certain FA proteins are removed.

    This points to Design. Elimination of specific modules did not cause a problem in this case, allowing continued processing of the molecule for their specific item of research.

    Prediction? Researchers will find errors or inoperative function(s) in other, perhaps yet unknown, programming aspects upon removal of the specific modules.

    If so, this adds to the modularity principle of Design Concepts. Merely removing or eliminating code can only eliminate specific areas of performance, while allowing performance for the other modules to remain.

    Thus the advantages of modular programming are not only efficiency of coding, but of survival. If this were not a modular process, function would cease. Unguided events cannot anticipate such damage or requirements of modularity. Only Designers can.

    Our work establishes a modularized functional assembly of the FA core complex consisting of a catalytic module and two modules with nonredundant functions in the chromatin recruitment of the core complex. The coordinated actions of these three modules enable the E3 ligase activity to be localized to the sites of DNA damage and carry out the spatially defined FANCD2/I monoubiquitination with maximum efficiency to counter DNA damage. The catalytic core module is the most critical component functionally, as reflected by the most severe phenotypes of the FANCL and FANCB mutants. The existence of a catalytical core module within the FA core complex is further supported by biochemical evidence from Rajendra et al. (2014)

    Modular Programming increases efficiency and depends upon short, fast reads of easily recognizable sequences of pre-Coded Elements like the TEs in this post, or Tagging by Ubiquitin for Repair Processing designation or degradation and recycling of parts.

  140.
    gpuccio says:

    DATCG:

    Great contributions, as usual! 🙂

    There can be no doubt about the modularity of biological design: it’s OOP all the way!

    I agree that bio-engineers are badly needed. Bio-informaticians, who usually have a mixed education, will certainly help a lot.

    The mechanisms of the Fanconi complex are specially intriguing. Repairing DNA is certainly a delicate task.

    I like very much this passage from the paper you quoted:

    The E3 ligase activity of this reaction resides in the FA core complex consisting of seven FA proteins (FANCA, FANCB, FANCC, FANCE, FANCF, FANCG, and FANCL) and two FA-associated proteins (FAAP20 and FAAP100), with the RING domain protein FANCL bearing the E3 ligase activity (Alpi et al., 2008; Meetei et al., 2003). Aside from FANCL and FAAP20, most other components of the core complex have neither recognizable motifs nor clearly defined functions as to how they contribute to the DNA damage-mediated FANCD2/I monoubiquitination.

    Emphasis mine.

    That’s what happens as soon as we enter the vastly misunderstood field of function regulation: the usual concepts of protein domains, functional conservation and “easily” recognizable function just disappear. And we are left with extremely complex and highly functional multi-structures where we cannot recognize anything easily understandable.

    The same is true, IMO, for non coding DNA.

    There are functions that we understand, and functions that we still don’t understand at all. Most regulatory functions are in the second group.

  141.
    OLV says:

    Thanks to DATCG and gpuccio for maintaining such a high level of scientific excitement in this discussion. Excellent work.

  142.
    gpuccio says:

    DATCG:

    This is a “pearl”:

    Regulation of IL-17 by lncRNA of IRF-2 in the pearl oyster.

    https://www.ncbi.nlm.nih.gov/pubmed/30017925

    Abstract
    Long noncoding RNAs (lncRNAs), once thought to be nonfunctional, have recently been shown to participate in the multilevel regulation of transcriptional, posttranscriptional and epigenetic modifications and to play important roles in various biological processes, including immune responses. However, the expression and roles of lncRNAs in invertebrates, especially nonmodel organisms, remain poorly understood. In this study, by comparing a transcriptome to the PfIRF-2 genomic structure, we identified lncIRF-2 in the PfIRF-2 genomic intron. The results of the RNA interference (RNAi) and the nucleus grafting experiments indicated that PfIRF-2 might have a negative regulatory effect on lncIRF-2, and PfIRF-2 and lncIRF-2 may have a positive regulatory effect on PfIL-17. Additionally, lncIRF-2, PfIRF-2 and PfIL-17 were involved in responses to the nucleus graft. These results will enhance the knowledge of lncIRF-2, IRF-2, and IL-17 functions in both pearl oysters and other invertebrates.

    Emphasis mine.

    Nice interaction between an intron and lncRNA.

    According to the genomic distribution into one or more of the following five categories, sense, antisense, bidirectional, intronic and intergenic [29], lncIRF-2 belonged to intronic lncRNA. Meanwhile, the alignment of lncIRF-2 yielded no matches via the NCBI BlastN search, suggesting that lncIRF-2 was lowly conserved, like most of its lncRNA counterparts that were not conserved.

    This study explored the correlation of lncIRF-2 and PfIRF-2 and their regulation on PfIL-17. In addition, it has been shown that lncIRF-2 could be functional, not just transcriptional ‘noise’.

    Emphasis mine.

    So, a functional lncRNA which derives from an intron and is “lowly conserved”, IOWs highly species specific. It’s becoming a very strong trend, I would say. 🙂

  143.
    gpuccio says:

    DATCG:

    And the field seems to rapidly expand:

    New and Prospective Roles for lncRNAs in Organelle Formation and Function.

    https://www.ncbi.nlm.nih.gov/pubmed/30017312

    Abstract
    The observation that long noncoding RNAs (lncRNAs) represent the majority of transcripts in humans has led to a rapid increase in interest and study. Most of this interest has focused on their roles in the nucleus. However, increasing evidence is beginning to reveal even more functions outside the nucleus, and even outside cells. Many of these roles are mediated by newly discovered properties, including the ability of lncRNAs to interact with lipids, membranes, and disordered protein domains, and to form differentially soluble RNA-protein sub-organelles. This review explores the possibilities enabled by these new properties and abilities, such as likely roles in exosome formation and function.

    Emphasis mine.

  144.
    gpuccio says:

    DATCG, OLV:

    In the testis:

    Profiling of testis-specific long noncoding RNAs in mice

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6048885/

    Abstract:

    Background
    Spermatogenesis, which is the complex and highly regulated process of producing haploid spermatozoa, involves testis-specific transcripts. Recent studies have discovered that long noncoding RNAs (lncRNAs) are novel regulatory molecules that play important roles in various biological processes. However, there has been no report on the comprehensive identification of testis-specific lncRNAs in mice.

    Results
    We performed microarray analysis of transcripts from mouse brain, heart, kidney, liver and testis. We found that testis harbored the highest proportion of tissue-specific lncRNAs (11%; 1607 of 14,256). Testis also harbored the largest number of tissue-specific mRNAs among the examined tissues, but the proportion was lower than that of lncRNAs (7%; 1090 of 16,587). We categorized the testis-specific lncRNAs and found that a large portion corresponded to long intergenic ncRNAs (lincRNAs). Genomic analysis identified 250 protein-coding genes located near (≤ 10 kb) 194 of the loci encoding testis-specific lincRNAs. Gene ontology (GO) analysis showed that these protein-coding genes were enriched for transcriptional regulation-related terms. Analysis of male germ cell-related cell lines (F9, GC-1 and GC-2) revealed that some of the testis-specific lncRNAs were expressed in each of these cell lines. Finally, we arbitrarily selected 26 testis-specific lncRNAs and performed in vitro expression analysis. Our results revealed that all of them were expressed exclusively in the testis, and 23 of the 26 showed germ cell-specific expression.

    Conclusion
    This study provides a catalog of testis-specific lncRNAs and a basis for future investigation of the lncRNAs involved in spermatogenesis and testicular functions.

    Emphasis mine.

    This is very important. Testis and brain are always the pinnacle of functional specificity. The evidence for the fundamental role of lncRNAs is growing amazingly.
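    Incidentally, the percentages quoted in that abstract are easy to verify directly from the stated counts:

```python
# Quick check of the proportions quoted in the abstract
lnc_specific, lnc_total = 1607, 14256
mrna_specific, mrna_total = 1090, 16587

lnc_frac = lnc_specific / lnc_total      # testis-specific lncRNAs
mrna_frac = mrna_specific / mrna_total   # testis-specific mRNAs
print(f"lncRNA: {lnc_frac:.0%}, mRNA: {mrna_frac:.0%}")
```

    The fractions round to 11% and 7%, matching the abstract, with lncRNAs indeed showing the higher tissue specificity.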

  145.
    gpuccio says:

    DATCG, OLV:

    Another interesting feature of lncRNAs seems to be: the names we give them! 🙂

    The lncRNA male-specific abdominal plays a critical role in Drosophila accessory gland development and male fertility.

    https://www.ncbi.nlm.nih.gov/pubmed/30011265

    Parentally inherited long non-coding RNA Cyrano is involved in zebrafish neurodevelopment.

    https://www.ncbi.nlm.nih.gov/pubmed/30011017

    Male-specific abdominal? Cyrano? !!!! 🙂

  146.
    gpuccio says:

    DATCG, OLV:

    As I said, testis and brain:

    Forging our understanding of lncRNAs in the brain.

    https://www.ncbi.nlm.nih.gov/pubmed/29079882

    Abstract
    During both development and adulthood, the human brain expresses many thousands of long noncoding RNAs (lncRNAs), and aberrant lncRNA expression has been associated with a wide range of neurological diseases. Although the biological significance of most lncRNAs remains to be discovered, it is now clear that certain lncRNAs carry out important functions in neurodevelopment, neural cell function, and perhaps even diseases of the human brain. Given the relatively inclusive definition of lncRNAs-transcripts longer than 200 nucleotides with essentially no protein coding potential-this class of noncoding transcript is both large and very diverse. Furthermore, emerging data indicate that lncRNA genes can act via multiple, non-mutually exclusive molecular mechanisms, and specific functions are difficult to predict from lncRNA expression or sequence alone. Thus, the different experimental approaches used to explore the role of a lncRNA might each shed light upon distinct facets of its overall molecular mechanism, and combining multiple approaches may be necessary to fully illuminate the function of any particular lncRNA. To understand how lncRNAs affect brain development and neurological disease, in vivo studies of lncRNA function are required. Thus, in this review, we focus our discussion upon a small set of neural lncRNAs that have been experimentally manipulated in mice. Together, these examples illustrate how studies of individual lncRNAs using multiple experimental approaches can help reveal the richness and complexity of lncRNA function in both neurodevelopment and diseases of the brain.
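    The review’s working definition of a lncRNA (a transcript longer than 200 nucleotides with essentially no protein-coding potential) can be sketched as a simple filter. Real annotation pipelines use dedicated coding-potential tools; the ORF-length cutoff below is an arbitrary illustrative choice, and the example sequences are invented.

```python
def longest_orf_nt(seq):
    """Length (nt) of the longest ATG..stop open reading frame
    in the three forward frames."""
    stops = {"TAA", "TAG", "TGA"}
    best = 0
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        start = None
        for i, c in enumerate(codons):
            if start is None and c == "ATG":
                start = i
            elif start is not None and c in stops:
                best = max(best, (i - start + 1) * 3)
                start = None
    return best

def looks_like_lncRNA(seq, min_len=200, max_orf_nt=300):
    # The review's inclusive definition: long, with essentially no
    # protein-coding potential (the ORF cutoff is illustrative)
    return len(seq) >= min_len and longest_orf_nt(seq) < max_orf_nt

short = "ATG" + "GCA" * 20 + "TAA"   # 66 nt: below the length cutoff
noncoding = "ACGT" * 60              # 240 nt, no complete ORF
print(looks_like_lncRNA(short), looks_like_lncRNA(noncoding))
```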

  147.
    DATCG says:

    Ha! @145 Gpuccio!

    You’re having too much fun! 🙂 And apparently so are the scientists with their naming conventions! 😉 Inspirational as it is…

    http://lncrnadb.com/cyrano/

    “… enlarged nasal placodes”

    heh!

    https://www.mirror.co.uk/news/uk-news/four-genes-determine-you-nose-8007124

    The nose knows! 😉

  148.
    gpuccio says:

    DATCG:

    Ah, Cyrano and enlarged nasal placodes! I get it! 🙂

    I blasted the zebrafish sequence against homo sapiens, and indeed I found only the almost perfect conservation of a 66 nt sequence (56/66 identities), and nothing else out of 4630 nt in zebrafish and 9027 in humans. As described in your link:

    “Long terminal exon shows a number of conserved sequences within tetrapods. One ~300nt highly conserved sequence contains a 67 nt sequence conserved between tetrapods and zebrafish, a 26nt subregion of which is almost perfectly conserved in vertebrates and is hypothesised to be a miR-7 binding site.”

    And yet:

    “Has similar structural characteristics in different vertebrate species.”

    It seems that structural conservation works rather differently in RNAs than in proteins!
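    The 56/66 figure gpuccio reports is simply percent identity over the aligned block. A minimal calculator for aligned sequence pairs follows; the short example pair is invented, and BLAST of course computes this over local alignments it finds itself.

```python
def percent_identity(a, b):
    """Percent identity of two aligned, equal-length sequences
    (columns with a gap '-' in either sequence are excluded)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

# gpuccio's reported hit: 56 identities over 66 aligned nucleotides
print(round(56 / 66 * 100, 1))  # ≈ 84.8% identity

# Toy aligned pair (hypothetical):
print(percent_identity("ACGTAC-T", "ACGAACGT"))
```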

  149.
    PeterA says:

    Wow!
    Y’all are making me dizzy with so much interesting information being thrown in your comments. How do you get all those papers so fast?
    From what you guys are saying the field of NC RNA appears growing out of control, though ironically that stuff seems related to controls!
    BTW, I see that you’re having fun with all this. However, there may be some folks out there who aren’t enjoying this.

  150.
    gpuccio says:

    PeterA:

    Thank you for your comments! 🙂

    “Y’all are making me dizzy with so much interesting information being thrown in your comments. How do you get all those papers so fast?”

    Pubmed, search engine and common sense, I suppose. 🙂
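    For anyone wanting to reproduce this kind of literature sweep programmatically, PubMed exposes the NCBI E-utilities API. The sketch below only builds an esearch query URL against the documented public endpoint; it makes no network call, and the search term is just an example.

```python
from urllib.parse import urlencode

# NCBI E-utilities esearch endpoint (documented public API)
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search_url(term, retmax=20):
    """Build an esearch URL that returns matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": term,
              "retmax": retmax, "retmode": "json"}
    return EUTILS + "?" + urlencode(params)

url = pubmed_search_url('lncRNA AND "transposable elements"')
print(url)
```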

    “From what you guys are saying the field of NC RNA appears growing out of control, though ironically that stuff seems related to controls!”

    Absolutely! This is really exciting news, especially for us IDists.

    “BTW, I see that you’re having fun with all this.”

    And a lot of it!

    “However, there may be some folks out there who aren’t enjoying this.”

    Life is about personal choices… 🙂

  151.
    john_a_designer says:

    gpuccio and DATCG @ 98:

    DATCG: “I’m having a hard time wrapping my head around Designed macro-evolution because it seems like there are huge issues unless the functional relationships are all Prescribed.”

    gpuccio: Why shouldn’t they?

    DATCG: “This is why I tend to think we have multiple bases of life, or bushes of life if you will, not a single Tree of Life.”

    gpuccio responded:

    It’s possible, but not really necessary.

    Let’s take the Cambrian explosion, for example.

    The phyla are certainly different and complex designs. Each of them individually programmed.

    But we have good reasons to believe, from facts, that all of them use a shared bulk of functionality which had already been designed in single-celled eukaryota.

    IOWs, both chordata and sponges, just to be clear, are built using similar eukaryotic cells. Even if the [plan] is completely different. [emphasis added]

    Do you have any thoughts about the on-going work that is being done on choanoflagellates? They are seen as a transitional form from single-celled to multicellular eukaryota because, for some reason, they have inter-cellular signaling proteins. (Why would a single-celled eukaryote need inter-cellular signaling proteins?) Evidence of evolution for sure. But does this support blind watchmaker Darwinian evolution, or is it evidence of guided and directed (designed) evolution?

    See my comments 8 and 13 on this thread:

    https://uncommondescent.com/intelligent-design/at-science-maybe-the-transition-from-single-cells-to-multicellular-life-wasnt-that-hard/#comment-661859

  152.
    gpuccio says:

    john_a_designer:

    Yes, I am aware that many proteins and systems that are really aimed at multicellularity do appear earlier. I have read your comments, and I think that I essentially agree with you.

    Of course all this is completely incompatible with blind watchmakers of any kind, while it makes perfect sense in the light of design.

    As you may know, I believe in descent by design, and I think that a gradual implementation of design plans is definitely a possibility.

    As I have said many times, my idea of descent is simply that functional modules are reused in new plans, but they are reused as they are in existing organisms, and they carry with them all the random non functional variations that are the signature of elapsed time.

    Of course, all that is functionally new must be designed, either “de novo” or by substantially re-engineering what already exists. We don’t know how suddenly new plans appear; that is something that will have to be answered by facts, in time.

    But there is definitely a general plan which evolves through living beings, and its main purpose seems to be to manifest a growing variety and depth of function and complexity in the realm of life.

    Many of the basic solutions are retained from prokaryotes to humans: for example, the genetic code, and ATP synthase, and a lot of other things.

    But a lot of new or different solutions have been engineered in the course of time.

    That’s why I have never agreed with those who think that the big problem is OOL, and the rest is just a minor issue. That’s not true, not at all.

    Yes, OOL is a big problem. A very big one.

    But so is the appearance of eukaryotes, and of multicellular life. And of each single phylum and body plan.

    And, in the end, of each single new functional protein, or of any single new complex re-engineering of an existing functional protein.

    Each of those things is really beyond the resources of any non-design system in the whole universe. Each and all of them are designed.

    Design is a universal, constant process, manifest throughout the whole history of life on our planet (and maybe elsewhere).

  153. 153
    OLV says:

    gpuccio, DATCG:

    I’m trying to process the interesting information you’ve posted lately. It’s really hard to catch up. You have definitely preemptively overwhelmed any potential opponent in this discussion. Well done! Thanks.

  154. 154
  155. 155
    DATCG says:

    Gpuccio @152 and John_a_Designer @151,

    John, Good points and questions.

    Gpuccio, Nice summary yet again, in your response.

    Gpuccio,
    Do you think we will be able to find common mechanisms in the future that account for “convergence,” which is often an appeal by Darwinists to solve a very complex problem for them as a blind series of events across multiple life forms?

    And a bit off topic maybe, but a question that always keeps me fascinated in the unfolding aspect of life from the past to the present form(s).

    Do you think evolution of macro changes in life forms has stopped?

  156. 156
    DATCG says:

    OLV @153,

    You and me both 😉

  157. 157
    gpuccio says:

    DATCG:

    Your questions are always stimulating! 🙂

    “Do you think we will be able to find common mechanisms in the future that account for “convergence,” which is often an appeal by Darwinists to solve a very complex problem for them as a blind series of events across multiple life forms?”

    “Convergent” evolution really makes sense from a design perspective. It is simply an example of “convergent solutions”.

    The same problem can be solved in different ways, or even in similar ways, but in different contexts.

    Design is a process which starts from ideas, from conscious models, and is realized and implemented by specific programming solutions.

    We know that flight has appeared independently in many different lines of animals: insects, birds, mammals. Of course there is something in common in those solutions, because the problem is similar, but the solutions themselves are very different.

    My model of descent is very powerful in explaining that. Where specific implementations are re-used at the level of specific code (for example, the specific sequence of a protein in different species), it is very likely that the code has been physically transferred and re-used. But when it is the general idea that is re-used, with different specific codes, then there is probably no physical derivation, only a similarity in the idea itself.

    Those concepts remain valid both in the case of one designer or in the case of multiple designers.

    “Do you think evolution of macro changes in life forms has stopped?”

    Good question. There is no reason, IMO, to think that it has stopped. Biological design has certainly been active until very recently in evolutionary history (look at us humans, for example). However, I don’t think that we have, at present, reliable facts to decide.

  158. 158
  159. 159
    DATCG says:

    Gpuccio @158, will do 🙂

    Had no idea that discussion was still ongoing! Ha! You have many posts still stirring the proverbial Darwin pot, I see 🙂

  160. 160
  161. 161
    PeterA says:

    A few guys (gpuccio, DATCG, lately OLV) are posting many references to biology papers, making the rest of the readers here wonder how they manage to find so many interesting papers in such a short time.
    I found this 4-year old article that seems to shed some light on the subject:

    Publication Growth in Biological Sub-Fields: Patterns, Predictability and Sustainability
    Marco Pautasso
    Received: 12 June 2012; in revised form: 5 November 2012 / Accepted: 19 November 2012 / Published: 23 November 2012
    Abstract: Biologists are producing ever-increasing quantities of papers. The question arises of whether current rates of increase in scientific outputs are sustainable in the long term. I studied this issue using publication data from the Web of Science (1991–2010) for 18 biological sub-fields. In the majority of cases, an exponential regression explains more variation than a linear one in the number of papers published each year as a function of publication year. Exponential growth in publication numbers is clearly not sustainable. About 75% of the variation in publication growth among biological sub-fields over the two studied decades can be predicted by publication data from the first six years. Currently trendy fields such as structural biology, neuroscience and biomaterials cannot be expected to carry on growing at the current pace, because in a few decades they would produce more papers than the whole of biology combined. Synthetic and systems biology are problematic from the point of view of knowledge dissemination, because in these fields more than 80% of existing papers have been published over the last five years. The evidence presented here casts a shadow on how sustainable the recent increase in scientific publications can be in the long term.

    http://www.mdpi.com/2071-1050/4/12/3234/htm
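
    The abstract’s key comparison (an exponential regression explaining more variation than a linear one in papers per year) can be illustrated with a short sketch. This is not the paper’s code or data; the yearly counts below are made-up numbers growing roughly 5% per year, and the least-squares helper is a generic ordinary-least-squares fit.

```python
# Illustrative only: compare a linear fit vs. an exponential (log-linear)
# fit to hypothetical yearly publication counts. Data are invented; the
# paper's actual Web of Science counts are not reproduced here.
import math

years = list(range(1991, 2011))
# Hypothetical counts growing ~5% per year
papers = [round(1000 * 1.05 ** (y - 1991)) for y in years]

def ols(xs, ys):
    """Ordinary least squares: returns (slope, intercept, r_squared)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return b, a, 1 - ss_res / ss_tot

# Linear model: papers ~ a + b * year
_, _, r2_linear = ols(years, papers)
# Exponential model: log(papers) ~ a + b * year, i.e. papers ~ A * e^(b*year)
_, _, r2_exp = ols(years, [math.log(p) for p in papers])

print(f"linear R^2 = {r2_linear:.4f}, exponential R^2 = {r2_exp:.4f}")
```

    On genuinely exponential growth like this, the log-linear fit explains essentially all the variation while the straight-line fit lags behind, which is the pattern the abstract reports for most biological sub-fields.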

  162. 162
    PeterA says:

    Sorry, I incorrectly wrote “4-year old” instead of “6-year old”.

  163. 163
    PeterA says:

    Here’s another:

    Journal of the Association for Information Science and Technology Volume 66, Issue 11
    RESEARCH ARTICLE
    Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references

    Lutz Bornmann Rüdiger Mutz
    First published: 29 April 2015
    https://doi.org/10.1002/asi.23329

  164. 164
    PeterA says:

    In any case you guys are doing a formidable job.
    Thanks.

Leave a Reply