Intelligent Design

Transcription regulation: a miracle of engineering

Transcription is certainly the essential node in the complex network of procedures and regulations that control the many activities of living cells. Understanding how it works is a fascinating adventure in the world of design and engineering. The issue is huge and complex, but I will try to give here a simple (but probably not too brief) outline of its main features, always from a design perspective.

 


Fig. 1 A simple and effective summary of a gene regulatory network

 

Introduction: where is the information?

One of the greatest mysteries in cell life is how the information stored in the cell itself can dynamically control the many changes that continuously take place in living cells and in living beings. So, the first question is: what is this information, and where is it stored?

Of course, the classical answer is that it is in DNA, and in particular in protein coding genes. But today we know that this answer is not enough.

Indeed, a cell is an ever changing reality. If we take a cell, any cell, at some specific time t, that cell is the repository of a lot of information, at that moment and in that state. That information can be roughly divided into (at least) two different compartments:

a) Genomic information, which is stored in the sequence of nucleotides in the genome. This information is relatively stable and, with a few important exceptions, is the same in all the cells of a multicellular being.

b) Non genomic information. This includes all the specific configurations which are present in that cell at time t, and in particular all epigenetic information (configurations that modify the state of the genomic information) and, more generally, all configurations in the cell. The main components of this dynamic information are the cell transcriptome and proteome at time t and the sum total of its chromatin configurations.

Now, let’s try to imagine the flow of dynamic information in the cell as a continuous interaction between these two big levels of organization:

  1. The transcriptome/proteome is the sum total of all proteins and RNAs (and maybe other functional molecules) that are present in the cell at time t, and which define what the cell is and does at that time.
  2. The chromatin configuration can be considered as a special “reading” of the genomic information, individualized by many levels of epigenetic control. IOWs, while the genomic information is more or less the same in all cells, it can be expressed in myriads of different ways, according to the chromatin organization at that moment, which determines what genes or parts of the genome are “available” at time t in the cell. In this way, one genomic sequence can be read in multiple different ways, with different functional meanings and effects. So, if we just stick to protein coding genes, the 20000 genes in the human genome are available only partially in each cell at each moment, and that allows for a myriad of combinatorial dynamic “readings” of the one stable genome.
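To give a rough feeling for how big this combinatorial space is, here is a minimal sketch. It assumes, as a deliberate oversimplification of my own, that each of the 20000 protein coding genes is simply either "available" or "not available" at time t. Real chromatin states are not binary, so this is only an order-of-magnitude illustration, and the function name is hypothetical.

```python
import math

N_GENES = 20_000  # protein coding genes in the human genome, as quoted in the text


def decimal_digits_of_power_of_two(exponent: int) -> int:
    """Number of decimal digits of 2**exponent, computed via logarithms.

    Logarithms are used instead of converting the huge integer to a string,
    so the function does not depend on integer-to-string size limits.
    """
    return math.floor(exponent * math.log10(2)) + 1


# Assumption: each gene is either available or not (a binary simplification),
# so the number of possible "readings" is 2 ** N_GENES.
digits = decimal_digits_of_power_of_two(N_GENES)
print(f"2^{N_GENES} is a number with {digits} decimal digits")
```

Even under this crude binary assumption, the number of possible readings of the one stable genome is far beyond anything that could be listed explicitly.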

Fig. 2 shows the general form of these concepts.

 

Fig. 2

 

Two important points:

  • The interaction between transcriptome/proteome and chromatin configuration is, indeed, an interaction. The transcriptome/proteome determines the chromatin configuration in many ways: for example, changing the methylation of DNA (DNA methyltransferases); or modifying the post-translational modifications (methylation, acetylation, ubiquitination and others) of histones (covalent histone-modifying complexes); or creating new loops in chromatin (transcription factors); or directly remodeling chromatin itself (ATP-dependent chromatin remodeling complexes). In the same way, any modification of the chromatin landscape immediately influences what the existing transcriptome/proteome is and can do, because it directly changes the transcriptome/proteome as a result of the changes in gene transcription. Of course, this can modify the availability of genes, promoters, enhancers, and regulatory regions in general at chromatin level. That’s the meaning of the two big red arrows connecting, at each stage, the two levels of regulation. The same concept is evident in Fig. 1, which shows how the output of transcription has immediate, complex and constant feedback on transcription regulation itself.
  • As a result of the continuous changes in the transcriptome/proteome and in chromatin configurations, cell states continuously change in time (yellow arrows). However, this continuous flow of different functional states in each cell can have two different meanings, as shown by the two alternative big brown arrows on the right:
    • Cells can change dramatically, following a definite developmental pathway: that’s what happens in cell differentiation, for example from a haematopoietic stem cell to differentiated blood cells like lymphocytes, monocytes, neutrophils, and so on. The end of the differentiation is the final differentiated cell, which is in a sense more “stable”, having reached its final intended “form”.
    • Those “stable” differentiated cells, however, are still in a continuous flow of informational change, which is still driven by continuous modifications in the transcriptome/proteome and in chromatin configurations. Even if these changes are less dramatic, and do not change the basic identity of the differentiated cell, they are still necessary to allow adaptation to different contexts, for example varying messages from nearby cells or from the rest of the body, whether hormonal, neurologic, or other, or other stimuli from the environment (for example, metabolic conditions, stress, and so on), or even simply the adherence to circadian (or other) rhythms. IOWs, “stable” cells are not stable at all: they change continuously, while retaining their basic cell identity, and those changes are, again, driven by continuous modifications in the transcriptome/proteome and in the chromatin configurations of the cell.
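The two-way interaction described in the points above can be sketched as a toy Boolean model. Everything in it is hypothetical and invented only for illustration: the two "genes" (pioneer and effector), the two remodeling rules, and the idea that a locus is simply open or closed. It is not a model of any real cell, just a minimal way to see how the set of gene products and the chromatin state can drive each other until a more "stable" state is reached.

```python
def transcribe(accessible):
    """Products present at this step: one per gene whose locus is open."""
    return frozenset(gene for gene, is_open in accessible.items() if is_open)


def remodel(accessible, products):
    """Return the next chromatin state, given the products now present.

    Hypothetical rules, chosen only for illustration:
      - the 'pioneer' product opens the 'effector' locus;
      - the 'effector' product closes the 'pioneer' locus.
    """
    updated = dict(accessible)
    if "pioneer" in products:
        updated["effector"] = True
    if "effector" in products:
        updated["pioneer"] = False
    return updated


def simulate(accessible, steps):
    """Alternate transcription and chromatin remodeling for a number of steps."""
    history = []
    for _ in range(steps):
        products = transcribe(accessible)
        history.append(products)
        accessible = remodel(accessible, products)
    return history


# Start from a state where only the hypothetical 'pioneer' locus is open.
history = simulate({"pioneer": True, "effector": False}, steps=4)
for t, products in enumerate(history):
    print(t, sorted(products))
```

In this toy run the set of products changes for the first steps and then stops changing, which loosely mirrors the idea of a developmental pathway ending in a more stable differentiated state. A real cell, as said, never stops changing.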

Now, let’s have a look at the main components that make the whole process possible. I will mention only briefly the things that have been known for a long time, and will give more attention to the components for which there is some recent deeper understanding available.

We start with those components that are part of the DNA sequence itself, IOWs the genes themselves and those regions of DNA which are involved in their transcription regulation (cis-regulatory elements).

 

Cis elements

 

Genes and promoters.

Of course, genes are the oldest characters in this play. We have the 20000 protein coding genes in the human genome, which represent about 1.5% of the whole genomic sequence of 3 billion base pairs. But we must certainly add the genes that code for non coding RNAs: at present, about 15000 genes for long non coding RNAs, and about 5000 genes for small non coding RNAs, and about 15000 pseudogenes. So, the concept of gene is now very different from what it was in the past, and it includes many DNA sequences that have nothing to do with protein coding. Moreover, it is interesting to observe that many non protein coding genes, in particular those that code for lncRNAs, have a complex exon-intron structure, like protein coding genes, and undergo splicing, and even alternative splicing. For a good recent review about lncRNAs, see here:

The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression
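To put the approximate numbers quoted above in proportion, here is a small calculation. It uses only the rounded figures given in the text; the variable names are mine, and grouping the four categories under one total is just a convenience for the arithmetic.

```python
GENOME_BP = 3_000_000_000   # approximate size of the human genome, from the text
CODING_FRACTION = 0.015     # about 1.5% of the sequence is protein coding

# Approximate counts quoted in the text (rounded figures, not exact annotations).
locus_counts = {
    "protein coding genes": 20_000,
    "long non coding RNA genes": 15_000,
    "small non coding RNA genes": 5_000,
    "pseudogenes": 15_000,
}

coding_bp = round(GENOME_BP * CODING_FRACTION)
total_loci = sum(locus_counts.values())

print(f"Protein coding sequence: about {coding_bp:,} bp")
print(f"Total loci in the four categories: {total_loci:,}")
for name, count in locus_counts.items():
    print(f"  {name}: {count / total_loci:.1%} of the total")
```

With these figures, protein coding sequence amounts to about 45 million base pairs, and protein coding genes are well under half of the loci listed.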

Let’s go to promoters. This is the simple definition (from Wikipedia):

In genetics, a promoter is a region of DNA that initiates transcription of a particular gene. Promoters are located near the transcription start sites of genes, on the same strand and upstream on the DNA (towards the 5′ region of the sense strand). Promoters can be about 100–1000 base pairs long.

A promoter includes:

  • The transcription start site (TSS), IOWs the point where transcription starts
  • A binding site for RNA polymerase
  • Binding sites for general transcription factors, such as the TATA box and the BRE in eukaryotes
  • Other parts that can interact with different regulatory elements.

Promoters have been classified as ‘focused’ or ‘sharp’ promoters (those that have a single, well-defined TSS), and ‘dispersed’ or ‘broad’ promoters (those that have multiple closely spaced TSSs that are used with similar frequency).

For a recent review of promoters and their features, see here:

Eukaryotic core promoters and the functional basis of transcription initiation

 

Enhancers

Enhancers are a fascinating, and still poorly understood, issue. Again, here is the definition from Wikipedia:

In genetics, an enhancer is a short (50–1500 bp) region of DNA that can be bound by proteins (activators) to increase the likelihood that transcription of a particular gene will occur. These proteins are usually referred to as transcription factors. Enhancers are cis-acting. They can be located up to 1 Mbp (1,000,000 bp) away from the gene, upstream or downstream from the start site. There are hundreds of thousands of enhancers in the human genome. They are found in both prokaryotes and eukaryotes.

Enhancers are elusive things. The following paper:

Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells

reports a total of 201,802 identified promoters and 65,423 identified enhancers in humans, and similar numbers in mouse (this in 2015). But there are probably many more than that number.

Working with specific TFs, enhancers are mainly responsible for the formation of dynamic chromatin loops, as we will see later.

Here is a recent paper about human enhancers in different tissues:

Genome-wide Identification and Characterization of Enhancers Across 10 Human Tissues.

 

Abstract:

Background: Enhancers can act as cis-regulatory elements (CREs) to control development and cellular function by regulating gene expression in a tissue-specific and ubiquitous manner. However, the regulatory network and characteristic of different types of enhancers (e.g., transcribed/non-transcribed enhancers, tissue-specific/ubiquitous enhancers) across multiple tissues are still unclear. Results: Here, a total of 53,924 active enhancers and 10,307 enhancer-associated RNAs (eRNAs) in 10 tissues (adrenal, brain, breast, heart, liver, lung, ovary, placenta, skeletal muscle and kidney) were identified through the integration of histone modifications (H3K4me1, H3K27ac and H3K4me3) and DNase I hypersensitive sites (DHSs) data. Moreover, 40,101 tissue-specific enhancers (TS-Enh), 1,241 ubiquitously expressed enhancers (UE-Enh) as well as transcribed enhancers (T-Enh), including 7,727 unidirectionally transcribed enhancers (1D-Enh) and 1,215 bidirectionally transcribed enhancers (2D-Enh) were defined in 10 tissues. The results show that enhancers exhibited high GC content, genomic variants and transcription factor binding sites (TFBS) enrichment in all tissues. These characteristics were significantly different between TS-Enh and UE-Enh, T-Enh and NT-Enh, 2D-Enh and 1D-Enh. Furthermore, the results showed that enhancers obviously upregulate the expression of adjacent target genes which were remarkably correlated with the functions of corresponding tissues. Finally, a free user-friendly tissue-specific enhancer database, TiED (http://lcbb.swjtu.edu.cn/TiED), has been built to store, visualize, and confer these results. Conclusion: Genome-wide analysis of the regulatory network and characteristic of various types of enhancers showed that enhancers associated with TFs, eRNAs and target genes appeared in tissue specificity and function across different tissues.

Promoter and enhancer associated RNAs

A very interesting point which has been recently clarified is that both promoters and enhancers, when active, are transcribed. IOWs, beyond their classical action as cis regulatory elements (DNA sequences that bind trans factors), they also generate specific non coding RNAs. They are called respectively Promoter-associated RNAs (PARs) and Enhancer RNAs (eRNAs). They can be short or long, and both types seem to be functional in transcription regulation.

Here is a recent paper that reviews what is known of PARs, and their “cousins” terminus-associated RNAs (TARs):

Classification of Transcription Boundary-Associated RNAs (TBARs) in Animals and Plants

Here, instead, is a recent review about eRNAs:

Enhancer RNAs (eRNAs): New Insights into Gene Transcription and Disease Treatment

Abstract:

Enhancers are cis-acting elements that have the ability to increase the expression of target genes. Recent studies have shown that enhancers can act as transcriptional units for the production of enhancer RNAs (eRNAs), which are hallmarks of activity enhancers and are involved in the regulation of gene transcription. The in-depth study of eRNAs is of great significance for us to better understand enhancer function and transcriptional regulation in various diseases. Therefore, eRNAs may be a potential therapeutic target for diseases. Here, we review the current knowledge of the characteristics of eRNAs, the molecular mechanisms of eRNAs action, as well as diseases related to dysregulation of eRNAs.

 

So, this is a brief description of the essential cis regulatory elements. Let’s go now to trans regulatory elements, IOWs those molecules that are not part of the DNA sequence, but work on it to regulate gene transcription.

 

Trans elements

The first group of trans acting tools includes those molecules that are the same for all transcriptions. They are “general” transcription tools.

I will start with a brief mention of RNA polymerase, which is not a regulatory element, but rather the true effector of transcription:

 

DNA-directed RNA polymerases

This is a family of enzymes found in all living organisms. They open the double-stranded DNA and implement the transcription, synthesizing RNA from the DNA template.

I don’t want to deal in detail with this complex subject: suffice it to say, for the moment, that RNA polymerases are very big and very complex proteins, with some basic information shared from prokaryotes to multicellular organisms. In humans, RNA polymerase II is the one responsible for the transcription of protein coding mRNAs, and of some non coding RNAs, including many lncRNAs. Just as an example, human RNA Pol II is a multiprotein complex of 12 subunits, for a sum total of more than 4500 AAs.

 

Now, let’s go to the general regulatory elements:

General TFs

General TFs are transcription factors that bind to the promoter to allow the start of transcription. They are called “general” because they are common to all transcriptions, while specific TFs act on specific genes.

In bacteria there is one general TF, the sigma factor, with different variants.

In archaea and in eukaryotes there are a few. In eukaryotes, there are six. The first that binds to the promoter is TFIID, a multiprotein factor which includes as its core the TBP (TATA binding protein, 339 AAs in humans), plus 14 additional subunits (TAFs), the biggest of which, TAF1, is 1872 AAs long in humans. Four more general TFs then bind the promoter sequentially. The sixth, TFIIA, is not required for basal transcription, but can stabilize the complex.

So, the initiation complex, bound to the promoter, is essentially made of RNA Pol II (or other) + the general TFs.

The following is a good review of the assembly of the initiation complex at the promoter:

Structural basis of transcription initiation by RNA polymerase II (paywall)

I quote here the conclusions of the paper:

Conclusions and perspectives
The initiation of transcription at Pol II promoters is a very complex process in which dozens of polypeptides cooperate to recognize and open promoter DNA, locate the TSS and initiate pre-mRNA synthesis. Because of its large size and transient nature, the study of the Pol II initiation complex will continue to be a challenge for structural biologists. The first decade of work, which started in the 1990s, provided structures for many of the factors involved and several of their DNA complexes. The second decade of research provided structural information on Pol II complexes and led to models for how general transcription factors function. Over the next decade, we hope that a combination of structural biology methods will resolve many remaining questions on transcription initiation, and elucidate the mechanism of promoter opening and initial RNA synthesis, the remodelling of the transient protein–DNA interactions occurring at various stages of initiation, and the conformational changes underlying the allosteric activation of initiation and the transition from initiation to elongation. Important next steps include more detailed structural characterizations of TFIIH and the 25-subunit coactivator complex Mediator, not only in their free forms but also as parts of initiation complexes.

For the Mediator, see next section.

And here is another good paper about that:

Zooming in on Transcription Preinitiation

which has a very good Figure summarizing it:

Fig. 3: From Kapil Gupta, Duygu Sari-Ak, Matthias Haffke, Simon Trowitzsch, Imre Berger: Zooming in on Transcription Preinitiation, https://doi.org/10.1016/j.jmb.2016.04.003  Creative Commons license

Transcription PIC. Class II gene transcription is brought about by (in humans) over a hundred polypeptides assembling on the core promoter of protein-encoding genes, which then give rise to messenger RNA. A PIC on a core promoter is shown in a schematic representation (adapted from Ref. [5]). PIC contains, in addition to promoter DNA, the GTFs TFIIA, B, D, E, F, and H, and RNA Pol II. PIC assembly is thought to occur in a highly regulated, stepwise fashion (top). TFIID is among the first GTFs to bind the core promoter via its TBP subunit. Nucleosomes at transcription start sites contribute to PIC assembly, mediated by signaling through epigenetic marks on histone tails. The Mediator (not shown) is a further central multiprotein complex identified as a global transcriptional regulator. TATA, TATA-box DNA; BREu, B recognition element upstream; BREd, B recognition element downstream; Inr, Initiator; DPE, Down-stream promoter element.

 

The Mediator complex

The Mediator complex is the third “general” component of transcription initiation, together with RNA Pol II and the general TFs. However, it is a really amazing structure for many specific reasons.

  • First of all, it is really, really complex. It is a multiprotein structure which, in metazoans, is composed of about 25 different subunits, while it is slightly “simpler” in yeast (up to 21 subunits). Here is a very simplified scheme of the structure:

 

Fig. 4: Diagram of mediator with cyclin-dependent kinase module. By original figure: Tóth-Petróczy Á, Oldfield CJ, Simon I, Takagi Y, Dunker AK, Uversky VN, et al.; editing: Dennis Pietras, Buffalo, NY, USA [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons https://commons.wikimedia.org/wiki/File:Mediator4TC.jpg

 

  • Second, and most important, is the fact that, while it is certainly a “general” factor, because it is involved in the transcription of almost all genes, its functions are still poorly understood, and it is very likely that it works as an “integration hub” which transmits and modulates many gene-specific signals (for example, those from specific TFs) to the initiation complex. In that sense, the name “mediator” could not be more appropriate: a structure which mediates between the general complex transcription mechanism and the even more complex regulatory signals coming from the enhancer-specific TFs network, and probably from other sources.
  • Third, this seems to be an essentially eukaryotic structure, while RNA Pol II, TFs, promoters and enhancers, while reaching their full complexity only in eukaryotes, are in part based on functions already present in prokaryotes. The proteins that make the Mediator structure seem to be absent in prokaryotes (as far as I can tell; I have checked only a few of them). Moreover, many of them show a definite information jump in vertebrates, as we have seen in important regulatory proteins.

Fig. 5 shows, for example, the evolutionary history of 4 of the biggest proteins in the Mediator complex, in terms, as usual, of human conserved information. The big information jump in vertebrates is evident in all of them.

Fig. 5

Here is a paper (2010) about Mediator and its functions:

The metazoan Mediator co-activator complex as an integrative hub for transcriptional regulation

Abstract:

The Mediator is an evolutionarily conserved, multiprotein complex that is a key regulator of protein-coding genes. In metazoan cells, multiple pathways that are responsible for homeostasis, cell growth and differentiation converge on the Mediator through transcriptional activators and repressors that target one or more of the almost 30 subunits of this complex. Besides interacting directly with RNA polymerase II, Mediator has multiple functions and can interact with and coordinate the action of numerous other co-activators and co-repressors, including those acting at the level of chromatin. These interactions ultimately allow the Mediator to deliver outputs that range from maximal activation of genes to modulation of basal transcription to long-term epigenetic silencing.

Fig. 2 in the paper gives a more detailed idea of the general structure of the complex, with its typical sections: head, middle, tail, and accessories.

This more recent paper (2015) is a good review of what is known about the Mediator complex, and strongly details the evidence in favor of its key role in integrating regulation signals (especially from enhancers and specific TFs) and delivering those signals to the initiation complex.

The Mediator complex: a central integrator of transcription

In Box 3 of that paper you can find a good illustration of the pre-initiation complex, including Mediator. Fig. 3 is a simple summary of the main actors in transcription, and it also introduces the looping created by the interaction between enhancers/specific TFs on one side, and promoter/initiation complex on the other, that we are going to discuss next. It also introduces another important actor, cohesin, which will also be discussed.

Finally, this very recent paper (2018) is an example of the functional relevance of Mediator, as shown by its involvement in human neurologic diseases:

The power of the Mediator complex-Expanding the genetic architecture and phenotypic spectrum of MED12-related disorders.

Abstract:

MED12 is a member of the large Mediator complex that controls cell growth, development, and differentiation. Mutations in MED12 disrupt neuronal gene expression and lead to at least three distinct X-linked intellectual disability syndromes (FG, Lujan-Fryns, and Ohdo). Here, we describe six families with missense variants in MED12 (p.(Arg815Gln), p.(Val954Gly), p.(Glu1091Lys), p.(Arg1295Cys), p.(Pro1371Ser), and p.(Arg1148His), the latter being first reported in affected females) associated with a continuum of symptoms rather than distinct syndromes. The variants expanded the genetic architecture and phenotypic spectrum of MED12-related disorders. New clinical symptoms included brachycephaly, anteverted nares, bulbous nasal tip, prognathism, deep set eyes, and single palmar crease. We showed that MED12 variants, initially implicated in X-linked recessive disorders in males, may predict a potential risk for phenotypic expression in females, with no correlation of the X chromosome inactivation pattern in blood cells. Molecular modeling (Yasara Structure) performed to model the functional effects of the variants strongly supported the pathogenic character of the variants examined. We showed that molecular modeling is a useful method for in silico testing of the potential functional effects of MED12 variants and thus can be a valuable addition to the interpretation of the clinical and genetic findings.

By the way, Med12 is one of the 4 proteins shown in Fig. 5 in this OP: it is 2177 AAs long, and exhibits a huge information jump in vertebrates.

 

Specific TFs

OK, let’s abandon, for the moment, the promoter and its initiation complex, and consider what happens at the distant enhancer site. Here, in some apparently unrelated place in the genome, which can be even 1 Mbp away, sometimes even on other chromosomes, the enhancer/specific TFs interaction takes place.

Now, we have already seen the general TFs that work at the promoter site. However fascinating, they are 6 in total (in metazoa).

But what about specific TFs?

Specific TFs are the molecules that are the true center of transcription regulation: they are the main regulators, even if of course they act together with all the other things we have described and are going to describe.

Here is a very recent review (2018):

The Human Transcription Factors

Abstract:

Transcription factors (TFs) recognize specific DNA sequences to control chromatin and transcription, forming a complex system that guides expression of the genome. Despite keen interest in understanding how TFs control gene expression, it remains challenging to determine how the precise genomic binding sites of TFs are specified and how TF binding ultimately relates to regulation of transcription. This review considers how TFs are identified and functionally characterized, principally through the lens of a catalog of over 1,600 likely human TFs and binding motifs for two-thirds of them. Major classes of human TFs differ markedly in their evolutionary trajectories and expression patterns, underscoring distinct functions. TFs likewise underlie many different aspects of human physiology, disease, and variation, highlighting the importance of continued effort to understand TF-mediated gene regulation.

The paper is paywalled, but for those who can access it, I really recommend reading it.

TFs are a very deep subject, so I will just list a few points about them that seem particularly relevant here:

  • TFs are medium sized molecules. Median length in humans, for a set of 1613 TFs derived from the paper quoted above, is 501 AAs, and 50% of those TFs are in the 365-665 AAs range.
  • They are highly modular objects. In essence, almost all TFs are made of at least two components:
    • A highly conserved, well recognizable domain, called the DNA binding domain (DBD), which interacts with specific, short DNA motifs (usually 6-12 nucleotides).
      1. DBDs can be rather easily recognized and classified in families. There are about 100 known eukaryotic DBD types. Almost all known TFs contain at least one DBD, sometimes more than one. The most represented DBD families in humans are C2H2 zinc finger (more than 700), homeodomain (almost 200) and bHLH (more than 100). DBDs are often rather short AA sequences: zinc fingers, for example, are about 23 AAs long (but they are usually present in multiple copies in the TF), while bHLH is about 50 AAs long, and homeodomains are about 60 AAs long. As said, they are usually old and very conserved sequences.
      2. DNA motifs are short nucleotide sequences (6-12 nucleotides), spread all over the genome. In total, over 500 motif specificity groups are present in humans. However, motifs are not at all specific or sufficient in determining TF binding, and many other factors must cooperate to achieve and regulate the actual binding of a TF to a DNA motif.
    • At least one other sequence, which is usually longer and does not contain recognizable domains. These sequences are often highly disordered, are less conserved, and may have important regulatory functions. In some cases, other specific domains are present: for example, in the family of nuclear receptors, the TF shows, together with the DBD and the non domain sequence, a ligand domain which interacts with the hormone/molecule that conveys the signal.
  • There are a lot of them. The above quoted paper, probably the most recent about the issue, gives a total of 1639 proteins that are known or likely TFs in humans, but the list is almost certainly not complete. It is very likely that there are about 2000 TFs in humans, which is about 10% of protein coding genes. Of course, all these are specific TFs (except for the 6 general TFs mentioned earlier). So, this is probably the biggest regulatory network in the cell.
  • The way they work is still poorly understood, except of course for the DNA binding. It is rather certain that they usually work in groups, combinatorially, and by recruiting also other (non TF) proteins or molecules. The above mentioned paper lists many possible mechanisms of action and regulation for TF activity:
    1. Cooperative binding: TFs often aid each other in binding to DNA: that can imply also forming homodimers or higher order structures.
    2. Interaction and competition with nucleosomes, in some cases by recruiting ATP-dependent chromatin remodelers and other TFs
    3. Recruiting of cofactors (‘‘coactivators’’ and ‘‘corepressors’’) which are frequently large multi-subunit protein complexes or multi-domain proteins that regulate transcription via several mechanisms. The ligand-binding domains of the nuclear hormone receptor subclass of TFs, already mentioned above, are a special case of that.
    4. Exploiting unstructured regions and/or DBDs to interact with cofactors
    5. It is also wrong to classify individual TFs as “activators” or “repressors”. I quote from the paper: Because effects on transcription are so frequently context dependent, more precise terminology may be warranted, in general; for example, reflecting the biochemical activities of TFs and their cofactors. On a global level, however, there is no comprehensive catalog of cofactors recruited by TFs. Moreover, the biochemical functions required for gene activation or communication between enhancers and promoters remain largely unknown.
  • As a class, their evolutionary history in terms of human conserved information is well comparable to the mean pattern of the whole human genome. In particular, they do not exhibit, as a class, any special information jump in vertebrates (mean = 0.293 baa in TFs vs 0.288 baa in the whole human proteome).
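The point made in the list above, that short DNA motifs alone are not specific enough to determine TF binding, can be illustrated with a back-of-the-envelope estimate. The sketch below assumes a purely random genome with equal and independent base frequencies, exact matching of a single fixed sequence, and a genome of 3 billion base pairs. None of these assumptions is realistic (real motifs are degenerate and real genomes are not random), and the function name is mine, so take it only as an order-of-magnitude argument.

```python
GENOME_BP = 3_000_000_000  # approximate human genome size, as used in the text


def expected_random_matches(motif_length, genome_bp=GENOME_BP, both_strands=True):
    """Expected number of exact matches of one fixed motif in a random genome.

    Assumptions (simplifications, not biology): each base is A, C, G or T
    with probability 1/4, independently of its neighbours.
    """
    probability_per_position = 0.25 ** motif_length
    positions = genome_bp - motif_length + 1
    strands = 2 if both_strands else 1
    return positions * probability_per_position * strands


# The range of motif lengths quoted in the text is 6-12 nucleotides.
for length in (6, 8, 10, 12):
    print(f"{length} nt motif: about {expected_random_matches(length):,.0f} chance matches")
```

Under these assumptions a 6 nucleotide motif is expected to occur by chance more than a million times, and even a 12 nucleotide motif a few hundred times. That is consistent with the idea that many other factors must cooperate to decide where a TF actually binds.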

Fig. 6 shows the mean evolutionary history of 1613 human TFs, in terms of baa of human conserved information, as compared to the mean values for the whole human proteome:

 

Fig. 6

 

So, in brief: one or more specific TFs bind some specific enhancer in some part of the genome, and the specific big structure at the enhancer (enhancer + specific TFs + cofactors) in some way binds the general big structure at the promoter (promoter + RNA Pol II + general TFs + Mediator), and, probably acting on the Mediator complex, regulates the activity of the RNA polymerase and therefore the rate of transcription.

The interaction between a distant enhancer and the promoter has one important and immediate consequence: the chromatin fiber bends, and forms a specific loop (Fig. 7):

 

Fig. 7: Diagram of gene transcription factors

By Kelvin13 [CC BY 3.0 (https://creativecommons.org/licenses/by/3.0)], from Wikimedia Commons    https://commons.wikimedia.org/wiki/File:Transcription_Factors.svg

And, just as a final bonus about trans regulation of transcription, guess what is involved too? Of course, long non coding RNAs! See here:

Noncoding RNAs: Regulators of the Mammalian Transcription Machinery

Abstract
Transcription by RNA polymerase II (Pol II) is required to produce mRNAs and some noncoding RNAs (ncRNAs) within mammalian cells. This coordinated process is precisely regulated by multiple factors, including many recently discovered ncRNAs. In this perspective, we will discuss newly identified ncRNAs that facilitate DNA looping, regulate transcription factor binding, mediate promoter-proximal pausing of Pol II, and/or interact with Pol II to modulate transcription. Moreover, we will discuss new roles for ncRNAs, as well as a novel Pol II RNA-dependent RNA polymerase activity that regulates an ncRNA inhibitor of transcription. As the multifaceted nature of ncRNAs continues to be revealed, we believe that many more ncRNA species and functions will be discovered.

 

Finally, we have to consider the role of chromatin states.

Chromatin states and epigenetics

Chromatin accessibility

For all those things to happen, one condition must be satisfied: the DNA sequences involved, IOWs the gene, promoter and specific enhancers, must be reasonably accessible.

The point is that chromatin in interphase is in different states, different 3D configurations and different spatial distributions in the nucleus, especially in relation to the nuclear lamina. In general, heterochromatin is the condensed, functionally inactive form, and is mainly associated with the nuclear lamina (the periphery), while euchromatin, the lightly packed and transcriptionally active form, with its transcriptional loops, is found more in the center of the nucleus.

However, things are not so simple: chromatin states are not a binary condition (heterochromatin/euchromatin), and they are extremely dynamic: the general map of chromatin states is different from cell to cell, and in the same cell from time to time.

One way to measure chromatin accessibility (IOWs, to map what parts of the genome are accessible to transcription in a cell at a certain time) is to use a test that directly binds or marks in some way the accessible regions. There are many such tests, and the most commonly used are DNase-seq (DNase I cuts only at the level of accessible chromatin) and ATAC-seq (insertions by the Tn5 transposase are restricted to accessible chromatin). ATAC-seq has also been applied at the single cell level, and the results are described in this wonderful paper:

A Single-Cell Atlas of In Vivo Mammalian Chromatin Accessibility

Summary:
We applied a combinatorial indexing assay, sci-ATAC-seq, to profile genome-wide chromatin accessibility in ∼100,000 single cells from 13 adult mouse tissues. We identify 85 distinct patterns of chromatin accessibility, most of which can be assigned to cell types, and ∼400,000 differentially accessible elements. We use these data to link regulatory elements to their target genes, to define the transcription factor grammar specifying each cell type, and to discover in vivo correlates of heterogeneity in accessibility within cell types. We develop a technique for mapping single cell gene expression data to single-cell chromatin accessibility data, facilitating the comparison of atlases. By intersecting mouse chromatin accessibility with human genome-wide association summary statistics, we identify cell-type-specific enrichments of the heritability signal for hundreds of complex traits. These data define the in vivo landscape of the regulatory genome for common mammalian cell types at single-cell resolution.
That shows how cell states and cell types can be well differentiated by mapping chromatin accessibility.

Epigenetic states

But what makes different parts of chromatin more or less accessible?
The answer is: epigenetic regulations.
I will discuss very briefly DNA methylation, which takes place at cytosine or adenine, but mainly at cytosine when it is followed by a guanine (the so-called CpG dinucleotide). The subject is very complex, and I will not go into details. Suffice it to say that methylation at CpGs usually has a repressive effect on DNA (IOWs, methylated DNA is not active). The following figure shows how unmethylated CpG islands are usually found at promoters or other active regions, while methylated CpGs correspond to inactive segments, for example inactivated transposable elements.
Fig. 8  DNA methylation landscape   By Mariuswalter [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], from Wikimedia Commons https://commons.wikimedia.org/wiki/File:DNAme_landscape.png
Let’s go now to post-translational modifications (PTMs) of histones. This is certainly the most relevant epigenetic level of transcription regulation.
In brief, histones have “tails” that can be modified by attaching various kinds of groups to them. So, each of the four histone types in the nucleosome (usually H2A, H2B, H3 and H4) can be methylated or polymethylated, acetylated, phosphorylated, ubiquitinated, sumoylated, biotinylated and many other things, at different amino acid sites, usually lysines or arginines. The combinatorial result is that more than 150 different histone PTMs have been described.
However, methylations and acetylations are the best studied modifications, and histone H3 is the histone most involved in current studies.
The term “histone code” refers in a general way to the sum total of these different modifications and of their effects on chromatin state and transcription. However, many aspects of these complex processes are still poorly understood.
In general, the best known PTMs are classified as having an effect of transcription activation or repression. Acetylations are mostly activating, methylation can be either activating or repressing. A good table of the main histone PTMs can be found here:
Now, some of these modifications have been mapped genome-wide in different types of cells. Their combinatorial aggregation can rather well predict different functional states of different parts of the genome and of chromatin.
For example, the following paper:

uses 9 different marks (5 methylations and 2 acetylations of histone H3, 1 methylation of histone H4, and the mapping of CTCF, see later) to map 30 different states in the genome of 3 different types of human cells. For example, you can see in Figure 2a the 30 states (N1-30) and the 14 known transcriptional states that they are related to (the color code on the left). So, for example, state N8, which corresponds to the brown color code of “poised enhancer“, is marked by high levels of H3K27me3 (IOWs trimethylation of lysine 27 on histone H3) and low levels of H4K20me1 (IOWs monomethylation of lysine 20 on histone H4). The first modification has a meaning of transcriptional repression, while the second is a marker of transcriptional activation. IOWs, these nucleosomes are pre-activated, but “poised”. A similar situation can be observed in state N7, corresponding to “bivalent promoter“, where the repressive mark H3K27me3 is associated with mono-, di- and trimethylation of lysine 4, always on histone H3, which are activating signals. These bivalent conditions, both for promoters and enhancers, are usually found in stem cells, where many genes are in a “pre-activated state”, momentarily blocked by the repressive signal, but ready to be activated for differentiation.
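
For readers who like to see the notation spelled out, here is a minimal Python sketch of how a mark name such as H3K27me3 can be read (histone, residue, position, modification, number of groups), and of how a "bivalent" combination can be recognized. It is only an illustration: the function names (parse_mark, is_bivalent), the regular expression and the small effect table are my assumptions, and the table contains only the effects mentioned in the text above, not a complete catalogue.

```python
import re

# Hypothetical pattern for names like H3K27me3 or H2BK120ub1:
# histone, residue letter (K = lysine, R = arginine), position,
# modification type, optional number of groups.
MARK_RE = re.compile(r"^(H2A|H2B|H3|H4)([KR])(\d+)(me|ac|ub|ph)(\d?)$")

# Effects as stated in the text above; deliberately incomplete.
EFFECT = {
    "H3K27me3": "repressing",
    "H4K20me1": "activating",
    "H3K4me1": "activating",
    "H3K4me2": "activating",
    "H3K4me3": "activating",
}


def parse_mark(mark):
    """Split a histone mark name into its components."""
    m = MARK_RE.match(mark)
    if m is None:
        raise ValueError(f"unrecognized histone mark: {mark!r}")
    histone, residue, position, modification, count = m.groups()
    return {
        "histone": histone,
        "residue": residue,
        "position": int(position),
        "modification": modification,
        "count": int(count) if count else None,
    }


def is_bivalent(marks):
    """True if the set of marks carries both activating and repressing signals."""
    effects = {EFFECT[m] for m in marks if m in EFFECT}
    return {"activating", "repressing"} <= effects


if __name__ == "__main__":
    print(parse_mark("H3K27me3"))
    print(is_bivalent(["H3K27me3", "H3K4me3"]))  # bivalent promoter pattern
    print(is_bivalent(["H3K4me3"]))              # activating marks only
```

Real chromatin state models work on genome-wide signal levels, not on simple presence or absence of a mark, so this is only a way to make the naming convention and the idea of "bivalence" concrete.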

This is just to give an idea. So, this kind of analysis can well predict some of the results of the already mentioned chromatin accessibility tests, and is also well related to the investigation of chromatin 3D configurations, which we will discuss in the next section.

The following video is a good and simple review of the main aspects of the histone code.

 

 

 

But how are these histone modifications achieved?

Again, each of them is the result of very complex pathways, many of them still poorly understood.

For example, H3K4me3, one of the main activating marks, is achieved by a very complex multi-protein complex, involving at least 10 different proteins, some of them really big (for example, MLL2, 5537 AAs long). Moreover, the different pathways that implement different marks obviously exhibit complex crosstalks, creating intricate networks. Moreover, those pathways are not only writers of histone marks, but also readers of them: indeed, the modifications effected are always determined by the reading of already existing modifications. And, of course, there are also eraser proteins.

All these concepts are dealt with in some detail in the following paper:

The interplay of histone modifications – writers that read

A final and important question is: how do histone modifications implement their effects, IOWs the chromatin modifications that imply activation or repression of the genes? Unfortunately, this is not well understood. But:

  • For some modifications, especially acetylation, part of the effect can probably be ascribed to the direct biochemical effect of the modification on the histone itself
  • Most effects, however, are probably implemented through the recruitment by the histone modification, often in a combinatorial manner, of other “reader” proteins, which are responsible, directly or indirectly, for the activation or repression effect

The second modality is the foundation for the concept of histone code: in that sense, histone marks work as signals of a symbolic code, whose effects in most cases are mediated by complex networks of proteins which can write, read or erase the signals.

 

3D configuration of Chromatin

As said, one of the final effects of epigenetic markers, either DNA methylation or histone modifications, is the change in 3D configuration of chromatin, which in turn is related to chromatin accessibility and therefore to transcription regulation.

This is, again, a very deep and complex issue. There are specific techniques to study chromatin configuration in space, which are independent from the mapping of chromatin accessibility and of epigenetic markers that we have already discussed. The most used are chromosome conformation capture (3C) and genome-wide 3C (Hi-C). Essentially, these techniques are based on specific procedures of fixation and digestion of chromatin that preserve chromatin loops and make it possible to analyze them, and therefore the associations between distant genomic sites (IOWs, enhancer-promoter associations), in specific cells and in specific cell states.

Again to make it brief, chromatin topology depends essentially on at least two big factors:

  • The generation of specific loops throughout the genome because of enhancer-promoter associations
  • The interactions of chromatin with the nuclear lamina

As a result of those, and other, factors, chromatin generates different levels of topological organization, which can be described, in a very gross simplification, as follows, going from simpler to more complex structures:

  1. Local loops
  2. Topologically associating domains (TADs): These are bigger regions that delimit and isolate sets of specific interaction loops. They can correspond to the idea of isolated “transcription factories”. TADs are separated, at genomic level, by specific insulators (see later)
  3. Lamina associated domains (LADs) and Nucleolus associated domains (NADs): these correspond usually to mainly inactive chromatin regions
  4. Chromosomal territories, which are regions of the nucleus preferentially occupied by particular chromosomes
  5. A and B nuclear compartments: at a higher level, chromatin in the nucleus seems to be divided into two gross compartments: the A compartment is mainly formed by active chromatin, the B compartment by repressed chromatin

Figure 9 shows a simple representation of some of these concepts.

 

Fig. 9  A graphical representation of an insulated neighborhood with one active enhancer and gene with corresponding enhancer-gene loop and CTCF/cohesin anchor loop. By Angg!ng [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], from Wikimedia Commons   https://commons.wikimedia.org/wiki/File:InsulatedNeighborhood.svg

The concept of TAD is particularly interesting, because TADs are insulated units of transcription: many different enhancer-promoter interactions (and therefore loops) can take place inside a TAD, but not usually between one TAD and another one. This happens because TADs are separated by strong insulators.
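
The idea of insulation can be made concrete with a toy Python sketch: given a sorted list of TAD boundary coordinates on one chromosome, an enhancer-promoter pair is expected to interact only if no boundary lies between the two. The function names (tad_index, same_tad) and all the coordinates are hypothetical, chosen only for illustration. Real TAD calling from Hi-C data is far more involved.

```python
from bisect import bisect_right


def tad_index(boundaries, position):
    """Index of the TAD containing `position`, given sorted boundary coordinates."""
    return bisect_right(boundaries, position)


def same_tad(boundaries, enhancer, promoter):
    """True if no TAD boundary separates the enhancer from the promoter."""
    return tad_index(boundaries, enhancer) == tad_index(boundaries, promoter)


if __name__ == "__main__":
    # Hypothetical boundary positions (bp) on a single chromosome.
    boundaries = [1_000_000, 2_500_000]
    # Both elements between the two boundaries: looping is allowed.
    print(same_tad(boundaries, 1_200_000, 2_100_000))
    # A boundary at 1,000,000 lies between them: usually insulated.
    print(same_tad(boundaries, 900_000, 1_200_000))
```

The binary search keeps the lookup fast even with thousands of boundaries, but of course the "not usually" in the text means that in real data this rule is a tendency, not an absolute.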

A very good summary about TADs can be found in the following paper:

Minor Loops in Major Folds: Enhancer–Promoter Looping, Chromatin Restructuring, and Their Association with Transcriptional Regulation and Disease

This is taken from Fig. 1 in that paper, and gives a good idea of what TADs are:

 

 

 

Fig. 10: Structural organization of chromatin
(A) Chromosomes within an interphase diploid eukaryotic nucleus are found to occupy specific nuclear spaces, termed chromosomal territories.
(B) Each chromosome is subdivided into topological associated domains (TAD) as found in Hi-C studies. TADs with repressed transcriptional activity tend to be associated with the nuclear lamina (dashed inner nuclear membrane and its associated structures), while active TADs tend to reside more in the nuclear interior. Each TAD is flanked by regions having low interaction frequencies, as determined by Hi-C, that are called TAD boundaries (purple hexagon).
(C) An example of an active TAD with several interactions between distal regulatory elements and genes within it.

Source: Matharu, Navneet (2015-12-03). “Minor Loops in Major Folds: Enhancer–Promoter Looping, Chromatin Restructuring, and Their Association with Transcriptional Regulation and Disease”. PLOS Genetics 11 (12): e1005640. DOI: 10.1371/journal.pgen.1005640. PMID 26632825. PMC 4669122. ISSN 1553-7404.

Author: Navneet Matharu, Nadav Ahituv

By Navneet Matharu, Nadav Ahituv [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

There is, of course, a good correlation between the three types of analysis and genomic mapping that we have described:

  • Chromatin accessibility mapping
  • Epigenetic marks
  • Chromatin topology studies

However, these three approaches are different, even if strongly related. They are not measuring the same thing, but different things that contribute to the same final scenario.

 

CTCF and Cohesin

But what are these insulators, the boundaries that separate TADs one from another?

While the nature of insulators can be complex and varies somewhat from species to species, in mammals the main proteins responsible for that function are CTCF and cohesin.

CTCF is indeed a TF, a zinc finger protein with repressive functions. While it has other important roles, it is the major marker of TAD insulators in mammals. It is 727 AAs long in humans, and its evolutionary history shows a definite information jump in vertebrates (0.799 baa, 581 bits) as shown in Fig. 5, which is definitely uncommon for a TF.
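
The relation between the two numbers quoted for the jump (baa and bits) is simple arithmetic: bits per amino acid site times protein length. A minimal Python sketch, where the function name total_bits is only an illustrative assumption:

```python
def total_bits(baa, length_aa):
    """Total bits from a bits-per-amino-acid value and a protein length."""
    return baa * length_aa


if __name__ == "__main__":
    # Values quoted in the text for human CTCF: 0.799 baa, 727 AAs.
    ctcf_bits = total_bits(0.799, 727)
    print(round(ctcf_bits))  # about 581 bits, as quoted above
```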

 

We have already encountered CTCF as one of the epigenetic markers used in histone code mapping. Its importance in transcription regulation and in many other important cell functions cannot be overemphasized.

Cohesin is a multiprotein complex which forms a ring around the double stranded DNA, and contributes to a lot of important stabilizations of the DNA fiber in different situations, especially mitosis and meiosis. But we know now that it is also a major actor in insulating TADs, as can be seen in Fig 4, and in regulating chromatin topology. Cohesin and its interacting proteins, like MAU2 and NIPBL, are a fascinating and extremely complex issue of their own, so I just mention them here because otherwise this already too long post would become unacceptably long. However, I suggest here a final, very recent review about these issues, for those interested:

Forces driving the three‐dimensional folding of eukaryotic genomes

Abstract:

The last decade has radically renewed our understanding of higher order chromatin folding in the eukaryotic nucleus. As a result, most current models are in support of a mostly hierarchical and relatively stable folding of chromosomes dividing chromosomal territories into A‐ (active) and B‐ (inactive) compartments, which are then further partitioned into topologically associating domains (TADs), each of which is made up from multiple loops stabilized mainly by the CTCF and cohesin chromatin‐binding complexes. Nonetheless, the structure‐to‐function relationship of eukaryotic genomes is still not well understood. Here, we focus on recent work highlighting the biophysical and regulatory forces that contribute to the spatial organization of genomes, and we propose that the various conformations that chromatin assumes are not so much the result of a linear hierarchy, but rather of both converging and conflicting dynamic forces that act on it.

 

Summary and Conclusions

So this is the part where I should argue about how all the things discussed in this OP do point to design. Or maybe I should simply keep silent in this case. Because, really, there should be no need to say anything.

But I will. Because, you know, I can already hear our friends on the other side argue, debate, or just suggest, that there is nothing in all these things that neo-darwinism can’t explain. They will, they will. Or they will just keep silent.

So, I will briefly speak.

First of all, a summary of what has been said. I will give it as a list of what really happens, as far as we know, each time that a gene starts to be transcribed in the appropriate situation: maybe to contribute to the differentiation of a cell, maybe to adjust to a metabolic challenge, or to anything else.

  • So, our gene was not transcribed, say, “half an hour ago”, and now it begins to be transcribed. What has happened to effect this change?
  • As we know, first of all some specific parts of DNA that were not active “half an hour ago” had to become active. At the very least, the gene itself, its promoter, and one appropriate enhancer.  Therefore, some specific condition of the DNA in those sites must have changed: maybe through changes in histone marks, maybe through chromatin remodeling proteins, maybe through some change in DNA methylation, maybe through the activity of some TF, or some multi-protein structure made by TFs or other proteins, maybe in other ways. What we know is that, whatever the change, in the end it has to change some aspects of the pre-existing chromatin state in that cell: chromatin accessibility, nucleosome distribution, 3D configuration, probably all of them. Maybe the change is small, but it must be there. In our Fig. 2 (at the beginning of this long post) the red arrows are therefore acting from left to right, to effect a transition from state 1 to state 2.
  • So, the appropriate DNA sequences are now accessible. What happens then?
  • At the promoter, we need at least that the multiprotein structure formed by our 6 general TFs and the multiprotein structure that is RNA Pol II bind the promoter. See Figure 3.
  • Always at the promoter, the huge multiprotein structure which is the Mediator complex must join all the rest. See Figure 4.
  • At the enhancer, one or more specific TFs must bind the appropriate motif by the appropriate DBD, interact one with the other, recruit possible co-factors.
  • At this point, the structure bound at the enhancer must interact with the distant structure at the promoter, probably through the Mediator complex, generating a new chromatin loop, usually in the context of the same TAD. See Fig. 7.
  • So, now the 3D configuration of chromatin has changed, and transcription can start.
  • But as the new gene is transcribed, and its mRNA then probably translated (through many further intermediate regulation steps, of course, like the Spliceosome and all the rest), the transcriptome/proteome is changing too. In many cases, that will imply changes in factors that can act on chromatin itself, for example if the new protein is a TF, or any other protein involved directly or indirectly in the above described processes, or even if it can in some way generate new signals that will in the end act on transcription regulation. Maybe the change is small, but it must be there. In our Fig. 2 (at the beginning of this long post) the red arrows are now probably acting from right to left, possibly initiating a transition from state 2 to state 3.
  • After all, that is what must have happened at the beginning of this sequence, when some new condition in the transcriptome/proteome started the transcription of our new protein.

And now, a few considerations:

  • This is just an essential outline: what really happens is much, much more complex
  • As we have seen, the working of all this huge machinery requires a lot of complex and often very specific proteins. First of all the 2000 specific TFs, and then the dozens, maybe hundreds, of proteins that implement the different steps, many of which are individually huge, often thousands of AAs long.
  • The result of this machinery and of its workings is that thousands of proteins are transcribed and translated smoothly at different times and in different cells. The result is that a stem cell is a stem cell, a hepatocyte a hepatocyte and a lymphocyte a lymphocyte. IOWs, the miracle of differentiation. The result is also that liver cells, renal cells, blood cells, after having differentiated to their “stable” state, still perform new wonders all the time, changing their functional states and adapting to all sorts of necessities. The result is also that tissues and organs are held together, that 10^11 neurons are neatly arranged to perform amazing functions, and so on. All these things rely heavily on a correct, constant control of transcription in each individual cell.
  • This scenario is, of course, irreducibly complex. Sure, many individual components could probably be shown not to be absolutely necessary for some rough definition of function: transcription can probably initiate even in the absence of some regulatory factor, and so on. But the point is that the incredibly fine regulation of the whole process, its management and control, certainly require all or almost all the components that we have described here.
  • Beyond its extraordinary functional complexity, this regulation network also uses at its very core at least one big sub-network based on a symbolic code: the histone code. Therefore, it exhibits a strong and complex semiotic foundation.

So, the last question could be: can all this be the result of a neo-darwinian process of RV + NS of simple, gradual steps?

That, definitely, I will not answer. I think that everybody already knows what I believe. As for others, everyone can decide for themselves.

 

 

 

 

PS: Here is a scatterplot of some values of functional information obtained by my method as compared to the values given by Durston, as per request of George Castillo. As can be seen, the correlation is quite good, even with all the difficulties in comparing the two methods, that are quite different under many aspects. However, my method definitely underestimates functional information as compared to Durston’s (or vice versa).

 

PPS:  More graphs added as per request of George Castillo. The explanation is in comment #270.

 

 

 

 

335 Replies to “Transcription regulation: a miracle of engineering”

  1. 1
  2. 2
    gpuccio says:

    Hi, UB! 🙂

    This OP originated from an interesting discussion in this older thread:

    https://uncommondescent.com/intelligent-design/chromatin-topology-the-new-and-latest-functional-complexity/

    in some way originated by comments made by George Castillo.

    I hope we can continue some discussion here with those interested.

  3. 3
    kairosfocus says:

    Yet another GP tour de force.

  4. 4
    gpuccio says:

    Hi KF! 🙂

    Always good to hear from you.

  5. 5
    gpuccio says:

    UB:

    I hope you appreciate the part about the histone code! 🙂

    It’s interesting that the ubiquitin code, that, as you know, I have discussed elsewhere:

    https://uncommondescent.com/intelligent-design/the-ubiquitin-system-functional-complexity-and-semiosis-joined-together/

    and the histone code that I discuss here have some special overlap, and probably crosstalk.

    Indeed, ubiquitination is one of the histone marks involved in the histone code.

    Here is a paper about that:

    Histone Ubiquitination: Triggering Gene Activity

    https://www.sciencedirect.com/science/article/pii/S1097276508001330

    Summary
    Recently, many of the enzymes responsible for the addition and removal of ubiquitin from the histones H2A and H2B have been identified and characterized. From these studies, it has become clear that H2A and H2B ubiquitination play critical roles in regulating many processes within the nucleus, including transcription initiation and elongation, silencing, and DNA repair. In this review, we present the enzymes involved in H2A and H2B ubiquitination and discuss new evidence that links histone ubiquitination to other chromatin modifications, which has provided a model for the role of H2B ubiquitination, in particular, in transcription initiation and elongation.

    For example, for ubiquitination of Lys 120 in histone H2B, an activating marker, a very specific E3 ligase is required:

    RNF20/40 E3 ubiquitin-protein ligase complex

    a heterodimer of about 2000 AAs. The interesting thing is that the same protein seems to be a “prerequisite” for specific methylations of histone H3, as can be read in Uniprot:

    RNF20
    Function
    Component of the RNF20/40 E3 ubiquitin-protein ligase complex that mediates monoubiquitination of ‘Lys-120’ of histone H2B (H2BK120ub1). H2BK120ub1 gives a specific tag for epigenetic transcriptional activation and is also prerequisite for histone H3 ‘Lys-4’ and ‘Lys-79’ methylation (H3K4me and H3K79me, respectively). It thereby plays a central role in histone code and gene regulation. The RNF20/40 complex forms a H2B ubiquitin ligase complex in cooperation with the E2 enzyme UBE2A or UBE2B; reports about the cooperation with UBE2E1/UBCH are contradictory. Required for transcriptional activation of Hox genes. Recruited to the MDM2 promoter, probably by being recruited by p53/TP53, and thereby acts as a transcriptional coactivator. Mediates the polyubiquitination of isoform 2 of PA2G4 in cancer cells leading to its proteasome-mediated degradation.

    Crosstalks, crosstalks…

  6. 6
    OLV says:

    Wow!
    Here he goes again!
    🙂

  7. 7
    gpuccio says:

    Hi OLV,

    good to see you again! 🙂

  8. 8
    PeterA says:

    A heavy-duty biology jewel indeed!

  9. 9
    gpuccio says:

    Thank you, Peter. 🙂

    It required a little bit of work, but it was very rewarding. Writing an OP is always a great occasion to delve more deeply into these fascinating issues!

  10. 10
    jawa says:

    Delightfully written.
    The “boring simplicity” revealed by the leading edge biology research these days has been publicly unveiled here by gpuccio.
    Well done!

  11. 11
    PeterA says:

    jawa,
    Yes, that’s exactly right.
    This OP is much more insightful than some papers I’ve seen in prestigious peer-review journals. Definitely a university textbook material.

  12. 12
    PeterA says:

    gpuccio,
    Thank you!
    I’m studying this excellent product of your dedicated work.

  13. 13
    PeterA says:

    BTW, it’s funny that George Castillo deserves some credits for inspiring -at least partially- the author of this OP.

  14. 14
    PaoloV says:

    This article is too technical for my poor knowledge of the topic, but the title contains a word that is not “politically correct” these days.
    May God bless gpuccio.

  15. 15
    gpuccio says:

    PeterA, jawa, PaoloV:

    Thank you for your beautiful words.

    You have been a great source of inspiration to me, with your contributions to the discussion in the previous thread where the discussion began.

    And yes, I agree: George Castillo deserves some credits too! 🙂

  16. 16
    jawa says:

    Paolo,
    Actually, the title contains two terms that could be unacceptable in certain circles. It’s widely known that the appearance of “engineering” within biology is only an illusion. Obviously the “m” word is totally “out of context” but perhaps gpuccio didn’t understand George Castillo’s comments in the older discussion. 🙂

  17. 17
    jawa says:

    gpuccio,
    I appreciate your tremendous dedication to help others understand this fascinating but very difficult area of science. I’m sure other folks agree with me in this.

  18. 18
    OLV says:

    Hey guys,
    Let’s get to work.
    There’s plenty of very interesting material in this OP that is suitable for discussion.
    Let’s read it carefully and ask any questions about the content.

  19. 19
    PeterA says:

    OLV,
    Good point. Totally agree. Thanks.

  20. 20
    gpuccio says:

    Guys,

    I would like to quote here one really amazing statement made by George Castillo in the thread where the discussion originated.

    Here:

    Chromatin Topology: the New (and Latest) Functional Complexity

    https://uncommondescent.com/intelligent-design/chromatin-topology-the-new-and-latest-functional-complexity/

    at comment #182, he said:

    For starters, gpuccio, chromatin certainly is DNA wrapped around histones.
    It is the base unit of our genomic storage and how the vast majority of our DNA is packaged.
    99.999% of our genome at any given moment is wrapped around histones in the 10 nanometer beads-on-a-string.
    Nothing beyond that has been shown to exist in vivo (during interphase).

    (Emphasis mine)

    I disagree.

    A lot beyond that has been shown to exist in vivo, during interphase. This OP is of course an attempt to show that.

    I have often seen reductionism and denial, but the above statement, IMO, really is the ultimate!

  21. 21
    jawa says:

    gpuccio,
    The statement you quoted disregards the available evidences.
    This new OP discredits that quoted statement even further.

  22. 22

    Hello GP, long day, I am just now able to sit down and take in your OP. Thank you dearly for writing it.

  23. 23
    gpuccio says:

    By the way, George Castillo just resurfaced in the old thread to add some new nonsense.

    Being probably a little self-destructive, I invited him to come here, read and, if he likes, comment!

  24. 24
    Eugene S says:

    Wow, GP!

    If you don’t mind, a question slightly related to this topic as well since you have pictures of information quantities.

    When you calculate the number of states that can be visited by evolutionary walk, you arrive at a value of 2^140 states. Then you reason about information deltas: if the absolute value of a delta related to some biological change is greater than 140 bits, then we infer design. It is this “then” that is not entirely clear to me.

    Could you expound a bit on how this estimate of 140 bits relates to the methods of determining the functional information quantities in polypeptides? What is not clear to me in this reasoning is how we connect the dots between these two things: (a) the estimate based on the number of states visitable by evolutionary walk and (b) the methods of calculating functional information in an amino acid sequence.

    E.g. an estimate of the absolute value of information content for the human genome is about 70 MB. Does it mean that any deltas I can get by evolution are within 140 bits on top? What about duplication and recombination? From information theory books, we can get that information gain per generation with sexual reproduction is of the order sqrt(G) where G is the genome size (if I remember rightly). Surely it can be greater than 140 bits.

    Can you see my question?

  25. 25
    gpuccio says:

    To all:

    As usual, I will try to use the discussion to highlight in more detail some recent aspects of what has been discussed in the OP.

    Let’s start with non coding RNAs, a subject that is always interesting and important from an ID perspective. The ex-junk, let’s say. 🙂

    I have briefly mentioned in the OP (at the end of the “Specific TFs” section) their newly discovered roles in transcription regulation.

    Well, this paper is brand new (July 2018):

    Emerging Roles of Non-Coding RNA Transcription

    https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(18)30104-X?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS096800041830104X%3Fshowall%3Dtrue

    Highlights:

    The location of ncRNAs with respect to target genes is more highly conserved than the ncRNA sequence, suggesting that position-specific cis effects are driving ncRNA evolution.

    Recent studies indicate that it is often the act of transcription, rather than the sequence or nature of the ncRNA product, that acts as a modulator of chromatin accessibility, transcription factor occupancy, and epigenetic state.

    Nascent ncRNAs can interact in a sequence-independent manner with epigenetic regulators and transcription factors bound nearby to influence their retention or catalytic activity.

    Advances in genome-editing and genome-targeting strategies will help to more clearly define the effects of ncRNA transcription on mRNA expression.

    Abstract:

    Metazoan genomes are broadly transcribed by RNA polymerase II (RNAPII), but surprisingly few of these RNAs encode proteins. Accordingly, there is great interest in understanding the origins and potential roles of the vast array of non-coding RNAs (ncRNAs) that are produced. We present here emerging evidence that the act of transcription and the presence of nascent RNA at a locus is often central to function, rather than specific ncRNA sequences or structures. We highlight examples wherein transcription elongation through a regulatory region modulates chromatin structure and/or transcription factor occupancy, and describe how nascent RNA contributes to the local epigenetic landscape through sequence-independent interactions with chromatin regulators. Finally, we discuss current strategies for probing the potential functions of ncRNA transcription.

    Emphasis mine.

    This is definitely a new and interesting concept: that non coding RNAs may act in a location dependent, and sequence and structure independent, modality. That could help explain their low conservation even in the presence of refined function.

    So, non coding RNAs could act as a strange mixture of cis and trans regulatory elements.

    This is not so surprising, after all, if we consider the double nature of promoters and enhancers too, as described in the OP: they certainly act as cis regulatory elements, as DNA sequences, but while they do that they are also transcribed into promoter and enhancer associated RNAs, and those RNAs too have direct effects on transcription regulation.

    Definitely, things become more complex (and more interesting) practically with each new day. 🙂

  26. 26
    Eugene S says:

    GP,

    And can I please use my chance to reiterate my request to this blog to create an index by author. GP, I am following your contributions here and it is getting out of control at this end 😉 I’d like to have an index here like on evolutionnews.org instead of having to bookmark stuff in the browser.

    Thanks.

  27. 27
    gpuccio says:

    Eugene S:

    Thank you for the interesting question, which requires a detailed answer.

    “When you calculate the number of states that can be visited by evolutionary walk, you arrive at a value of 2^140 states.”

    That’s correct. Only, remember that this is really a big overestimation, just to be on the safe side. The real number of individual states that can be reached is probably much lower.

    I would like to clarify that by the word “state” I mean some specific new genomic configuration, as it can derive from reproduction which involves some genomic variation. The reason for that is that the whole genome, as it emerges from reproduction, is the functional unit that is subject to natural selection, if any.

    “Then you reason about information deltas: if the absolute value of a delta related to some biological change is greater than 140 bits, then we infer design.”

    This is not really correct, if you are referring to absolute information content, as it seems from your following remarks. I believe that this could be your main misunderstanding, so I will try to clarify it better. I apologize in advance if instead the concept was already clear to you.

    The important point is: I never reason in terms of absolute information content, only in terms of functional information. Indeed, all the “jumps” I analyze and discuss are jumps (deltas) in functional information.

    Why? Because I have applied a special procedure, that I have tried to explain in some detail when possible.

    What I measure is “human conserved information”, IOWs the bits of homology to the human form of the protein. Another way to say it is that I use human proteins as “probes” to measure the evolutionary history of proteins in relation to the form that the protein assumes in humans.

    I could as well use as probes the proteins in bees, for example (indeed, I have done that in some cases, for specific comparisons). My choice of human proteins as “measuring probes” has, however, a few important motivations:

    a) Of course, we are naturally interested in human functions

    b) The human proteome is probably the best investigated and reviewed

    c) Humans are a very recent species, so human proteins can be considered, in their final form, a recent result

    d) I am specially interested in the transition to vertebrates, and humans are recent vertebrates. So, the time distance between the original split to vertebrates (and then from cartilaginous fish to bony fish) and the recent split to humans is more than 400 million years

    So, why is a jump in human conserved information observed after the split to vertebrates (in my graphs, that is the transition from non vertebrate deuterostomes to cartilaginous fish) a good measure of a variation in functional information?

    Well, the reasoning is simple, and it relies on assumptions that cannot be easily denied, especially by neo-darwinists.

    If some specific sequence appears for the first time after the split to vertebrates, and is then conserved for more than 400 My, then we can safely assume that the sequence is functional. Indeed, the extent to which it is conserved (IOWs, the bits of information that appear for the first time in vertebrates and are conserved to humans) is a measure of its functional constraint.

    So, if a protein homolog of some human protein (let’s call it A) has a maximum homology hit with the human form, before the appearance of vertebrates, of say 300 bits, and then we find 1000 bits of homology in cartilaginous fish, then we can say that 700 bits of human conserved, and therefore functional, information have appeared at the transition to vertebrates. If the protein A is, say, 1000 AAs long, that is a jump of 0.7 bits per aminoacid site (about one third of the total information content of the protein in a blast comparison).
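
    The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the value of 2.2 bits per aminoacid for a perfect blast identity is the approximate figure quoted elsewhere in this thread, not an exact constant.

```python
def information_jump(bits_before, bits_after, length_aa):
    """Return the jump in bitscore and the jump per aminoacid site."""
    jump = bits_after - bits_before
    return jump, jump / length_aa


# Numbers from the example: 300 bits before vertebrates, 1000 bits in
# cartilaginous fish, for a protein 1000 AAs long.
jump_bits, jump_per_site = information_jump(300, 1000, 1000)

# Approximate maximum blast bitscore per aminoacid for full identity
# (figure taken from the discussion, treated here as an assumption).
MAX_BLAST_BITS_PER_AA = 2.2
fraction_of_maximum = jump_per_site / MAX_BLAST_BITS_PER_AA

print(jump_bits, jump_per_site, round(fraction_of_maximum, 2))
```

    The last value is the “about one third” mentioned above: the jump per site divided by the highest bitscore per site that a blast comparison can give.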

    The 400 million years gap is important: indeed, it is an evolutionary distance that guarantees that any non functional sequence homology will be completely cancelled by neutral variation, as can be easily seen in synonymous sites. Therefore, any homology that is conserved for such a time is under extremely strong functional constraint.

    So, I hope that I have explained my “methods of calculating functional information in an amino acid sequence”, your b) question. This is not the only way to do it, of course. The Durston method, that has inspired all my reasonings, is slightly different. However, I find this method quick and reliable.

    The connection with your a) point (“the estimate based on the number of states visitable by evolutionary walk”) is rather direct: if we exclude that natural selection is a relevant factor for complex functions (the arguments to exclude it are different in nature, and you can find them in my OP, linked here):

    What are the limits of Natural Selection? An interesting open discussion with Gordon Davisson

    https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/

    then the only way for a non design system to reach a functional island is to have probabilistic resources comparable to the functional complexity of that functional island.

    Remember that the functional complexity, measured as described above, is a measure not of the absolute information content, but of the ratio between the target space and the search space, IOWs a measure of the probability to find the target space by a single random search. That’s why blast homologies are directly transformed, in the blast algorithm, into E values, that are a slightly different, but related concept (the expected number of random hits of that level by the type of search done by the algorithm in the available database).
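
    The relation between a bitscore and an E value can be sketched as follows, using the standard formula E = m * n * 2^(-S), where S is the bitscore, m the query length and n the database length. This is a simplified sketch: the real blast program uses effective lengths with edge corrections, and the query and database sizes below are hypothetical.

```python
import math


def log10_expect_value(bit_score, query_len, db_len):
    """log10 of the expected number of chance hits, E = m * n * 2**(-S).

    Working in log10 avoids floating point underflow for large bitscores.
    """
    return math.log10(query_len) + math.log10(db_len) - bit_score * math.log10(2)


# Hypothetical search: a 1000 AA query against a database of 10**9 residues.
for bits in (140, 300, 1000):
    print(bits, round(log10_expect_value(bits, 1000, 10**9), 1))
```

    Each additional bit of score halves the expected number of random hits, which is why the bitscore can be read as a (log2) measure of improbability.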

    I am not sure why you say that:

    “an estimate of the absolute value of information content for the human genome is about 70 MB”

    The absolute value of information content for the whole human genome (3 Gbp) is certainly much more than that. The number you give is probably related to protein coding genes only.

    However, as said, the total information content is not relevant in ID: only the functional information counts.

    Duplication is not really an increase in functional information. It is similar to printing two copies of the same book.

    Of course, if duplication in itself generates a new function, then the limited functional information linked to that event should be computed again by dividing the target space (the possible duplications that generate that function) by the search space (all the possible duplication events).

    The same is true for recombinations.

    The important point is that duplications and recombinations reuse existing functional information at the sequence level: they don’t create it. The only new functional information can derive from the new disposition.

    So, let’s say that, in a very simple case, a recombination shifts two sequences in a protein, maybe changing the function. However, using a blast comparison, the sequence homology will not be significantly changed (because blast is a local alignment).

    In the same way, a blast hit of a human protein against a group of organisms does not depend on how many copies of that sequence are present in the organism: it just gives the highest homology hit in the group, the single sequence that is most similar to the human one.

    So, neither sexual reproduction nor duplication nor recombination can generate any new functional sequence of more than 140 bits of functional information. Because 140 bits means that the event has a probability of 1:2^140 to happen, and the probabilistic resources of our biological world just cannot do it.

    Please feel free to ask new questions, if my answers are not clear or sufficient. This is an important point, and it is not at all easily grasped.

  28. 28
    gpuccio says:

    Eugene S:

    I absolutely agree with you that an index by author would be precious. But that is not something that I can do.

    I don’t even know if it would be an easy task. Maybe Barry, Denyse, or those who work at the maintenance of the site, will consider your request, that has already gained the support of a few people, including me. 🙂

  29. 29
    gpuccio says:

    Eugene S:

    A few more clarifications.

    A recombination, whatever it is, is still only one new state. The same is true for duplications.

    Any random variation, or group of variations, that happens in an ancestor and is transmitted is indeed one new state, be it functional, neutral or deleterious.

    Let’s say that you need to generate a specific sequence of 100 AAs to implement a new function, and that the new function cannot really be implemented with any lower sequence information specificity.

    Well, in theory you can get the right new sequence even in one attempt, for example by a frameshift mutation. But the probability that one random frameshift mutation may give the correct sequence is about 1 in 2^430. Even with all the probabilistic resources of our biological world, there is no real probability that such a sequence may be found in that way.

    And, of course, neither duplication nor recombination nor sexual reproduction can help. Those resources are part of the probabilistic resources, part of the 140 bits. There is no way that they can really find the needed sequence.

    Because the needed sequence simply does not exist before. It is not there. So, no duplication or recombination or sexual crossing over has any superior chance to find it, because they are still random events in relation to the sequence to be found. Each of them is still one random variation attempt, and nothing more.
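
    The comparison between the 140 bits of probabilistic resources and a 430 bit target can be sketched as below. It is a minimal illustration, assuming that each of the 2^140 states is an independent random attempt; the function name is hypothetical. By the union bound, the probability of at least one success in N attempts is at most N times the single attempt probability.

```python
from fractions import Fraction


def log2_success_bound(resource_bits, target_bits):
    """Upper bound, in log2, on the probability that at least one of
    2**resource_bits random attempts hits a target whose single-attempt
    probability is 2**(-target_bits)."""
    # Union bound: P(at least one hit) <= attempts * p, capped at 1 (log2 = 0).
    return min(0, resource_bits - target_bits)


bound = log2_success_bound(140, 430)  # -290

# The same result with exact rational arithmetic: 2^140 * 2^-430 = 2^-290.
exact = Fraction(2**140, 2**430)
print(bound, exact == Fraction(1, 2**290))
```

    So even using all the available attempts, the chance of finding the 100 AA sequence is bounded by 1 in 2^290.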

  30. 30
    Eugene S says:

    GPuccio

    Thank you very much for your prompt and detailed answers. I will give them a read offline.

    And a special thank-you for raising your voice to push for an index by author here on this blog.

    I really appreciate it.

  31. 31
    Eugene S says:

    GPuccio

    Thank you again. Your answers are an OP in their own right. Of course, it is not just information, it is functional information! Different assumptions and metrics lead to different conclusions. I have only two questions as of now.

    1. Just out of interest, could you sketch out how you got your estimate of 2^-430 for the probability that one random frameshift mutation produces the correct sequence?

    2. You mentioned that Durston’s method is a bit different. Could you give more details? I read the original paper but that was a long time ago. Coming back is always good for the reinforcement of one’s understanding.

    You generate your OPs at a rocket rate. I can’t keep up reading them. What’s more, they always instigate interesting discussions. I try to do my best to get through them as well.

    Many Thanks!

  32. 32
    gpuccio says:

    Eugene S:

    Thanks to you! 🙂

    1. Oh, that was my error. I forgot to say that I was assuming a specific sequence of 100 AAs, which implies a functional information (absolute, not derived from blast comparisons) of about 4.3 bits per AA, and therefore 430 bits in total. I will correct my previous comment.

    Please note that the absolute information value of one specific aminoacid is about 4.3 bits (log2 of 20), while the highest bitscore that you get from a blast comparison, for identity, is about 2.2 bits per aminoacid. That is due to how the blast algorithm works, and is evidence that my results, based on blast comparisons, are probably a vast underestimate of the real functional information. Which is good, for me, because I prefer to be always on the safe side.
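
    The numbers can be checked quickly in Python. The only assumption is the one stated above: one fully specified aminoacid out of 20 equiprobable ones at each of 100 sites.

```python
import math

# Information needed to specify one aminoacid out of 20 equiprobable ones.
bits_per_aa = math.log2(20)

# A fully specified sequence of 100 AAs.
total_bits = 100 * bits_per_aa

print(round(bits_per_aa, 2), round(total_bits))
```

    The exact values are about 4.32 bits per site and about 432 bits in total, which I round to 4.3 and 430.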

    2. The Durston method is different because he starts with a selected group of homologous proteins in different species, then does a multiple alignment of all of them, and applies a definite computation of the reduction of uncertainty for each aminoacid site, based on Shannon’s formula. IOWs, he is comparing the variance at each aminoacid site in a set of proteins restrained by a common function with the theoretical variance for random sequences, where each AA site can be occupied by each AA. He then sums the values for each site to get a global functional information value for that protein family.

    In that case, one absolutely conserved AA contributes as 4.3 bits of functional information, while a site which has a random variance of AAs contributes 0 bits, and then there are all possible intermediate situations.
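
    A simplified sketch of this kind of per-site computation is given below. It is not Durston’s actual code: it assumes a ground state of 20 equiprobable aminoacids, ignores gaps and sampling corrections, and uses a toy alignment built only to show the two extreme cases (a fully conserved site and a fully random one).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_BITS = math.log2(len(AMINO_ACIDS))  # about 4.32 bits


def site_entropy(column):
    """Shannon uncertainty (in bits) of the aminoacids observed at one site."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def functional_bits_per_site(alignment):
    """Reduction of uncertainty at each aligned site, relative to 20 equiprobable AAs."""
    return [MAX_BITS - site_entropy(col) for col in zip(*alignment)]


# Toy alignment (hypothetical): site 1 is always M, site 2 takes each of
# the 20 aminoacids exactly once.
alignment = ["M" + aa for aa in AMINO_ACIDS]
per_site = functional_bits_per_site(alignment)
total = sum(per_site)

print([round(b, 2) for b in per_site], round(total, 2))
```

    The conserved site contributes the full log2(20) bits, the random site contributes nothing, and the sum over sites is the functional information value for the (toy) family.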

    Durston’s method is very good, but it is much more difficult to apply: you have to choose your set of proteins, align them and do the computations for each site. It has its potential biases too, because of course the choice of the sequences, and the manual review of the alignment, are very important.

    In a sense, my method is easier to apply, and easier to verify by anyone. It just requires a correct use of the blast algorithm and of the available protein databases at the blast site.

    The choice of considering the best hit for each group of organisms is very reliable, because at the levels of exponential improbabilities that are interesting for ID the simple existence of some high homology for more than 400 million years is an undeniable sign of extreme functional constraint. If there are no errors in the database, it’s impossible that such high exponential values of homology may be due to any random variance.

    The choice of using human proteins as probes to measure functional information has its definite reasons, as I have explained in my previous comments.

  33. 33
    jawa says:

    gpuccio,
    I’m glad you left the other thread and started this.
    Thanks.

  34. 34
    EugeneS says:

    GP

    “a specific sequence of 100 AAs”

    Yes, I would have thought that that was missing in the comment! Now everything clicks in.

    Thank you very much for clarifying the differences with Durston’s method.

  35. 35

    GP, I have now had a chance to read your OP a couple of times. Again, thank you for taking the time to write it. I think you have provided a fairly concise general overview of the process, and I really appreciated the extra links and graphics. I’m confident you’ve given interested UD readers a chance to become more familiar with transcription and regulation. It is nice to have so many aspects of the system covered in a single article, and I suspect that your readers will use this overview as a guide to seek more information. If your goal was to enable the opportunity to take (yet another) step in appreciating the vast complexity of such systems, then you’ve certainly hit your target. Bravo!

  36. 36
    gpuccio says:

    UB:

    Yes, that was my goal indeed! Thank you. 🙂

    I thought that it was important to have a reasonably detailed overview of the whole regulation of transcription in one article. It’s indeed one irreducibly complex network, and we can only stand in awe of its intrinsic complexity and beauty.

    And yes, it is definitely an invitation, to myself first of all, to deepen the knowledge and analysis of many different aspects of it.

    For example, I give here a brief list of a few topics that certainly deserve great attention, here in the discussion or in the future:

    1) The role of the Mediator complex as a hub that integrates different signals and filters them to the transcription machinery.

    2) The role of RNAs transcribed from the promoter and from the enhancer.

    3) The role of lncRNAs in transcription regulation.

    4) The histone code.

    5) The role of TADs as mega units of regulation.

  37. 37
    gpuccio says:

    To all:

    Let’s say something about TADs (topologically associating domains). The idea is that promoter-enhancer contacts and loops happen inside greater compartments (TADs), delimited by specific insulators. So, enhancers in a TAD will often interact with promoters in the same TAD, while enhancer-promoter interactions between different TADs are possible, but rare.

    TADs seem to be relatively stable, but their boundaries, as well as their states (activated or inactivated), can change in different cell types.

    Here is a recent paper about TADs:

    TADs are 3D structural units of higher-order chromosome organization in Drosophila.

    https://www.ncbi.nlm.nih.gov/pubmed/29503869

    Abstract:

    Deciphering the rules of genome folding in the cell nucleus is essential to understand its functions. Recent chromosome conformation capture (Hi-C) studies have revealed that the genome is partitioned into topologically associating domains (TADs), which demarcate functional epigenetic domains defined by combinations of specific chromatin marks. However, whether TADs are true physical units in each cell nucleus or whether they reflect statistical frequencies of measured interactions within cell populations is unclear. Using a combination of Hi-C, three-dimensional (3D) fluorescent in situ hybridization, super-resolution microscopy, and polymer modeling, we provide an integrative view of chromatin folding in Drosophila. We observed that repressed TADs form a succession of discrete nanocompartments, interspersed by less condensed active regions. Single-cell analysis revealed a consistent TAD-based physical compartmentalization of the chromatin fiber, with some degree of heterogeneity in intra-TAD conformations and in cis and trans inter-TAD contact events. These results indicate that TADs are fundamental 3D genome units that engage in dynamic higher-order inter-TAD connections. This domain-based architecture is likely to play a major role in regulatory transactions during DNA-dependent processes.

    Another one:

    Principles of Chromosome Architecture Revealed by Hi-C.

    https://www.ncbi.nlm.nih.gov/pubmed/29685368

    Abstract
    Chromosomes are folded and compacted in interphase nuclei, but the molecular basis of this folding is poorly understood. Chromosome conformation capture methods, such as Hi-C, combine chemical crosslinking of chromatin with fragmentation, DNA ligation, and high-throughput DNA sequencing to detect neighboring loci genome-wide. Hi-C has revealed the segregation of chromatin into active and inactive compartments and the folding of DNA into self-associating domains and loops. Depletion of CTCF, cohesin, or cohesin-associated proteins was recently shown to affect the majority of domains and loops in a manner that is consistent with a model of DNA folding through extrusion of chromatin loops. Compartmentation was not dependent on CTCF or cohesin. Hi-C contact maps represent the superimposition of CTCF/cohesin-dependent and -independent folding states.

    And another one:

    Gene functioning and storage within a folded genome.

    https://www.ncbi.nlm.nih.gov/pubmed/28861108

    Abstract
    In mammals, genomic DNA that is roughly 2 m long is folded to fit the size of the cell nucleus that has a diameter of about 10 μm. The folding of genomic DNA is mediated via assembly of DNA-protein complex, chromatin. In addition to the reduction of genomic DNA linear dimensions, the assembly of chromatin allows to discriminate and to mark active (transcribed) and repressed (non-transcribed) genes. Consequently, epigenetic regulation of gene expression occurs at the level of DNA packaging in chromatin. Taking into account the increasing attention of scientific community toward epigenetic systems of gene regulation, it is very important to understand how DNA folding in chromatin is related to gene activity. For many years the hierarchical model of DNA folding was the most popular. It was assumed that nucleosome fiber (10-nm fiber) is folded into 30-nm fiber and further on into chromatin loops attached to a nuclear/chromosome scaffold. Recent studies have demonstrated that there is much less regularity in chromatin folding within the cell nucleus. The very existence of 30-nm chromatin fibers in living cells was questioned. On the other hand, it was found that chromosomes are partitioned into self-interacting spatial domains that restrict the area of enhancers action. Thus, TADs can be considered as structural-functional domains of the chromosomes. Here we discuss the modern view of DNA packaging within the cell nucleus in relation to the regulation of gene expression. Special attention is paid to the possible mechanisms of the chromatin fiber self-assembly into TADs. We discuss the model postulating that partitioning of the chromosome into TADs is determined by the distribution of active and inactive chromatin segments along the chromosome. This article was specially invited by the editors and represents work by leading researchers.

    Fig. 1 is simple and nice, showing the hierarchy between:

    – Chromosomal territories

    – A and B compartments

    – TADs

    – Chromatin loops

    Other statements from that paper:

    Additional line of evidence supporting the idea that TADs represent structural and functional units of the genome arises from the studies of cell differentiation and reprogramming. In the model system of ESC differentiation into several distinct lineages, TADs were found to be largely stable along the genome, but demonstrated a high flexibility in both inter- and intra-TAD interactions [75]. TADs containing upregulated genes exhibit a substantial increase in chromatin interactions and relocate into A-compartment, whereas TADs harboring downregulated genes tend to decrease a number of chromatin contacts and undergo A-to-B compartment switching.

    While, in Drosophila, the primary function of TADs appears to be the storage of inactive genes [44], mammalian TADs acquire additional function in transcriptional control [118]. Although stochastic interactions of neighboring nucleosomes are likely to contribute also in the assembly of mammalian TADs, the insulator protein CTCF plays an essential role in the spatial and functional separation of these TADs. It has been suggested that chromatin loop extrusion plays an essential role in the formation of mammalian TADs [115, 116]. However, the nature of extrusion machines remains elusive and the model still lacks direct experimental proves. Mammalian TADs have a complex structure and are likely to be assembled from smaller looped and ordinary domains [46]. The relation of these nested domains to the functional organization of the genome remains to be studied.

    Interesting perspectives indeed! 🙂

  38. 38
    jawa says:

    #37 is a fascinating topic. Another article by itself.

  39. 39
    jawa says:

    It’s interesting to see how this excellent OP and the additional information posted in gpuccio’s comments attract so few commenters. The issues described here are a fundamental part of the current revolution in biology, which is going to take many people by surprise, when it finally gets noticed by the mainstream media.
    Note that it took quite a long time for Nicolaus Copernicus’ brilliant discoveries to be published and much longer to get accepted.
    The same is happening now in biology, which has become the actual queen of science, using heavy math, physics, chemistry, bioinformatics, modeling, electronics to understand the complex functionality seen in research.
    The medical field depends on the advance in biology research.
    The whole society benefits from it too.
    Amazing discoveries in the near future will surprise many folks out there.
    Just wait and see.
    In the meantime, our appreciation to gpuccio for his strong dedication to studying these fascinating topics of biology and for sharing with the rest of us what he has learned.

  40. 40
    PeterA says:

    I’m still processing the abundant information gpuccio shared here.
    I may have a few questions about the OP, but will have to wait till I find more time to pose them clearly.

  41. 41
    PeterA says:

    jawa,
    Agree with your commentary.
    However, regarding the statement “when it finally gets noticed by the mainstream media.” I would rather say that many biologists could be taken by surprise too. It will be interesting to see their reaction.

  42. 42
    gpuccio says:

    jawa, PeterA:

    Good thoughts! 🙂

    One of the problems, IMO, is that the current ideology in science and biology is to ignore intentional function, design, and in general teleology. At all costs.

    It must be difficult, even for professional biologists, to really appreciate the multi-layered beauty of biological engineering and at the same time be forced to deny the functional depth and the wonderful richness of thought that pervade that engineering in all its parts.

    You know, only for a limited number of times can one pretend to be amazed at the unending cleverness of unguided evolution, and really keep one’s intellectual, cognitive and moral integrity. Defending what is false, at all costs, has a definite price, even for the best and most intelligent people.

  43. 43
    PaoloV says:

    I like jawa’s analogy to Copernicus. The old establishment is deeply entrenched in its archaic ideas that will fall like a house of cards.

  44. 44
    jawa says:

    gpuccio,

    Are the histones also product of the DNA – transcription – mRNA – translation process?

    Are they part of the transcription mechanism?

    If they require transcription but they are part of the transcription mechanism, how does the first histone get produced?

    Maybe I’m missing something.

    Thanks.

  45. 45
    gpuccio says:

    jawa:

    Of course histones are proteins, so they are the product of the transcription and translation machineries.

    Like all other proteins, including those necessary for transcription, starting from the absolutely essential DNA-directed RNA polymerase and many others, and those necessary for translation, including the 20 aminoacyl-tRNA synthetases, the about 80 ribosomal proteins (in eukaryotes) and many others.

    So, if you ask:

    If proteins require transcription but they are part of the transcription mechanism, how does the first protein get produced?

    you are definitely missing something, the same thing that we are all missing: how all the complex machinery that allows life to exist was generated.

    Of course the only possible answer is that those things were designed, and that they were designed according to a general plan. However, even from a design point of view, it remains IMO a beautiful mystery.

    Regarding transcription, in particular, we must consider that my OP is essentially about transcription regulation in eukaryotes. In prokaryotes, transcription regulation is rather different, and it is certainly simpler than in eukaryotes, but not simple at all! I have given only a few hints about the differences between prokaryotes and eukaryotes in the OP, because my purpose was to discuss eukaryotic transcription.

    The simple fact is that eukaryotic transcription has a lot of new and unprecedented layers of implementation and regulation, even if it certainly reuses many features that are already there in prokaryotes. The role of histones, of the Mediator complex, of chromatin and nuclear organization, are just a few important examples of eukaryotic novelties.

    I would say that in the amazing history of life on our planet each single complex event, even the generation of a single new protein, is a wonderful example of design and engineering. But there are certainly a few major steps where the level of design innovation is almost indescribable. They are (at least):

    a) Origin of life

    b) The appearance of eukaryotes

    c) The appearance of metazoa

    d) The explosions of different organisms, body plans and phyla in metazoa, in particular the Ediacaran and Cambrian explosions

    And, of course, there are many others (the explosion of flowering plants, the transition to vertebrates, and so on).

    The biggest mystery of all, probably, is that so many people are still convinced that the neo-darwinist theory can be a good explanation for those major events, while it can not even explain the appearance of a single new complex functional protein.

  46. 46
    gpuccio says:

    To all:

    A few further thoughts about enhancers:

    1) The most recent estimates of their number are now in the range of more than one million, or even millions. The simple truth is: nobody really knows how many of them can be found, for example, in the human genome.

    2) According to the most recent estimates, they can well represent about 12% of the whole human genome: see Table 1 from the following paper:

    GeneHancer: genome-wide integration of enhancers and target genes in GeneCards

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5467550/

    Line 4 of the table, “All sources combined” gives the following numbers:

    Number of elements: 434,139

    Mean length: 1233 bp

    % of the whole genome: 12.4%

    So much for junk DNA!

    So, enhancer DNA would be more than 8 times as abundant as protein coding DNA, at least.

    But if enhancers were really one million, or even more, as many believe, that figure could go up to 25% of the whole genome or more.
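
    A rough sketch of this arithmetic follows. It is only a naive linear scaling: it assumes that each additional enhancer would cover, on average, the same fraction of the genome as the elements in the table, and it ignores overlaps between elements. The variable names are hypothetical.

```python
# Figures from Table 1, "All sources combined".
ELEMENTS_IN_TABLE = 434_139
GENOME_PERCENT_COVERED = 12.4

# Naive linear scaling to one million enhancers (assumption: same mean
# coverage per element, no overlaps).
percent_if_one_million = GENOME_PERCENT_COVERED * 1_000_000 / ELEMENTS_IN_TABLE

# A ratio of "more than 8 times" implies a protein coding fraction of the
# genome below this value (in percent).
implied_coding_percent = GENOME_PERCENT_COVERED / 8

print(round(percent_if_one_million, 1), round(implied_coding_percent, 2))
```

    With these assumptions, one million enhancers would correspond to roughly 28.6% of the genome, in line with the “25% or more” above.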

    3) Enhancers apparently form some higher association structures, regions where many enhancers are present and that could represent specially important nodes of transcription regulation. Some refer to these as “super enhancers”.

    4) Great progress is being made in techniques that can image enhancer-promoter activity and therefore 3D chromatin topology dynamically, in space and time: we can expect many new important discoveries from that kind of research. Here are a couple of very recent examples:

    Enhancer functions in three dimensions: beyond the flat world perspective

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5981187/

    Abstract:

    Transcriptional enhancers constitute a subclass of regulatory elements that facilitate transcription. Such regions are generally organized by short stretches of DNA enriched in transcription factor-binding sites but also can include very large regions containing clusters of enhancers, termed super-enhancers. These regions increase the probability or the rate (or both) of transcription generally in cis and sometimes over very long distances by altering chromatin states and the activity of Pol II machinery at promoters. Although enhancers were discovered almost four decades ago, their inner workings remain enigmatic. One important opening into the underlying principle has been provided by observations that enhancers make physical contacts with their target promoters to facilitate the loading of the RNA polymerase complex. However, very little is known about how such chromatin loops are regulated and how they govern transcription in the three-dimensional context of the nuclear architecture. Here, we present current themes of how enhancers may boost gene expression in three dimensions and we identify currently unresolved key questions.

    and:

    Dynamic interplay between enhancer–promoter topology and gene activity

    https://www.nature.com/articles/s41588-018-0175-z

    Abstract:

    A long-standing question in gene regulation is how remote enhancers communicate with their target promoters, and specifically how chromatin topology dynamically relates to gene activation. Here, we combine genome editing and multi-color live imaging to simultaneously visualize physical enhancer–promoter interaction and transcription at the single-cell level in Drosophila embryos. By examining transcriptional activation of a reporter by the endogenous even-skipped enhancers, which are located 150 kb away, we identify three distinct topological conformation states and measure their transition kinetics. We show that sustained proximity of the enhancer to its target is required for activation. Transcription in turn affects the three-dimensional topology as it enhances the temporal stability of the proximal conformation and is associated with further spatial compaction. Furthermore, the facilitated long-range activation results in transcriptional competition at the locus, causing corresponding developmental defects. Our approach offers quantitative insight into the spatial and temporal determinants of long-range gene regulation and their implications for cellular fates.

  47. 47
    DATCG says:

    Gpuccio, Lovely work 🙂

    I’m too busy to post much, but lurking. And the Chromatin take down was excellent 😉

    If I have time, may ask questions or add over the weekend once I have time to digest your full post!

    ex-junk! Hmmm … yep… that’s a keeper! 🙂

    former junk, formerly thought to be junk, surprise, this is not junk, DNA Junk found to have function Junk! 😉

    Hahaha…. oh my!

    When will blind, unguided Darwinist run out of room Junk for their theory to work Junk?

    Have fun guys! This will be fun reading.

  48. 48
    PeterA says:

    #45:

    “The biggest mystery of all, probably, is that so many people are still convinced that the neo-darwinist theory can be a good explanation for those major events, while it can not even explain the appearance of a single new complex functional protein.”

    This made me laugh unstoppably. How funny, though a sad reality at the same time. Well written.

    Thanks.

  49. 49
    gpuccio says:

    DATCG:

    Welcome, I was missing you! 🙂

    I am sure you will contribute brilliantly as always. There is no rush, take your time. 🙂

  50. 50
    gpuccio says:

    PeterA:

    Thanks to you. I think that really came from my heart!

  51. 51
    PaoloV says:

    Peter,
    Don’t laugh. There’s no such mystery anymore. Ask George Castillo in the chromatin thread to reveal it. Apparently he knows how the eukaryote histones evolved from their bacterial ancestor proteins. Quite simple.
    He made gpuccio run for the hills.
    🙂

  52. 52
    jawa says:

    Paolo,
    Keep those silly jokes off this serious discussion.
    If George Castillo knows so much, why did he avoid answering gpuccio’s and UB’s questions?
    Wake up and smell the coffee!
    You may want to study basic biology 101 before commenting here.

  53. 53

    George’s performance on the chromatin thread is a non-starter. He pretends to himself that he has somehow penetrated the thrust of the GP’s argument, when in fact he hasn’t even made a dent. Anyone who has followed GP’s argument knows very well that descent is not an issue with GP. It never has been. He’s made the argument for descent himself many times. The issue is the mechanism involved, and to that George has zilch.

    And the reason he won’t answer my question is because a) logic, empirical evidence, and history are not in his favor, and b) the comfortable vagaries of materialism must be protected at all costs. Answering that question in earnest is strategic suicide. So he replaces the answer with insults and puffery; the intellectual equivalent of whistling past the graveyard.

  54. 54
    gpuccio says:

    UB:

    I think you are absolutely right.

    Indeed, George Castillo has stated more than once that I don’t want to accept his “evidence” for a bacterial origin of eukaryotic histones because it goes against my personal opinions.

    That’s really strange, because I have always accepted without any difficulty the evidence he quoted for an archaeal origin of the eukaryotic histone fold.

    So, why should a bacterial origin be against my personal opinions, while I am glad to accept an archaeal origin? Am I so partial to archaea? What have bacteria done to me personally? (OK, some fever here and there, I suppose 🙂 )

    The simple truth is that I am convinced by the evidence he quoted for archaea, while I consider, at best, very inconclusive the evidence he quoted for bacteria.

    One thing that I find really depressing is having to discuss with someone who is not interested at all in truth, or even in others’ ideas, and considers everything only as a personal fight for some not well defined agenda.

    Better to just avoid that kind of people.

  55. 55

    Reading some of the papers listed, leading to papers not listed, I am struck by the use of the word “mark”.

    With discrete objects described as “marks”, we can understand the functioning of these systems. Without them, we would only measure and describe the dynamics of the system (using language and descriptions we already have) but we would understand nothing else.

    How appropriate that an abstract concept is at the very center of our descriptions and understandings; an organizational utility used to specify something among alternatives, a control.

    It’s interesting that the parts of the system that must be recorded in our descriptions (in order for those descriptions to be useful to us) are the teleological and the irreducible. Materialists just can’t catch a break.

  56. 56
    PaoloV says:

    UB,
    Very interesting observation.
    Thanks.

  57. 57
    gpuccio says:

    UB:

    You really make a great point at #55! 🙂

    You have caught a profound concept, which is probably related to the central idea of consciousness and its properties.

    Indeed, an engineered algorithm is something that we understand, not just a series of connections between steps. That’s the difference between artificial intelligence and intelligence, where “artificial” in the end stands for “not really true”.

    We write programs using programming languages, and not directly machine code, for the same reason: we need to understand what is happening. Semiosis, the ability to project abstract thinking and intuitions into material events, is what makes us human. And gives us power that no other material process in the universe seems to have, including the power to generate complex functional information.

    That’s why ID is such an important worldview: it’s not only the best explanation for biological realities, and probably for the universe itself; it’s a key to understanding the deeper levels of reality that are hidden behind everything we experience.

    If neo-darwinists are happy with describing things without understanding them, we certainly beg to differ. And, in the end, even those who do not recognize understanding as a precious and unique experience are forced to use understanding words and constructs to just communicate what they believe.

  58. 58
    gpuccio says:

    To all:

    About super-enhancers.

    This has just come out on Pubmed (August 31, 2018).

    Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers

    https://www.tandfonline.com/doi/abs/10.1080/15592294.2018.1514231

    Abstract:

    Super-enhancers and stretch enhancers represent classes of transcriptional enhancers that have been shown to control the expression of cell identity genes and carry disease- and trait-associated variants. Specifically, super-enhancers are clusters of enhancers defined based on the binding occupancy of master transcription factors, chromatin regulators, or chromatin marks, while stretch enhancers are large chromatin-defined regulatory regions of at least 3,000 base pairs. Several studies have characterized these regulatory regions in numerous cell types and tissues to decipher their functional importance. However, the differences and similarities between these regulatory regions have not been fully assessed. We integrated genomic, epigenomic, and transcriptomic data from ten human cell types to perform a comparative analysis of super and stretch enhancers with respect to their chromatin profiles, cell type-specificity, and ability to control gene expression. We found that stretch enhancers are more abundant, more distal to transcription start sites, cover twice as much the genome, and are significantly less conserved than super-enhancers. In contrast, super-enhancers are significantly more enriched for active chromatin marks and cohesin complex, and more transcriptionally active than stretch enhancers. Importantly, a vast majority of super-enhancers (85%) overlap with only a small subset of stretch enhancers (13%), which are enriched for cell type-specific biological functions, and control cell identity genes. These results suggest that super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers, and importantly, most of the stretch enhancers that are distinct from super-enhancers do not show an association with cell identity genes, are less active, and more likely to be poised enhancers.

    A pre-version of the full paper is available at biorxiv, here:

    https://www.biorxiv.org/content/biorxiv/early/2018/04/30/310839.full.pdf

    Let’s try to understand.

    It seems that in the last few years two special classes of enhancers have been independently defined and studied by different groups of researchers:

    1) Super-enhancers are “defined based on their enrichment for binding of key master regulator TFs, Mediator, and chromatin regulators. These cluster of enhancers are cell type-specific, control the expression of cell-identity genes, are sensitive to perturbation, associated with disease, and boost the processing of primary microRNA into precursors of microRNAs”.

    2) Stretch-enhancers, instead, are large genomic regions with enhancer characteristics, defined based on their size (>3kb).

    So, the concept of super-enhancer depends essentially on enrichment of the binding of master regulators, while the concept of stretch-enhancer is based only on the length of the enhancer region.

    The point is that many think that the two concepts are overlapping. Both these special classes of enhancers seem to be cell type-specific and related to the control of cell identity genes.

    The quoted paper, instead, finds important differences between the two classes. They considered existing databases of known sequences already independently defined as super-enhancers or stretch-enhancers, and analyzed those sequences in 10 human cell lines.

    The results are very interesting, and you can see them detailed in Fig. 1 of the paper.

    In brief they found, in those 10 cell lines:

    a) An average of 745 superenhancers with mean size 22,812 bp

    b) An average of 11,160 stretch enhancers with mean size 5,060 bp

    c) So, super-enhancers seem to be significantly longer than stretch-enhancers, but they are much less numerous. See Fig. 1, which details the number of the two types of regions in each cell line.

    d) Fig. 1 b shows that, in each cell line, the two categories cover a different fraction of the genome, always greater for stretch-enhancers, with maximum values of about 5% for stretch-enhancers and about 1% for super-enhancers.

    e) Super-enhancers are usually nearer to the promoter and the TSS (Fig. 1 c).

    f) Super-enhancers are much more evolutionarily conserved (Fig. 1 d).

    g) Super-enhancers are highly enriched for active chromatin marks, while stretch-enhancers are highly enriched for poised chromatin marks (Fig. 2, a and b).

    h) Super-enhancers are significantly more active and located in open regions than stretch enhancers, which are more likely to be poised.

    i) Super-enhancers are enriched with cohesin and CTCF binding, a sign of active loop formation (Fig. 3, a and b).

    j) Super-enhancers are transcriptionally more active than stretch enhancers, as shown by RNA Pol II binding and other markers (Fig. 4, a,b).

    k) Super-enhancers are transcriptionally more active than stretch enhancers, IOWs generate more eRNA (Fig. 4, c,d).

    l) Stretch enhancers are less cell-type-specific than super-enhancers.

    m) While the two classes are definitely distinct, there is some overlap which has definite features: a vast majority of super-enhancers (85%) overlap with only a small number of stretch enhancers (13%), and the overlapping regions (super-stretch enhancers) are definitely smaller in size (Fig. 5, a,b,c).

    n) These special overlapping regions (super-stretch enhancers) are cell-type-specific and control key cell identity genes.

    o) In general, enhancers are more likely to be cell-type-specific, transcriptionally active, and frequently interacting when found in clusters at the genomic scale, whatever their sizes.
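    To get a rough feeling for the numbers in the list above, the average counts and mean sizes can be multiplied to estimate how much sequence each class covers in a typical cell line. This is only a back-of-the-envelope sketch: the genome size of about 3.1 billion bp is my assumption, and the product of two averages only approximates the real per-cell-line coverage shown in Fig. 1 b.

```python
# Average values per cell line, as quoted in the list above.
SUPER_COUNT, SUPER_MEAN_BP = 745, 22_812
STRETCH_COUNT, STRETCH_MEAN_BP = 11_160, 5_060

# Assumption (not from the paper): human genome size of about 3.1 billion bp.
GENOME_BP = 3_100_000_000


def covered_bp(count, mean_bp):
    """Approximate total sequence covered by a class of regions."""
    return count * mean_bp


super_bp = covered_bp(SUPER_COUNT, SUPER_MEAN_BP)
stretch_bp = covered_bp(STRETCH_COUNT, STRETCH_MEAN_BP)

print(f"super-enhancers:   {super_bp:,} bp ({super_bp / GENOME_BP:.2%} of the genome)")
print(f"stretch enhancers: {stretch_bp:,} bp ({stretch_bp / GENOME_BP:.2%} of the genome)")
print(f"stretch / super coverage ratio: {stretch_bp / super_bp:.1f}")
```

    Even with individual regions that are about four times shorter, stretch enhancers come out covering more of the genome overall, simply because there are so many more of them.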

    I think these things are very interesting. A whole new level of detail is rapidly unfolding.

  59. 59

    Junk DNA. Right.

    All that transcription, ain’t no big deal.

  60. 60
    EugeneS says:

    UB #55

    Absolutely! Shallit and other romantic defenders of all good from all bad cannot do anything of substance with the fact that, apart from living organisms, in all known universe signalling systems are observed only in correlates of intelligence.

    I have seen different tactics of dissenters that they employ against ID ranging from panpsychism to a complete dismissal of abductive reasoning as ‘a simulacra, an ideosyncrasy of Chalse Peirce’.

    One of them, remarkably, said, in an attempt to debunk ID, that all manners of things must have happened and did happen, but we now see only what survived. He asked me, why does a photon from a distant star get right into my retina? Mind you, the only problem is to demonstrate the ease with which life originates…

    What a disgrace!

    And, of all people, these then claim that they are standing for science against obscurantism.

  61. 61
    EugeneS says:

    Charles Peirce, of course…

  62. 62
    gpuccio says:

    EugeneS:

    The photon retina nonsense that you quote seems to be a creative variant of the old and infamous “deck of cards argument” which goes more or less:

    “When you draw the 52 cards in some specific random order, that result is extremely unlikely (probability = 1/52!, about 1.24e-68). But it has happened! Therefore, extremely unlikely events happen all the time.”

    That is of course full evidence, for the fans of the argument, that ID is doomed.

    I think that someone raised again a similar argument recently, in a discussion here, but I cannot remember who.

    Suffice it to say that this kind of reasoning is one of the best examples of the depths of inanity that can be reached by the human mind.

  63. 63
    gpuccio says:

    To all:

    What happens at super-enhancer regions?

    Very complex things, it seems. Involving not only the expected biochemical reactions and protein-protein interactions, but also intrinsically disordered regions (IDRs), interesting phase separations, and so on.

    See here:

    Coactivator condensation at super-enhancers links phase separation and gene control.

    https://www.ncbi.nlm.nih.gov/pubmed/29930091

    Abstract:

    Super-enhancers (SEs) are clusters of enhancers that cooperatively assemble a high density of the transcriptional apparatus to drive robust expression of genes with prominent roles in cell identity. Here we demonstrate that the SE-enriched transcriptional coactivators BRD4 and MED1 form nuclear puncta at SEs that exhibit properties of liquid-like condensates and are disrupted by chemicals that perturb condensates. The intrinsically disordered regions (IDRs) of BRD4 and MED1 can form phase-separated droplets, and MED1-IDR droplets can compartmentalize and concentrate the transcription apparatus from nuclear extracts. These results support the idea that coactivators form phase-separated condensates at SEs that compartmentalize and concentrate the transcription apparatus, suggest a role for coactivator IDRs in this process, and offer insights into mechanisms involved in the control of key cell-identity genes.

    IDRs are a favourite of our small group of commenters, especially DATCG and me! 🙂

  64. 64
    gpuccio says:

    To all:

    This is very recent and definitely very much in favor of the central role of intrinsically disordered regions (IDRs) and intrinsically disordered proteins (IDPs):

    The evolutionary origins of cell type diversification and the role of intrinsically disordered proteins.

    https://www.ncbi.nlm.nih.gov/pubmed/29394379

    Abstract
    The evolution of complex multicellular life forms occurred multiple times and was attended by cell type specialization. We review seven lines of evidence indicating that intrinsically disordered/ductile proteins (IDPs) played a significant role in the evolution of multicellularity and cell type specification:

    (i) most eukaryotic transcription factors (TFs) and multifunctional enzymes contain disproportionately long IDP sequences (≥30 residues in length), whereas highly conserved enzymes are normally IDP region poor;

    (ii) ~80% of the proteome involved in development are IDPs;

    (iii) the majority of proteins undergoing alternative splicing (AS) of pre-mRNA contain significant IDP regions;

    (iv) proteins encoded by DNA regions flanking crossing-over ‘hot spots’ are significantly enriched in IDP regions;

    (v) IDP regions are disproportionately subject to combinatorial post-translational modifications (PTMs) as well as AS;

    (vi) proteins involved in transcription and RNA processing are enriched in IDP regions; and

    (vii) a strong positive correlation exists between the number of different cell types and the IDP proteome fraction across a broad spectrum of uni- and multicellular algae, plants, and animals.

    We argue that the multifunctionalities conferred by IDPs and the disproportionate involvement of IDPs with AS and PTMs provided a IDP-AS-PTM ‘motif’ that significantly contributed to the evolution of multicellularity in all major eukaryotic lineages.

    Emphasis mine, just to show the connection to this thread.

    Note the “disordered/ductile” double meaning proposed for the “D”! 🙂

  65. 65
    George Castillo says:

    I will agree that answering your question is strategic suicide, upright, because

    1. you have carefully designed the question in a way that will force most people (read: people with little knowledge of biology) to eventually pigeonhole themselves into saying that the first aaRs was “made from memory” after some number of other aaRs were somehow made not from memory (by chance? I guess is the alternative?)

    2. the entire premise of your question is a strawman as you are unflinchingly rigid in the definition of an aaRs, its functions/roles, and the system itself, but the conversation is in fact about the evolution of the system which occurred millennia ago. Your question is not representative of how anyone thinks this system evolved.

    3. no one knows how the translation system evolved; how information was first encoded in a genome and how that genome was converted into a functional molecule. It is an incredibly difficult question to ask and to try to answer. But saying the evolution of this process is so complex, or that transcription is so complex and therefore they must have been designed, is not the answer. Invoking some designer when the road gets tough might make it easy for you to sleep at night, but if everyone did that, we’d still be banging stones together to cook our dinner.

  66. 66
    jawa says:

    George Castillo,

    you’re completely off target, buddy.

    the issue is not about how much we know or don’t know, it’s that such evolutionary mechanism in the case we are discussing is purely imaginary, it doesn’t exist at all.

    we’re not talking about complexity, we’re talking about functional complexity that definitely has been designed

    you won’t find a non-design way to get that, no matter how much time you give it.

    time to wake up and smell the flowers in the garden

    just see the trend… every new research discovery points to design

    the inexorable march of the design revolution is going to take many people by surprise, but then we will tell them “I told you so”

    the poisonous pseudo-scientific hogwash should be removed from science-related publications

    As Tom Hanks said in the movie “Sully”:

    “can we get serious now?”

  67. 67
    PeterA says:

    George Castillo,

    do you have any argument against what has been presented in this thread?

    go ahead, tell us what it is

  68. 68
    jawa says:

    George Castillo,

    as somebody said in this website before, it’s not what we don’t know, but what we do know, that points unambiguously to intelligent design… and tomorrow we shall know more

    have a cup of tea and listen to this

  69. 69
    gpuccio says:

    George Castillo:

    However difficult it is to say it, welcome here. 🙂

    I agree only with one of the things that you have said: it is easy for me to sleep at night.

  70. 70
    gpuccio says:

    To all:

    So, now we have had our:

    “There is nothing in all these things that neo-darwinism can’t explain”.

    We are not alone any more. Hooray! 🙂

  71. 71
    George Castillo says:

    “do you have any argument against what has been presented in this thread?”

    I don’t really see anything besides a high school-level regurgitation of some wikipedia pages with a smattering of some copy-pasting from a handful of research articles.

    What do you think I should have an argument against?

  72. 72
    EugeneS says:

    GP

    Exactly!

    The argument against this nonsense is an a priori specification!

    The ‘Five of a kind’ pattern repeated N times in a row admits a short description, whereas ‘retina in the eye’=’random hand’ does not.

  73. 73
    EugeneS says:

    George,

    If you don’t mind my interference, the existence of code translation is a prerequisite of biological evolution, not the other way around. For evolution to even kick-start, a system MUST be self-replicating, open-ended and semantically closed.

    “No one knows how the translation system evolved”.

    This is already wrong! It did not! You guys are painting yourselves into a corner by a priori dismissing design. Hysteretic graduality is not an answer.

    It must have been front-loaded, not evolved. Just like we humans front-load programs on to a computer to execute. There is no such thing as “algorithmic causality” in nature. Any pair {code,interpreter} in nature points to intelligence.

    The price you guys pay for dismissing the awkward questions is dismissing evidence.

  74. 74
    gpuccio says:

    To all:

    This is an even more detailed and irrefutable form of the argument:

    “There is nothing in all this high school-level regurgitation of some wikipedia pages with a smattering of some copy-pasting from a handful of research articles that neo-darwinism can’t explain”.

  75. 75
    gpuccio says:

    EugeneS:

    Of course, the probability of having some sequence when we draw 52 cards is:

    1 = necessity

    That’s how “improbable” that result is!

    Of course, the probability of having some specific sequence, declared in advance in its contingent detail, is 1/52!, about 1.24e-68.

    And the probability of having some ordered sequence depends on how we define “ordered” and is certainly higher than the probability of one single sequence, depending on how big a target space is defined by our definition of order. However, in most cases, the target space will be far smaller than the search space, and the probability will be extremely low. The result is that well ordered sequences will never be observed.
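    The three cases above can be written as one ratio between target space and search space. Here is a minimal sketch: the search space is the 52! possible orderings of the deck, while the target of one million “ordered” sequences is a purely hypothetical number, chosen only to show how little a generous definition of order changes the result.

```python
from math import factorial

# Search space: number of distinct orderings of a 52-card deck.
SEARCH_SPACE = factorial(52)


def probability(target_size):
    """Chance that one random shuffle lands in a target set of the given size."""
    return target_size / SEARCH_SPACE


p_any = probability(SEARCH_SPACE)  # some sequence, whatever it is
p_specific = probability(1)        # one sequence declared in advance
p_ordered = probability(10**6)     # hypothetical: a million "ordered" sequences

print(f"some sequence:            {p_any}")
print(f"one specific sequence:    {p_specific:.4e}")
print(f"hypothetical ordered set: {p_ordered:.4e}")
```

    The first probability is exactly 1, which is the whole point: getting some sequence is a necessity, while getting a pre-specified one, or one from a small target space, is not.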

  76. 76
    gpuccio says:

    To all:

    A few useful hints to produce really scientific posts:

    a) Never look at Wikipedia

    b) Never go to high school

    c) Never copy and paste anything: if possible, copy everything manually

    d) Strictly avoid research articles, especially handfuls of them. It is probably admissible to refer to one, or to thousands at a time, but never handfuls.

    Courtesy of George Castillo.

  77. 77
    George Castillo says:

    To all:

    A few useful hints to produce really scientific posts:

    a) Whoa! Look how complex this is!

    b) It must be designed!

    Courtesy of UD et al.

  78. 78
    gpuccio says:

    To all:

    This is, again, about chromatin topology:

    Genomic meta-analysis of the interplay between 3D chromatin organization and gene expression programs under basal and stress conditions

    https://epigeneticsandchromatin.biomedcentral.com/articles/10.1186/s13072-018-0220-2

    Abstract:

    Background
    Our appreciation of the critical role of the genome’s 3D organization in gene regulation is steadily increasing. Recent 3C-based deep sequencing techniques elucidated a hierarchy of structures that underlie the spatial organization of the genome in the nucleus. At the top of this hierarchical organization are chromosomal territories and the megabase-scale A/B compartments that correlate with transcriptional activity within cells. Below them are the relatively cell-type-invariant topologically associated domains (TADs), characterized by high frequency of physical contacts between loci within the same TAD, and are assumed to function as regulatory units. Within TADs, chromatin loops bring enhancers and target promoters to close spatial proximity. Yet, we still have only rudimentary understanding how differences in chromatin organization between different cell types affect cell-type-specific gene expression programs that are executed under basal and challenged conditions.

    Results
    Here, we carried out a large-scale meta-analysis that integrated Hi–C data from thirteen different cell lines and dozens of ChIP-seq and RNA-seq datasets measured on these cells, either under basal conditions or after treatment. Pairwise comparisons between cell lines demonstrate a strong association between modulation of A/B compartmentalization, differential gene expression and transcription factor (TF) binding events. Furthermore, integrating the analysis of transcriptomes of different cell lines in response to various challenges, we show that A/B compartmentalization of cells under basal conditions significantly correlates not only with gene expression programs and TF binding profiles that are active under the basal condition but also with those induced in response to treatment. Yet, in pairwise comparisons between different cell lines, we find that a large portion of differential TF binding and gene induction events occur in genomic loci assigned to A compartment in both cell types, underscoring the role of additional critical factors in determining cell-type-specific transcriptional programs.

    Conclusions
    Our results further indicate the role of dynamic genome organization in regulation of differential gene expression between different cell types and the impact of intra-TAD enhancer–promoter interactions that are established under basal conditions on both the basal and treatment-induced gene expression programs.

    Here they are considering one of the highest levels of topology, the A-B compartments, in different cell lines, both at basal conditions and in response to various treatments.

    There are many interesting things here, but for the moment I would point to the following:

    a) Table 1, that shows that in each cell line there are about 14000 – 15000 protein coding genes assigned to compartment A and 4000 – 5000 assigned to compartment B.

    b) Fig.1, that shows that compartment B is always transcriptionally less active than compartment A in all cell lines (but certainly not inactive)

    c) Fig. 2, that shows how two cell lines, when compared, show definite differences in gene assignment to compartments A and B, and therefore in gene expression.

    d) Table 2, that shows how differences in A-B assignment between the same two cell types correspond to definite differences in epigenetic markers.

    e) Table 3, that shows how specific cell treatments are associated with specific variations in assignments to A-B compartments.

  79. 79
    gpuccio says:

    George Castillo:

    Have you finally understood that we in ID infer design by evaluating the functional complexity of biological objects?

    My compliments.

  80. 80
    bill cole says:

    Hi gpuccio

    Thank you so much for posting this. I look forward to reviewing it in detail. I think you have continued to strengthen the case the eukaryotic cell was a unique origin event as was the first multicellular life.

  81. 81
    gpuccio says:

    bill cole:

    Hi Bill, glad to hear from you! 🙂

    Yes, OOL, eukaryotes and metazoa are really the three biggest steps in the fascinating history of life engineering.

  82. 82
    gpuccio says:

    To all:

    One of the big unsolved problems is how specific TFs bind enhancers, and how the specificity of that interaction draws the amazing specificity of the global interaction between 1 million enhancers, tens of thousands of promoters, and tens of thousands of genes, both protein coding and non protein coding.

    This recent paper gives some interesting hints:

    Intrinsic DNA Shape Accounts for Affinity Differences between Hox-Cofactor Binding Sites.

    https://www.ncbi.nlm.nih.gov/pubmed/30157419

    Abstract:

    Transcription factors bind to their binding sites over a wide range of affinities, yet how differences in affinity are encoded in DNA sequences is not well understood. Here, we report X-ray crystal structures of four heterodimers of the Hox protein AbdominalB bound with its cofactor Extradenticle to four target DNA molecules that differ in affinity by up to ?20-fold. Remarkably, despite large differences in affinity, the overall structures are very similar in all four complexes. In contrast, the predicted shapes of the DNA binding sites (i.e., the intrinsic DNA shape) in the absence of bound protein are strikingly different from each other and correlate with affinity: binding sites that must change conformations upon protein binding have lower affinities than binding sites that have more optimal conformations prior to binding. Together, these observations suggest that intrinsic differences in DNA shape provide a robust mechanism for modulating affinity without affecting other protein-DNA interactions.

    So, TFs can bind to different DNA sequences with different 3D shapes, but with similar final conformations. The difference between the basic conformation of the DNA sequence and the final conformation of the bound complex translates into lower affinity. It is amazing to think how that kind of structural information can have a role in modulating the global, complex and highly functional results of transcription.
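
    To give a sense of scale, here is a rough sketch that is not taken from the paper. It assumes the roughly 20-fold affinity difference is a ratio of dissociation constants measured near 25 °C, and uses the standard relation ΔΔG = RT ln(ratio) to convert it into a binding free-energy difference. The function name and the temperature are illustrative assumptions.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def delta_delta_g(fold_difference, temperature_k=298.15):
    """Free-energy difference (kcal/mol) for a given fold difference in Kd.

    Assumption: the fold difference is a ratio of dissociation constants
    measured at the same temperature.
    """
    if fold_difference <= 0:
        raise ValueError("fold_difference must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_difference)


if __name__ == "__main__":
    print(f"{delta_delta_g(20.0):.2f} kcal/mol")
```

    Under those assumptions, a 20-fold difference in affinity corresponds to a little under 2 kcal/mol.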

    The point is: there are many different ways that functional information about the procedures to be implemented seems to be written in DNA, and particularly in enhancers: sequence, position, spatial relationship to promoters and to other enhancers, length, structure. And maybe something else that we still have to understand.

    Multiply that by 1 million individual nodes of functional information…

  83. 83
    bill cole says:

    Hi George

    3. no one knows how the translation system evolved; how information was first encoded in a genome and how that genome was converted into a functional molecule. It is an incredibly difficult question to ask and to try to answer. But saying the evolution of this process is so complex, or that transcription is so complex and therefore they must have been designed, is not the answer. Invoking some designer when the road gets tough might make it easy for you to sleep at night, but if everyone did that, we’d still be banging stones together to cook our dinner.

    At a minimum, the design argument has put the runaway speculation of Neo-Darwinism in check. If you read many of the papers in the scientific literature that assume, as a working hypothesis, that Neo-Darwinism or universal common descent by mutational modification is true, you can see how this unsupported set of assertions has misled science.

  84. 84
    ET says:

    George Castillo:

    I don’t really see anything besides a high school-level regurgitation of some wikipedia pages with a smattering of some copy-pasting from a handful of research articles.

    Well given your kindergarten level of understanding I can understand what was posted is way over your head, so much so it has you all upset.

    The design inference is more than the existence of mere complexity. And it still stands that any given design inference can be easily refuted if someone could demonstrate non-telic processes can produce it.

    So, instead of being in denial why don’t you just step up and do something beyond your hand-wave and flailing? Or does that help you sleep better at night?

  85. 85
    ET says:

    George Castillo:

    3. no one knows how the translation system evolved; how information was first encoded in a genome and how that genome was converted into a functional molecule. It is an incredibly difficult question to ask and to try to answer. But saying the evolution of this process is so complex, or that transcription is so complex and therefore they must have been designed, is not the answer. Invoking some designer when the road gets tough might make it easy for you to sleep at night, but if everyone did that, we’d still be banging stones together to cook our dinner.

    Nice equivocation. Intelligent Design is NOT anti-evolution. The point is that no one even has a clue as to how blind and mindless processes could produce such a thing. And yet we have ample evidence of intelligent agencies doing so.

    Science 101 therefore mandates the design inference and all it entails.

  86. 86
    bill cole says:

    Hi George

    . no one knows how the translation system evolved;

    -or how DNA evolved
    -or how metabolic systems evolved
    -or how the spliceosome evolved
    -or how the ubiquitin system evolved
    -or how the nuclear pore complex evolved
    -or how exons and introns evolved
    We can go on forever, but one question remains unanswered. With all this uncertainty, how is it that some people still claim that life is the result of evolution?

  87. 87
    gpuccio says:

    bill cole:

    Good points. 🙂

    “No one knows” and yet we have to believe that everybody knows.

    At the same time, everybody knows how functionally complex things originate: they are always designed by conscious intelligent beings. Everybody knows, and yet we have to believe that such universal knowledge must never be applied to biological objects.

    It’s a really strange world we live in.

  88. 88
    ET says:

    George-

    Intelligent Design operates via Four Rules of Scientific Reasoning from Principia Mathematica– as science should:

    Rule 1 We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.

    Rule 2 Therefore to the same natural effects we must, as far as possible, assign the same causes.

    Rule 3 The qualities of bodies, which admit neither intensification nor remission of degrees, and which are found to belong to all bodies within the reach of our experiments, are to be esteemed the universal qualities of all bodies whatsoever.

    Rule 4 In experimental philosophy we are to look upon propositions inferred by general induction from phenomena as accurately or very nearly true, notwithstanding any contrary hypothesis that may be imagined, till such time as other phenomena occur, by which they may either be made more accurate, or liable to exceptions.

    Following those rules provides ID with a means of being falsified. It also tells us to use and trust our knowledge of cause and effect relationships. Science 101

    In short, we infer these things were intelligently designed because everything we know says that they were.

  89. 89
    bill cole says:

    Hi George

    But saying the evolution of this process is so complex, or that transcription is so complex and therefore they must have been designed,

    This is not the design argument. The design argument states that the only known source of sustainable amounts of functional information is conscious intelligence. We know this mechanism works so we infer it as the cause of genetic information. Couscous intelligent beings are capable of designing functional sequences.

    Functional sequences are partially responsible for the complexity we are observing in biology.

  90. 90
    gpuccio says:

    To all:

    I think that when George Castillo, like many others in the neo-darwinist field, makes a caricature of the argument:

    “How complex this is! It must be designed!”

    he is well aware that when we speak of complexity we mean functional complexity, and not just absolute complexity.

    Nobody really looks at the disposition of the grains of sand on a beach, or of stains on a wall, and says: Oh, how complex that is, it must be designed!

    And yet, grains of sand and stains have great absolute complexity: you need a lot of bits just to describe their configurations.

    But if we look at a computer, or a watch, or an airplane, then it is perfectly normal to say: how complex it is, it must be designed.

    Because what we see is functional complexity, not absolute complexity.

    Now, the main positions of our interlocutors are usually:

    a) There is no such thing as functional complexity (which, of course, is completely false, otherwise a computer and the stains on a wall would be the same kind of thing).

    b) It’s impossible to measure functional complexity (completely false, look at Szostak or Durston, for example. And I have given measures of functional complexity in biological objects in almost all my posts)

    c) There is no connection between functional complexity and design (completely false, the only examples we know of functional complexity are human designed artifacts and biological objects. There is no known example of functional complexity that originates from a non design system).

    d) Well, even if all known examples of functional complexity whose origin can be traced with certainty are the result of design, still it is possible that the second class of functionally complex objects, biological objects, originated from a non design system.

    I believe that the only vaguely reasonable position for design critics is d). Of course, that position can be held only if supported by some serious attempt at explaining how and why biological objects are the only exception to a very general rule. IOWs, some serious attempt to explain how functional complexity arises in biological objects. I am aware of no such attempt. Of course, neo-darwinism is an attempt, but it is certainly not serious. 🙂

    But George Castillo seems to stick to c): there is no connection between functional complexity and design.

    Or, more precisely, if we invoke a designer for designed things “we’d still be banging stones together to cook our dinner”.

    So, I believe that George Castillo is still looking for some credible explanation for computers, watches and airplanes that does not involve design. So that he can cook his dinner more comfortably, maybe using designed machines.

    Oh, I know what his next “argument” will probably be. But I am not going to say it. Let’s wait for him.

  91. 91
    EugeneS says:

    George #77

    You missed out one critical bit, an a priori specification of function.

    Can you give an empirical example of a semiotic relationship (“sign–referent”) arising in inanimate nature outside of an explicit decision making process?

  92. 92
    Eugene S says:

    Bill Cole #86

    A good observation indeed.

    The root cause of this is that evolutionism is a faith. The same faith in ‘all-powerful eternal matter’ as articulated by Epicurus, Plotinus, Origen, Spinoza, Hegel, Lenin and others. The only difference is that today it camouflages as science.

    It is absolutely safe to say that if you meet somebody who claims not to believe in evolution, that person is ignorant, stupid or insane (or wicked, but I’d rather not consider that).

    If that gives you offence, I’m sorry. You are probably not stupid, insane or wicked; and ignorance is no crime in a country with strong local traditions of interference in the freedom of biology educators to teach the central theorem of their subject.

    Richard Dawkins, “Put Your Money on Evolution”. The New York Times Review of Books: p. 35. 1989-04-09.

    A very characteristic, revealing and self-contradictory quote indeed, linking religious belief (‘evolution’) to science (‘the central theorem’).

  93. 93
    PeterA says:

    I feel really sorry for George Castillo.
    The poor guy had lost the discussion before it started, but he’s not aware of this yet.

  94. 94
    timothya says:

    Bill Cole:

    “Couscous intelligent beings”.

    Heh.

    Sorry, but sometimes autocorrect produces hilarious results. This must have been a Pastafarian intervention. All hail the durum wheat deity.

  95. 95
    George Castillo says:

    Gpuccio, can you point me to the paper where your functional complexity calculation originates from?

  96. 96
    ET says:

    George-

    Start with these:

    Kirk K. Durston, David K. Y. Chiu, David L. Abel, Jack T. Trevors, Measuring the functional sequence complexity of proteins, Theoretical Biology and Medical Modelling, Vol. 4:47 (2007)

    and

    Robert M. Hazen, Patrick L. Griffin, James M. Carothers, and Jack W. Szostak, Functional information and the emergence of biocomplexity, Proceedings of the National Academy of Sciences, USA, Vol. 104:8574–8581 (May 15, 2007).

  97. 97
    George Castillo says:

    Which one of those does gpuccio use to calculate functional complexity/information, I can’t tell?

  98. 98
    ET says:

    You haven’t read either of them. Read them and then ask questions

  99. 99
    George Castillo says:

    Have you read them?

  100. 100
    PeterA says:

    As far as I recall, gpuccio’s method is his own, which evolved through gradual random variations from the ones described in the papers cited by ET. I could’ve misunderstood that though.
    Better wait for gpuccio’s reply.

  101. 101
    R J Sawyer says:

    This is off topic but reading through this, and several other UD themes recently, has suggested a good test to my feeble little mind. Let me explain.

    There are a few themes that frequently resurface here:

    1) ID is a science with a clearly defined theory, it is falsifiable and makes testable claims.
    2) Science is inherently biased against ID (or, against anything that is anti-evolution).
    3) There is systemic censorship in the science community and the media.
    4) Linked to #3, the peer review process is seriously flawed.

    GP, KF and others have put extensive work, thought and effort to produce some very well thought out and presented articles. My suggestion, to test 2 through 4 above would be to have one of these authors draft a publication quality manuscript on the subject of this OP. Preferably, it should be co-authored by a few of the more prolific authors here. It can then be submitted to a journal in one of the respected publishing houses (eg, Springer or Elsevier). If it gets rejected, the author(s) can then write an OP here or at Evolution News, including links to the manuscript and the reviewers comments.

    I don’t see a down side to this little experiment. If it is accepted and published, this information will be widely disseminated amongst the people who can actually make changes to the system. If it is rejected, the reviewers comments and editor’s final decision may shine a light on the mindset of those who control the flow of information in the scientific community. Just food for thought.

  102. 102
    George Castillo says:

    “As far as I recall, gpuccio’s method is his own”

    Why reinvent the wheel?

    “draft a publication quality manuscript…can then be submitted to a journal…I don’t see a down side to this little experiment”

    The authors know their intended audience: laymen.
    Ask a scientist in this field to review what has been posted here and it would be torn apart.
    And I can guarantee they will ask the same questions I did; where did the method of calculating functional info/complexity come from, and also why did you feel the need to invent your own method, and why not use one of the methods that are already published?

  103. 103
    R J Sawyer says:

    GC@102. I agree that the current OP would be torn apart. That is why I suggested that it be re-drafted as a publication quality paper (abstract, introduction, methods, results, discussion, references). It may still get torn apart, but that would be educational in itself. What reasons are they using to tear it apart? It is these reasons that must eventually be addressed if ID hopes to be accepted as a legitimate alternative to evolution.

  104. 104
    PeterA says:

    George Castillo and R J Sawyer,

    How long have you been following gpuccio’s OPs?

  105. 105
    gpuccio says:

    To all:

    Wow, some activity here!

    OK, let’s start with George Castillo.

    I have explained a few aspects of my method to compute functional information in proteins in my comment #32 here, in answer to EugeneS.

    What PeterA says at #100 is correct: the method I use in my OPs and comments is my own. However, it is based on the same principles used by Durston in his paper:

    Measuring the functional sequence complexity of proteins

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217542/

    already quoted by ET at #96.

    The other paper quoted by ET:

    Functional information and the emergence of biocomplexity

    http://www.pnas.org/content/104/suppl_1/8574

    is very interesting too, because it is by non ID friendly authors, Szostak and others, and it deals very seriously with the concept of functional information, its definition and some applications of it, even if it does not apply it to proteins in particular.
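
    For readers who want the definition in concrete form, here is a minimal sketch of functional information as that paper defines it: minus the base-2 logarithm of the fraction of all possible configurations that achieve at least a given degree of function. The function name and the toy activity values are illustrative assumptions, not data from the paper.

```python
import math


def functional_information(activities, threshold):
    """Functional information in bits: -log2 of the fraction of
    configurations whose activity is at least `threshold`."""
    if not activities:
        raise ValueError("activities must not be empty")
    hits = sum(1 for a in activities if a >= threshold)
    if hits == 0:
        raise ValueError("no configuration reaches the threshold")
    return -math.log2(hits / len(activities))


if __name__ == "__main__":
    # Toy configuration space (an assumption): 1024 configurations,
    # 4 of which are functional.
    toy_activities = [0.0] * 1020 + [1.0] * 4
    print(functional_information(toy_activities, threshold=0.5))
```

    The rarer the functional configurations, the higher the value. If every configuration qualifies, the result is zero bits.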

    My method is based on the same assumption used by Durston: evolutionary conservation of protein sequence is a marker of functional constraint. That is an assumption that no serious neo-darwinist could deny, because it is at the core itself of neo-darwinian thought.
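
    A minimal sketch of that assumption, in the spirit of Durston's measure rather than a reproduction of his published procedure: each aligned position contributes the drop in Shannon entropy from a null state to the observed functional state. The null state of 20 equiprobable amino acids, the gap-free toy alignment and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def functional_bits(alignment):
    """Sum over columns of (null entropy - observed entropy), in bits.

    Assumptions: sequences are pre-aligned, equal length, gap-free,
    and the null state is 20 equiprobable amino acids.
    """
    null_entropy = math.log2(20)
    return sum(null_entropy - column_entropy(col) for col in zip(*alignment))


if __name__ == "__main__":
    toy_alignment = ["MKV", "MKL", "MRV", "MKI"]  # hypothetical sequences
    print(round(functional_bits(toy_alignment), 3))
```

    A fully conserved column contributes the maximum of log2(20), about 4.32 bits. A column that varies freely contributes close to nothing.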

    What I do is simply to use that conservation to measure functional information using blast, a tool universally used to compare protein sequences, and its homology score as a measure of functional information, provided that the blast results be between proteins which are separated by a very big evolutionary gap: in most of my examples, I have compared proteins between pre-vertebrates and vertebrates, therefore separated by more than 400 million years in their evolutionary history.
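
    For reference, the score that blast reports in bits follows the standard Karlin-Altschul normalization of the raw alignment score, S' = (λS − ln K) / ln 2, and the expected number of chance hits is E = m·n·2^(−S'). The sketch below only illustrates those two formulas. The default λ and K are typical values reported for BLOSUM62 with gap costs 11/1 and are used here as assumptions, as are the function names.

```python
import math


def bit_score(raw_score, lam=0.267, k=0.041):
    """Normalize a raw alignment score to bits: (lambda*S - ln K) / ln 2.

    Assumption: lam and k are typical gapped BLOSUM62 parameters; real
    values depend on the scoring matrix and gap costs in use.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


def expected_hits(bits, query_len, db_len):
    """Expected number of chance alignments scoring at least `bits`."""
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    bits = bit_score(500)  # hypothetical raw score
    print(round(bits, 1), expected_hits(bits, 1000, 10**9))
```

    Each additional bit halves the expected number of chance hits for the same search space.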

    So, as you can see, I have not “reinvented the wheel”. I have made my personal wheel, and it works very well.

    Now, you can agree with what I do and with my conclusions, or (much more likely) you can disagree. That’s your personal choice.

    But, if you want to discuss what I do, and have my attention, you have to say something specific about what I do: just declaring that my method is not published in the scientific literature is simply boring: I know, you know, everybody knows. Many times I have said that I have no desire to publish anything about ID in the scientific literature. I publish my ideas here, and that is a definite choice. I have my reasons, and I have also explained them occasionally. And, of course, you are free to say all that you want about my choice. But again, that is simply boring and irrelevant.

    But, as you too are publishing your ideas here and not in a scientific journal, if you want attention to your ideas you must make them interesting. If you disagree with what I do, explain why. All the rest is irrelevant.

  106. 106
    R J Sawyer says:

    GP@105. I used your OP as an example simply because I was reading through it at the time that the idea came to my mind. There are others here who post equally comprehensive and well thought out OPs. KF and Johnnyb come to mind. They could run with it if they wanted to.

    I just think that submitting a manuscript to one of the respected publishers would be a very informative exercise. If a paper is properly drafted, I suspect it would be rejected and the reviews would be along ideological lines rather than a criticism of its scientific merits. Which would be very revealing because I don’t recall anything like this being done before. The risk, obviously, is if it is rejected and the reasons for rejecting it are based completely on its scientific merits. But even that unlikely outcome would be informative as it would provide valuable input as to where the ID concepts must be further researched.

  107. 107
    gpuccio says:

    R J Sawyer:

    Thank you for your kind suggestions.

    Regarding your 4 points, I certainly agree with the first 3. I think you should add the word “current” to “science” at number 2, but for the rest I agree.

    Number 4 is more complex. Peer review is probably not the main problem. Journals can reject a paper without even submitting it to peer review. They can just say that the paper may be good, but it is simply not what they are interested in.

    Peer review has many problems, and prejudice against ID is certainly not the only one. Perhaps I would not say that it is “seriously flawed”, but it is not very efficient and reliable. One thing is certain, that it does not guarantee that what is published is good, or that all that is good will be published.

    Your “experiment”, while certainly proposed in good faith, is not very practical, IMO. That kind of paper would certainly be rejected, but that would not demonstrate anything. Journals reject a lot of papers all the time, many of them without any peer review process. And so? In most cases, there is no ideological bias behind those decisions, only more or less valid reasons of other kinds.

    Moreover, this OP in particular does not qualify as independent research: it is more a review of the literature, even if made from an ID viewpoint and with some small personal additions. Other OPs that I have published here are more original in their content.

    Finally, as I have said many times, I have no intention to publish in the scientific literature about ID. My personal conviction is that publishing my ideas here is much more appropriate and, in the end, useful.

    I have serious doubts that the recognition of ID as an important scientific paradigm will happen through a gradual admission of ID friendly papers in the literature, even if that can help a little. My personal idea is that ID will become the main paradigm of science because the scientific world, and all the good components in it, will become tired of defending obviously false ideas after some time, as the accumulation of facts that prove beyond any doubt that those ideas are false reaches a point where almost anyone will be ashamed to deny the evidence. That will come mainly from “traditional” research. Exactly that kind of research that I quote in my OPs.

    In the meantime, it is important that those people who are already fully convinced of the superiority of the ID paradigm may continue to express their ideas, to reason and discuss things from an ID perspective. The ID point of view is important, precious I would say, and it must be defended.

    Here is a good place to do that, and that’s what I try to do.

  108. 108
    gpuccio says:

    R J Sawyer at #106:

    I think I have already answered at #107 (even if I had not yet read your new comment).

    One point: there is absolutely no doubt that “ID concepts must be further researched”. I am absolutely convinced of that.

    ID is at present only a paradigm. Its specific application to biology is still very limited, and the reason for that is very simple: resources are extremely limited.

    Even the theoretical approach is still in its starting steps: there is still much work to do.

    That’s why I always say that ID is a paradigm: it is not simply a theory (even if it includes many different theoretical approaches), and it is not a movement (even if of course there are some organizational aspects in the ID field). It is a paradigm, a way of thinking inspired by the recognition of the importance of conscious design in the natural world and of the folly inherent in denying it a priori, as current science does.

    But frankly, I don’t think that credible suggestions about how to improve ID thinking will come from peer review. They will come from ID itself, or simply from good science out there.

    In the end, only one thing will promote the ID paradigm as time goes by: the fact that it is true.

  109. 109
    PeterA says:

    Maybe some folks here don’t agree with this, but let’s admit that as George Castillo got some credits for adding some “heat” to the “chromatin” thread, now he should get credits for doing the same here. Perhaps the website administrators should think of some kind of incentive rewards for cases like this?
    🙂

  110. 110
    jawa says:

    Peter,
    C’mon buddy, can you stay serious?
    There are important issues discussed here.
    Can you leave the jokes for another occasion?

  111. 111
    bill cole says:

    Hi George

    Ask a scientist in this field to review what has been posted here and it would be torn apart.

    Why are you so confident it will be torn apart?

  112. 112
    gpuccio says:

    bill cole at #111:

    Indeed, I think that it is quite accurate as a review of the literature. Not complete, certainly, otherwise I should have spent much more time in preparing it, but I believe that the most important things are there.

    Of course my personal additions would not be welcome. And of course the general approach has been to keep things as simple and clear as possible, so that also non technical readers could get an idea of things.

    That said, I have tried to do the best I could, and to include some recent and not so obvious concepts from the latest research.

  113. 113
    gpuccio says:

    PeterA at #109:

    Always ready to give credit where credit is due. 🙂

  114. 114
    gpuccio says:

    PeterA at #109:

    Jokes apart, of course the lack of comments from people on the other side does not help.

    I understand that this OP, in particular, is mainly a review of what is known, and as such there is not much that can be said against the things presented here.

    But it is equally true that things are presented here in a certain context, because I believe that those things strongly support the ID point of view.

    So, someone could try to explain why that should not be true, possibly something more detailed than the usual:

    “There is nothing in all these things that neo-darwinism can’t explain”.

    For example, some thoughts about how neo-darwinism is supposed to explain complex regulatory networks, codes, millions of enhancers that control the specificity of transcription, cell differentiation, and so on.

    Or some thoughts about the ever shrinking size of “non functional” DNA, with enhancers, promoters, lncRNAs and so on constantly emerging as big functional parts of the genome.

    Just suggestions…

  115. 115
    R J Sawyer says:

    GP, thank you for your kind response. I will leave you with just one final comment. I don’t want to take this thread too far off topic.

    In the early 20th century, Alfred Wegener proposed continental drift to the scientific community, using the science communication means of the day. He knew it would receive much opposition because it went against the commonly held belief. And it did receive extensive push-back. But he kept plugging away, drawing evidence from different fields including geology, biology, botany and palaeontology. What he lacked until the day of his death was a viable mechanism. This was later discovered and formed the field of plate tectonics. My feeling is the best hope for acceptance of ID is to follow a similar approach.

    Thanks for listening.

  116. 116
    gpuccio says:

    To all:

    OK, this is not exactly about transcription, but it is very interesting.

    Widespread evolutionary crosstalk among protein domains in the context of multi-domain proteins

    https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0203085

    Abstract
    Domains are distinct units within proteins that typically can fold independently into recognizable three-dimensional structures to facilitate their functions. The structural and functional independence of protein domains is reflected by their apparent modularity in the context of multi-domain proteins. In this work, we examined the coupling of evolution of domain sequences co-occurring within multi-domain proteins to see if it proceeds independently, or in a coordinated manner. We used continuous information theory measures to assess the extent of correlated mutations among domains in multi-domain proteins from organisms across the tree of life. In all multi-domain architectures we examined, domains co-occurring within protein sequences had to some degree undergone concerted evolution. This finding challenges the notion of complete modularity and independence of protein domains, providing new perspective on the evolution of protein sequence and function.

    The idea presented here is simple and convincing: while it is true that domains are functional modules, it is equally true that when they are used in multi-domain proteins they need to be adapted to the new function, and that adaptation has to be “concerted”.
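
    As a simplified illustration of what “correlated mutations” means in information-theoretic terms, the sketch below computes the discrete mutual information between two alignment columns. This is not the continuous measure used in the paper. The toy columns and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two equally long alignment columns."""
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * math.log2((c / n) / ((count_a[a] / n) * (count_b[b] / n)))
        for (a, b), c in count_ab.items()
    )


if __name__ == "__main__":
    # Hypothetical columns taken from two different domains of one protein.
    print(mutual_information("AAGG", "TTCC"))  # residues change together
    print(mutual_information("AAGG", "TCTC"))  # residues change independently
```

    In this toy case, columns whose residues always change together share one full bit, while independent columns share none.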

    Of course they call it “concerted evolution”. I would call it, more realistically, concerted re-engineering of the individual modules for the new meta-function.

    I have seen that happen clearly when I blast proteins. While domains are often greatly conserved in distant proteins, for example the DBDs in TFs, they are also different, and that difference has all the features of a specific functional reengineering. I think that happens not only in new multi-domain proteins, but also in the same protein adapted to a new branch of organisms.

  117. 117
    gpuccio says:

    R J Sawyer:

    And thanks to you for the very interesting contribution.

    What you say is interesting. I think, however, that the opposition against ID has deeper ideological reasons than ID simply going against the commonly held scientific belief. At present, ID goes against the general worldview apparently “supported” by current science. It’s more an ideological (religious?) bias than simply a defence of some scientific opinion.

  118. 118

    Sorry I’m late to the party. It’s a long holiday weekend in my part of the world, and I live on a lake, so you can do the math on that. 🙂

    I see George has re-appeared for some more dismissal of evidence. Let’s see what he has to say:

    1. you have carefully designed the question in a way that will force most people (read: people with little knowledge of biology) to eventually pigeonhole themselves into saying …

    You apparently believe it’s somehow illegitimate (or trickery) to ask details about how your model of biological origins results in something being the way we find it today. Obviously, I strongly disagree with you on that point, as should anyone and everyone. It’s not a trick question to point out the well-documented features of the system and ask how they came into being under your paradigm. If those features are so distinct and unique that they’ve been described in the literature as the fundamental and necessary conditions of such systems, and if they were predicted in logic and then experimentally confirmed to be true, then I think it would be rather careless (fairly stupid) to not ask these questions. Indeed, I would like to think that a person with your level of certitude would anticipate the questions, and be ready with a logical answer. Your response, on the other hand, has been to launch insults and complain about the questions. Perhaps you can explain why you should be exempted from responding to physical evidence.

    2. the entire premise of your question is a strawman as you are unflinchingly rigid in the definition of an aaRs, its functions/roles, and the system itself…

    Alan Turing wrote a paper in the 1930s where he presented a programmable information system that (in order to function) mirrored Charles Peirce’s model (written decades earlier) of a necessary triadic relation between an object, a medium of information about that object, and a discrete interpretation of that medium. Turing’s paper would lead directly to the information explosion we are living in today. John Von Neumann then took Turing’s machine (with its Peircean logic intact) and used it to predict the necessary physical conditions of an autonomous self-replicator. Francis Crick et al then experimentally demonstrated the medium of information in DNA, as well as its basic encoding structure. He then went on to predict that a set of discrete objects would be found within the system to serve as the interpretants of the code. These objects were experimentally described later by Hoagland and Zamecnik, confirming Crick as well as Von Neumann, with Turing and Peirce in tow. After Nirenberg and others cracked the code, setting off the information revolution in biology, Howard Pattee presented the specific material conditions of the system in the language of physics, and described the semantic closure required for the system to function (i.e. Von Neumann’s “threshold of complexity”).

    Since you seem to be suggesting that Von Neumann (Peirce, Turing, Pattee) is wrong about those conditions, perhaps you can tell me where the model is incorrect? In order to function, does the system require both the medium of information and the constraints to interpret it? Does it require semantic closure in order to replicate itself? If you cannot tell me where the established model is wrong, then please do tell me why I shouldn’t refer to it, or why your arguments should be exempted from it.

    but the conversation is in fact about the evolution of the system which occurred millennia ago. Your question is not representative of how anyone thinks this system evolved.

    This is assuming your conclusion. This is assuming your conclusion in the face of universal evidence to the contrary, followed by an irrelevant (and fallacious) appeal to authority.

    3. no one knows how the translation system evolved; how information was first encoded in a genome and how that genome was converted into a functional molecule.

    This is just more of the same.

    George, if you intend to answer no questions, to deal with no details, or consider how any evidence might impact your beliefs, then at least try to be entertaining.

  119. 119

    Eugene Selensky is entirely correct; it is the presence of the system and its semantic closure that enables Darwinian evolution to occur, not the other way around.

  120. 120

    GP, I missed where you suggested we might be better off to just ignore George and his endless ideological defenses. Sorry for taking your thread off course.

  121. 121

    I wholeheartedly agree with Bill Cole at #111.

    “Why?”

  122. 122
    ET says:

    Yes, George, I have read the papers.

    Ask a scientist in this field to review what has been posted here and it would be torn apart.

    Nonsense. Ask a scientist in this field to say how blind and mindless processes could have done it and you will see a scientist implode from failure.

  123. 123
    gpuccio says:

    UB at #120:

    Don’t worry. George Castillo has contributed to the discussion, and made it more lively, inspiring good commenters like you and others to clarify important points.

    Unfortunately, good antagonists have become really scarce here, so we are grateful for what we have. 🙂

  124. 124
    ET says:

    Earth to R J Sawyer- There isn’t anything in peer-review that supports evolution by means of blind and mindless processes. And yet that is the mainstream position- that evolution proceeds by means of blind and mindless processes.

    The scientists who reject ID do so for personal reasons. They definitely cannot refute any of its claims and that is very telling. To refute ID all they have to do is find support for their own position and they can’t.

    That means the best hope for ID is to have the old ignorant guard die out

  125. 125
    ET says:

    R J Sawyer:

    It is these reasons that must eventually be addressed if ID hopes to be accepted as a legitimate alternative to evolution.

    Clueless. Intelligent Design is NOT anti-evolution– so learning what the debate is all about would be a good place to start. And unlike blind watchmaker evolution ID makes testable claims.

    So the question is what is a scientifically viable alternative to Intelligent Design?

  126. 126
    gpuccio says:

    ET:

    Well, I can live with the potential terror of being torn apart by some imaginary peer review.

    In the meantime, I am just waiting to be torn apart by George Castillo, or by any other interlocutor here. After all, they (like you and the other friends) are my peers: we write on this forum without any pretense of authority, and the things we say are the only stuff that counts.

  127. 127
    gpuccio says:

    ET:

    “That means the best hope for ID is to have the old ignorant guard die out”

    Sad but true.

  128. 128
    ET says:

    gpuccio- I understand your point as constructive criticism, the kind offered by scientists who care about their craft, is always a good thing.

    Here you are, acting in good faith and actually presenting an argument reasoned from the evidence. Doing what “they” say we cannot, have not and will not do- yes even in the face of the overwhelming evidence to the contrary. And all “they” can do is flail away and then puke out a “threat” of someone, someday, gonna tear it apart.

    It would be hilarious if it wasn’t so pathetic.

  129. 129
    George Castillo says:

    “So, as you can see, I have not ‘reinvented the wheel’. I have made my personal wheel…..”
    Uhh, what?……is it square by any chance?
    Let’s just put that little tidbit to the side for now.

    Wouldn’t it make more sense to just use the published methods for calculating functional bits?
    Do you have a good reason not to use that method?
    Why do you think your method is better?

  130. 130
    OLV says:

    gpuccio,

    Delightful presentation of such a fascinating topic!

    I’m enjoying it.

    Thanks.

  131. 131
    OLV says:

    EugeneS,

    Excellent comments, as usual. Thanks.

  132. 132
    OLV says:

    UB, ET, bill cole:

    I like your valuable contributions. Thanks.

  133. 133
    OLV says:

    129 George Castillo,

    gpuccio has explained his clever method more than once in this website.

    You may want to read it to understand it well, before you can comment on it. Just a suggestion.

    I’m sure gpuccio will enjoy discussing that topic.

  134. 134
    OLV says:

    102 George Castillo,

    “laymen”? “torn apart?”

    this is an open website that can be read by anybody with internet access that is interested in the discussed topics.

    here’s a case where a scientist tried hard to tear apart gpuccio’s presentation a couple of years ago:

    Who?

    ID debate with a professor 11, 14, 18, 25, 26, 27, 33

  135. 135
    OLV says:

    102 George Castillo,
    “laymen”? “torn apart?”

    Here’s another debate between gpuccio and a scientist who tried unsuccessfully to tear apart gpuccio’s presentation: 25, 50, 51, 56, 130, 164

    After seeing such an embarrassing failure, do you think another scientist would like to go through a similar experience here?

    You tell us.

  136. 136
    OLV says:

    George Castillo,

    You may want to verify what you write before you post a comment here. Just a friendly suggestion.

  137. 137
  138. 138
    gpuccio says:

    George Castillo at #129:

    All wheels are not the same. Without being square, they differ in size, structure, materials, specific purposes, and so on.

    My method and Durston’s both rely on evolutionary conservation to measure functional complexity. Therefore, they are both wheels. The basic idea is the same.

    However, the way to use sequence conservation as an estimate of functional information is different.

    I have already discussed that here with EugeneS and referred you to that discussion (at #105).

    However, for your convenience, I quote here myself from #32, the relevant part:

    The Durston method is different because he starts with a selected group of homologue proteins in different species, then does a multiple alignment of all of them, and applies a definite computation of the reduction of uncertainty for each aminoacid site, based on Shannon’s formula. IOWs, he is comparing the variance at each aminoacid site in a set of proteins restrained by a common function with the theoretical variance for random sequences, where each AA site can be occupied by each AA. He then sums the values for each site to get a global functional information value for that protein family.

    In that case, one absolutely conserved AA contributes 4.3 bits of functional information, while a site which has a random variance of AAs contributes 0 bits, and then there are all possible intermediate situations.
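
    The per-site computation described above can be sketched in a few lines. This is only an illustration of the idea, not Durston's published code: the uniform null over 20 amino acids, the gap-free alignment, and the function names `site_bits` and `functional_bits` are all assumptions made for the example.

```python
import math
from collections import Counter

# Assumption: the "random" reference is a uniform distribution over the
# 20 standard amino acids, so the maximum uncertainty per site is log2(20).
MAX_BITS = math.log2(20)  # about 4.32 bits


def site_bits(column):
    """Reduction of Shannon uncertainty (in bits) for one alignment column."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return MAX_BITS - entropy


def functional_bits(aligned_sequences):
    """Sum the per-site values over all columns of a gap-free alignment."""
    return sum(site_bits(col) for col in zip(*aligned_sequences))
```

    With this sketch, a fully conserved column gives about 4.3 bits and a column in which all 20 amino acids are equally frequent gives 0 bits, matching the two extreme cases mentioned above.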

    Durston’s method is very good, but it is much more difficult to apply: you have to choose your set of proteins, align them and do the computations for each site. It has its potential biases too, because of course the choice of the sequences, and the manual review of the alignment, are very important.

    In a sense, my method is easier to apply, and easier to verify by anyone. It just requires a correct use of the blast algorithm and of the available protein databases at the blast site.

    The choice of considering the best hit for each group of organism is very reliable, because at the levels of exponential improbabilities that are interesting for ID the simple existence of some high homology for more than 400 million years is an undeniable sign of extreme functional constraint. If there are no errors in the database, it’s impossible that such high exponential values of homology may be due to any random variance.
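
    As a rough sketch of the "best hit for each group of organisms" step: given blast hits already labeled by taxonomic group, keep the highest bitscore per group. The input format, the function name `best_hit_per_group`, and the group labels in the usage lines are hypothetical and chosen only for illustration; they are not output of the blast site.

```python
def best_hit_per_group(hits):
    """Return {group: highest bitscore} from an iterable of (group, bitscore)."""
    best = {}
    for group, bitscore in hits:
        if group not in best or bitscore > best[group]:
            best[group] = bitscore
    return best


# Illustrative values only (not real blast results).
example_hits = [
    ("non-vertebrate deuterostomes", 250),
    ("non-vertebrate deuterostomes", 300),
    ("cartilaginous fish", 1000),
    ("cartilaginous fish", 870),
]
best = best_hit_per_group(example_hits)
```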

    The choice of using human proteins as probes to measure functional information has its definite reasons, as I have explained in my previous comments.

    And this is from my comment #27:

    The important point is: I never reason in terms of absolute information content, only in terms of functional information. Indeed, all the “jumps” I analyze and discuss are jumps (deltas) in functional information.

    Why? Because I have applied a special procedure, that I have tried to explain in some detail when possible.

    What I measure is “human conserved information”, IOWs the bits of homology to the human form of the protein. Another way to say it is that I use human proteins as “probes” to measure the evolutionary history of proteins in relation to the form that the protein assumes in humans.

    I could as well use as probes the proteins in bees, for example (indeed, I have done that in some cases, for specific comparisons). My choice of human proteins as “measuring probes” has, however, a few important motivations:

    a) Of course, we are naturally interested in human functions

    b) The human proteome is probably the best investigated and reviewed

    c) Humans are a very recent species, so human proteins can be considered, in their final form, a recent result

    d) I am specially interested in the transition to vertebrates, and humans are recent vertebrates. So, the time distance between the original split to vertebrates (and then from cartilaginous fish to bony fish) and the recent split to humans is more than 400 million years

    So, why is a jump in human conserved information observed after the split to vertebrates (in my graphs, that is the transition from non vertebrate deuterostomes to cartilaginous fish) a good measure of a variation in functional information?

    Well, the reasoning is simple, and it relies on assumptions that cannot be easily denied, especially by neo-darwinists.

    If some specific sequence appears for the first time after the split to vertebrates, and is then conserved for more than 400 My, then we can safely assume that the sequence is functional. Indeed, the measure that it is conserved (IOWs, the bits of information that appear for the first time in vertebrates and are conserved to humans) is a measure of its functional restraint.

    So, if a homolog of some human protein (let’s call it A) has a maximum homology hit with the human form, before the appearance of vertebrates, of say 300 bits, and then we find 1000 bits of homology in cartilaginous fish, then we can say that 700 bits of human conserved, and therefore functional, information have appeared at the transition to vertebrates. If protein A is, say, 1000 AAs long, that is a jump of 0.7 bits per aminoacid site (about one third of the total information content of the protein in a blast comparison).
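
    The arithmetic of this example can be written out as a small sketch. The function name `information_jump` is hypothetical; the numbers are the ones used in the paragraph above.

```python
def information_jump(bits_before, bits_after, protein_length):
    """Return (jump in bits, jump in bits per aminoacid site)."""
    jump = bits_after - bits_before
    return jump, jump / protein_length


# Protein A: 300 bits before vertebrates, 1000 bits in cartilaginous fish,
# 1000 aminoacids long.
jump, per_site = information_jump(300, 1000, 1000)
print(jump, per_site)  # 700 0.7
```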

    The 400 million years gap is important: indeed, it is an evolutionary distance that guarantees that any non functional sequence homology will be completely cancelled by neutral variation, as can be easily seen in synonymous sites. Therefore, any homology that is conserved for such a time is under extremely strong functional constraint.

    So, I hope that I have explained my “methods of calculating functional information in an amino acid sequence”, your b) question. This is not the only way to do it, of course. The Durston method, that has inspired all my reasonings, is slightly different. However, I find this method quick and reliable.

    I have also explained in detail some of my procedures here:

    Bioinformatics tools used in my OPs: some basic information.

    https://uncommondescent.com/intelligent-design/bioinformatics-tools-used-in-my-ops-some-basic-information/

    And the procedure itself is explained in some further detail in this OP, which is the first where I have extensively applied it:

    The highly engineered transition to vertebrates: an example of functional information analysis

    https://uncommondescent.com/intelligent-design/the-highly-engineered-transition-to-vertebrates-an-example-of-functional-information-analysis/

  139. 139
    DATCG says:

    Gpuccio, thanks, will try to catch up as I can.

    My weekend ended up different than planned with cousins coming into town.

    But I see you have things well in hand as usual 🙂

  140. 140
    DATCG says:

    Upright Biped @53,

    re: Chromatin post

    Yes, much puffery, insults and paper bluffing. I read 1st one GC listed on the Chromatin thread. Found it to be usual stuff with usual caveats of Darwinist propaganda thrown in, with usual claims and appeals using “could be” “might be” yada yada, thrown in, but nothing overturning the failure of neo-Darwinism to account for life by blind, unguided steps of the kind leading to macro-form evolutionary events.

    What’s so funny is GC is defending today what many Darwinists are now admitting is weak, failed, needs replacing or, even by the staunchest of Darwinist defenders, at least needs to be updated. Especially since the findings of ENCODE have torn asunder the House of Darwin.

    Royal Society has proposed major changes and openly admitted the failures of neo-Darwinism.

    Many scientists today openly admit these failures, for example at the Royal Society. Not to go off-topic, but really the problems have yet to be solved.

    As reported by Paul Nelson and David Klinghoffer on the meeting of the Royal Society in 2016…

    The opening presentation at the Royal Society by one of those world-class biologists, Austrian evolutionary theorist Gerd Müller, underscored exactly Meyer’s contention. Dr. Müller opened the meeting by discussing several of the fundamental “explanatory deficits” of “the modern synthesis,” that is, textbook neo-Darwinian theory. According to Müller, the as yet unsolved problems include those of explaining:

    Phenotypic complexity (the origin of eyes, ears, body plans, i.e., the anatomical and structural features of living creatures); Phenotypic novelty, i.e., the origin of new forms throughout the history of life (for example, the mammalian radiation some 66 million years ago, in which the major orders of mammals, such as cetaceans, bats, carnivores, enter the fossil record, or even more dramatically, the Cambrian explosion, with most animal body plans appearing more or less without antecedents); and finally Non-gradual forms or modes of transition, where you see abrupt discontinuities in the fossil record between different types.

    As Müller has explained in a 2003 work (“On the Origin of Organismal Form,” with Stuart Newman), although “the neo-Darwinian paradigm still represents the central explanatory framework of evolution, as represented by recent textbooks” it “has no theory of the generative.”

    In other words, the neo-Darwinian mechanism of mutation and natural selection lacks the creative power to generate the novel anatomical traits and forms of life that have arisen during the history of life.

    Yet, as Müller noted, neo-Darwinian theory continues to be presented to the public via textbooks as the canonical understanding of how new living forms arose – reflecting precisely the tension between the perceived and actual status of the theory that Meyer described in “Darwin’s Doubt.”

    Yet, the most important lesson of the Royal Society conference lies not in its vindication of claims that our scientists have made, gratifying as that might be to us, but rather in defining the current problems and state of research in the field. The conference did an excellent job of defining the problems that evolutionary theory has failed to solve, but it offered little, if anything, by way of new solutions to those longstanding fundamental problems.

    Meanwhile, Dan Graur steams with his “Junk” DNA meltdown and proclamations that at least 75% of DNA must be “JUNK.” Of course this is based upon neo-Darwinist rhetoric by Dan Graur himself – doubling down on old assumption.

    https://evolutionnews.org/2017/07/dan-graur-anti-encode-crusader-is-back/

    From Evolution News…

    “We read the paper, and looked over Graur’s accompanying PowerPoint. We’re not impressed by theoretical population genetics because it is based on neo-Darwinian assumptions rather than biological realities.

    Basically, he is using that circular science to add a quantitative gloss to his fundamental position, namely that if ENCODE is right then evolution is wrong, and evolution can’t be wrong, so ENCODE can’t be right.”

    Thanks Dan Graur – “If ENCODE is right, then evolution is wrong” of course meaning Darwinist evolution of blind, unguided mutations is wrong, not however if guided.

    GC is living in the past much like Graur. Assumptions made upon failed speculations of the past, largely based upon ignorance. And unfortunately text books still teaching the failures of the past along with scientist in research journals maybe, coulda, mighta, … happened in the past 😉

    Darwinism = Story Telling.

    ENCODE is a game changer. Dan Graur has every right to be fearful and angry. He knows the stakes.

    As he said and is worth repeating, “If ENCODE is right, then evolution(blind, unguided) is wrong.”

    Have a good day guys. And sorry Gpuccio for going off topic 🙂

    The incredible amount of new functions once written off as “Junk” that will be found will continue to undermine people like Graur and Castillo’s claims. We’re just at the beginning of all of this…

    https://evolutionnews.org/2017/09/design-in-the-4th-dimension-the-4d-nucleome-project/

  141. 141
    DATCG says:

    Gpuccio @5

    The Ubiquitin Magic hat PTM trick I see is working 🙂

    Post Translation Modification – just magically poofed into being by a series of blind, unguided events 😉

    Let’s go now to post-translational modifications (PTMs) of histones. This is certainly the most relevant epigenetic level of transcription regulation.

    In brief, histones have “tails” that can be modified by attaching various kinds of groups to them. So, each of the four histone types in the nucleosome (usually H2A, H2B, H3 and H4) can be methylated or polymethylated, acetylated, phosphorylated, ubiquitinated, sumoylated, biotinylated and many other things, at different aminoacid sites, usually lysines or arginines. The combinatorial result is that more than 150 different histone PTMs have been described.

    Ubiquitination requires recognition does it not? To target and mark for PTM? Forgive me, can’t remember if recognins are involved with histones or not.

    I forget all the steps you listed Gpuccio on the other link to the great Ubiquitination Post.

    Anyways, yep, amazing stuff again being coordinated by specific actions and reactions to events. Not accidental mutations, but by targeted guidance systems.

    OK, must run.

  142. 142
    DATCG says:

    Interesting…

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6033341/

    “Thus, UHRF1 can be considered as a master regulator of TSGs as it coordinates DNA methylation and histone modifications at their promoters [13, 14, 17–19]”

    We found that several E3 ligases appeared in the UHRF1 complex, while several deubiquitinases moved from the complex upon TQ treatment (Table 1). However, we did not observe the presence of SCF (β-TrCP) that has been identified as being an E3 ligase that ubiquitinates UHRF1 upon DNA damage [43]. In contrast, we found three E3 ligases that were significantly enriched in UHRF1 complexes upon treatment with TQ (Table 1). These E3 ligases were UBR5 (Ubiquitin Protein Ligase E3 Component N-Recognin 5), DDB1-CUL4A and HUWE1. Two other E3 ligases, namely UBR4 and RING1, were found but only in the UHRF1 complex purified from TQ-treated cells and in a weak amount (Table 1). UBR5 contributes to tumor initiation and progression [44]. CUL4A-DDB1 tandem functions as an oncogene [45, 46], HUWE1 has rather a tumor suppressor activity [47, 48]. In parallel, we observed the presence of several deubiquitinases, among which the major was USP7 confirming previous reports that this is the major deubiquitinase regulating UHRF1 stability [38, 39, 49, 50]. Considering that several E3 ligases have been identified, we first intended to determine the putative contribution of the auto-ubiquitination activity of UHRF1.

    A TF was mentioned as well, RelB…

    Accordingly, it has recently been shown that UHRF1 down-regulation in cancer cells induced caspase-8 dependent apoptosis and the activation of caspase-3 [53]. We suggest that auto-ubiquitination inactivates UHRF1, mimicking a down-regulation, and by this way activates caspase-3, which further activates its degradation. Consistently with this, polyubiquitination does not always associate with proteasomal degradation, as the activity of the transcription factor RelB was shown to be enhanced by polyubiquitination [54].

  143. 143
    George Castillo says:

    Got it, gpuccio, so you have reinvented the wheel simply to have a method that is “easier to apply.”
    That would surely go over great with any reviewers!

    Have you at least done a comparison to Durston to see just how bad your method is?

  144. 144
    ET says:

    No, George, clearly all you have is your belligerence and ignorance.

    That will never go over with anyone

  145. 145
    OLV says:

    DATCG,
    Very informative posts, as usual.
    Thanks.

  146. 146
    gpuccio says:

    George Castillo:

    It’s not easy to make a comparison. Durston gives values only for 35 protein families, and most of them are not comparable to human and vertebrate proteins, or there is not enough information to identify correctly what domain was used for the alignments.

    In the very few cases where I could make a reasonable comparison, my method definitely underestimates functional complexity as compared to Durston’s (on average, my results are 60-70% of his). That’s what I have always said, and is connected to how the blast algorithm works. So, my method seems to be more favourable to neo-darwinists.

    But again, the cases I could examine are really just a handful.

    If you prefer to believe that Durston’s method is much better, be my guest. His values are higher and worse for your cause.

  147. 147
    gpuccio says:

    DATCG:

    Very good contributions, as usual. Thank you. 🙂

    Now I have no time, but I hope I can comment more in detail tomorrow.

  148. 148
    George Castillo says:

    Gpuccio, if I’m not mistaken, your method and Durston’s are both solely dependent on known sequences.
    Surely you could produce some correlation plots that compare functional information of proteins based on your method versus Durston’s.

  149. 149

    George, you directed your post at 65 towards me, and after I responded, you’ve gone mute. In order to function, does the system require both a medium of information as well as the set of constraints to interpret it? Did Turing’s machine require both the tape and the state transformations? Does the system require semantic closure in order to replicate?

    In his memoirs, pioneering biologist and Nobel Laureate Sydney Brenner commented, “you would certainly say that Watson and Crick depended on von Neumann, because von Neumann essentially tells you how it’s done.”

    What say you? Was Brenner wrong, along with Von Neumann, Turing, Peirce, and Pattee? Was Crick’s prediction a logical one? Could Nirenberg have calculated the discontinuous association of the gene code, or did he have to demonstrate it? If you cannot articulate any details where the established model of translation is wrong, then why should I not refer to it? Why should the materialist’s claims be exempt from it?

  150. 150
    gpuccio says:

    George Castillo:

    As said, it is difficult to compare the numbers in Durston’s table to mine. I could do that in some way for 8 proteins. Even with that small number, there is a very good correlation (p = 0.00007752, adjusted R-squared = 0.9273).

    I have neither the time nor the resources to compute Durston’s values for a new set of proteins, so that will have to suffice.

    I am adding a scatterplot to the OP.
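
    For readers who want to repeat this kind of comparison on their own values, here is a minimal sketch of the adjusted R-squared computation. It assumes a simple linear regression with one predictor, which the comment above does not actually state; the function name `adjusted_r_squared` is hypothetical, and the eight protein values themselves are not reproduced here.

```python
def adjusted_r_squared(x, y):
    """Adjusted R-squared for a one-predictor linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r_squared = sxy ** 2 / (sxx * syy)
    # Penalize for the single predictor: n - 2 residual degrees of freedom.
    return 1 - (1 - r_squared) * (n - 1) / (n - 2)
```

    Here `x` would hold the values from one method and `y` the values from the other, for the same proteins in the same order.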

  151. 151
    gpuccio says:

    To all:

    Here is an absolutely recent (September 2018) review about the known mechanisms by which lncRNAs implement their functions and are involved in cancer. I highly recommend it:

    Exploring the mechanisms behind long noncoding RNAs and cancer

    Abstract:

    Over the past decade, long noncoding RNAs (lncRNAs) have been identified as significant players in gene regulation. They are often differentially expressed and widely-associated with a majority of cancer types. The aberrant expression of these transcripts has been linked to tumorigenesis, metastasis, cancer stage progression and patient survival. Despite their apparent link to cancer, it has been challenging to gain a mechanistic understanding of how they contribute to cancer, partially due the difficulty in discriminating functional RNAs from other noncoding transcription events. However, there are several well-studied lncRNAs where specific mechanisms have been more clearly defined, leading to new discoveries into how these RNAs function. One major observation that has come to light is the context-dependence of lncRNA mechanisms, where they often have unique function in specific cell types and environment. Here, we review the molecular mechanisms of lncRNAs with a focus on cancer pathways, illustrating a few informative examples. Together, this type of detailed insight will lead to a greater understanding of the potential for the application of lncRNAs as targets of cancer therapies and diagnostics.

  152. 152
    gpuccio says:

    To all:

    The introduction of the paper quoted at #151 is a really good summary of the general features of lncRNAs.

    a) It gives the actual number at about 59000, and the typical range at 1000–10,000 nucleotides.

    b) It reminds us that they are very much similar to protein coding mRNAs, because “they are generally transcribed by RNA polymerase II, 5′ capped, 3′ polyadenylated, and often undergo splicing of multiple exons via canonical genomic splice motifs”.

    c) It explains clearly the 4 main types of lncRNAs (see Fig. 1):

    – Intergenic

    – Bidirectional

    – Antisense

    – Sense overlapping

    IOWs, lncRNAs are transcribed from all possible parts of the genome, in practically all possible ways.

    d) It also mentions, as a separate category, the enhancer transcribed lncRNAs (eRNAs).

    e) It reminds us that “Despite minimal overall sequence conservation across species, many lncRNAs have evolutionarily conserved function, secondary structure, and regions of short sequence homology”

    f) It reminds us that their transcription is often regulated by “well-studied transcription factors and epigenetic marks”

    g) It reminds us that their expression “is often unique to specific cell types, tissues, developmental time frames, and disease”

    h) It reminds us that the small percentage of lncRNAs that have been studied in depth “have been implicated in X chromosome inactivation (Xi), genomic imprinting, nuclear compartmentalization, splicing, stem cell pluripotency, cell cycle progression, cellular reprogramming, apoptosis, and many diseases”

    i) It reminds us that lncRNAs effect their functions “through the regulation of gene expression, translational control, structural cellular integrity, protein localization and degradation”

    j) It reminds us that they “can associate with a wide range of interaction partners including RNA binding proteins (RBPs), transcription factors, chromatin-modifying complexes, nascent RNA transcripts, mature mRNA, microRNA, DNA, and chromatin”

    Wow! 🙂

  153. 153
    gpuccio says:

    To all:

    But how do lncRNAs really implement their functions?

    That’s exactly the real subject of the paper quoted at #151. It lists some very interesting modalities, each of them well documented by known examples:

    a) They serve as guides “for the proper localization/organization of factors at specific genomic loci for regulation of the genome”. IOWs, they “bind to regulatory or enzymatically active proteins, such as transcription factors and chromatin modifiers, to direct them to precise locations in the genome”.

    b) They serve as dynamic scaffolds, providing a central platform, often short-lived, “for the transient assembly of multiple enzymatic complexes and other regulatory co-factors”.

    c) They work as decoys, “sequestering RNA-binding proteins, transcription factors, microRNAs, catalytic proteins and subunits of larger modifying complexes”, and limiting their availability.

    Quite a range of different, interesting, creative and very specific mechanisms, I would say. And this is only what we understand today.

  154. 154
    gpuccio says:

    To all:

    The rest of the paper quoted at #151 is dedicated to the well proven role of many lncRNAs in various kinds of cancer. I will not go into detail about that, but just have a look at Fig. 3 in the paper if you want to get an idea of how complex this subject is, even with the little we know at present.

  155. 155
    gpuccio says:

    DATCG at #142:

    Wow, E3 ubiquitin-protein ligases that “commit auto-ubiquitination” under the menace of anticancer phytochemicals, TFs whose activity is enhanced by polyubiquitination…

    Truth is definitely stranger than fiction!

  156. 156
    DATCG says:

    Gpuccio @155,

    Thought you might like that 🙂 The coordination is amazing as is often the timing. There’s a window of time for all of these interactions taking place or it becomes too costly, inefficient, even lethal as multiple process dependent layers require completion and notifications during each procedural event.

    On your scatter plot – nice! I thought your explanation was sound and reasonable prior to providing it.

    As I read above the scatter plot, noticed your point on Histone Code.

    “Beyond its extraordinary functional complexity, this regulation network also uses at its very core at least one big sub-network based on a symbolic code: the histone code. Therefore, it exhibits a strong and complex semiotic foundation.”

    “… at least one big sub-network based upon a symbolic code: the histone code.”

    Which as we’ve seen interacts with multiple other networks, sub-networks and codes. Like the Ubiquitin Code.

    Yes 🙂 TFs activity enhanced by “poly-ubiquitination” and here are TF’s regulated by ubiquitination…

    Abstract
    Ubiquitination is a post-translational modification that defines the cellular fate of intracellular proteins.

    It can modify their stability, their activity, their sub-cellular location, and even their interacting pattern.

    This modification is a reversible event whose implementation is easy and fast. It contributes to the rapid adaptation of the cells to physiological intracellular variations and to intracellular or environmental stresses.

    E2F1 (E2 promoter binding factor 1) transcription factor is a potent cell cycle regulator.

    It displays contradictory functions able to regulate both cell proliferation and cell death.

    Its expression and activity are tightly regulated over the course of the cell cycle progression and in response to genotoxic stress.

    I discuss here the most recent evidence demonstrating the role of ubiquitination in E2F1’s regulation.

    Here’s the link:
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5666869/

    Regulation of E2F1 Transcription Factor by Ubiquitin Conjugation

    Laurence Dubrez1,2

    The dance of life is a series of highly regulated, highly coordinated networked and sub-network systems that are transcribed along with post-translation-modification and regulation.

    The only way a system knows what to modify post-translation is to have a target pattern to match. The Codes and Presets exist prior to the post-translation modifications. The information is already existent. Semiotics is real and obvious in the Interactome and in the interactivity between shared system networks within the nucleus and across its boundary.

    It’s not blindly allowing mutations to appear. There’s not a magical set of new mutated information waiting to be called upon to effect change to forms or cellular processing.

    What we see is highly organized, prescriptive, preventive measurements and Coded, Conditional Reactive States just like any programming technique by Design.

    There is of course sometimes a purpose for built-in random allowances, like the immune system.

    But usually mutations to these networked systems are deleterious as a mutation enters any PTM phase, Code, or interaction. And might be fatal.

    Gpuccio, you end with this …

    “So, the last question could be: can all this be the result of a neo-darwinian process of RV + NS of simple, gradual steps?”

    IMHO, no. First gradualism was killed off long ago I thought, but that’s another trail to take.

    Next, this neo-Darwinian “process” of RM* + NS is insufficient for creation of novel forms, much less the network systems we see today within the cells, that must be coordinated interactions and timing with multiple sub-networks and coded systems, any one of which might be compromised by a single deleterious mutation.

    Far from being an innovator, Darwinism is a stopper, as is evidenced by the findings of ENCODE and thousands upon thousands of new papers since ENCODE discovering more new functions 24/7 around the globe. Debunking “Junk” DNA daily and killing the blind-folded messengers of Darwinism.

    *RM – note I utilized random mutation, not random variation although Gpuccio I agree Variation is the key, allowable variation within constraints I think.

    Darwinists always use random mutation, but since ENCODE has dawned and the light shined in on these incredible Information Networks within the cells, the fact they keep thinking random mutations will be an innovative force for survivable new forms is extremely perplexing.

    It’s like Dan Graur and the old school cannot comprehend the science in front of them today and are so stuck in the past, they keep clinging to antiquated, ignorant beliefs.

    To credit others, though, many are moving on despite those like Graur who will not let go of their blind security blanket.

  157. 157
    gpuccio says:

    DATCG:

    It’s absolutely evident that RM + NS cannot explain any of the things described in the OP and in the following comments. I just thought that it was redundant to say it again.

    Dan Graur! Never seen any better example of blind dogmatism and bias.

    Yes, ENCODE, FANTOM and others are marching on. Because, very simply, they are on the right side: the side of science and truth.

    The functions of non coding DNA and RNAs are no more a hypothesis, least of all a rare exception: they are the absolute, all pervading rule.

  158. 158
    DATCG says:

    Gpuccio 157,

    Well said, indeed, and the pace is ever increasing for new functions found – in “junk” DNA

    🙂

  159. 159
    DATCG says:

    Forgot to add the following Intro paragraph from the TF paper I posted in #156…

    E2F1 (E2 promoter binding factor 1) is the founding member of an evolutionarily conserved family of transcription factors that play critical roles in the regulation of the cell cycle and apoptosis.

    It was identified in 1987 as a cellular factor able to bind a sequence element in the adenovirus E2 promoter [1,2,3]. Research into its cellular functions and regulation mechanisms soon led to the finding that E2F1 is a downstream target of the pocket protein Retinoblastoma (Rb) [4,5,6,7]. This discovery linked E2F1 to the cell cycle and paved the way for extensive studies aimed at determining the role of E2F1 in cell cycle regulation, analyzing regulatory mechanisms that control E2F activity and identifying E2F target genes. At present, the E2F family contains 10 members characterized by a highly conserved DNA binding domain (DBD).

  160. 160
  161. 161
    DATCG says:

    Hey Upright, hope you’re doing well 🙂

  162. 162
    OLV says:

    It’s very difficult to catch up with the growing amount of interesting information posted in this thread… gpuccio keeps adding more insightful comments pointing to juicy papers and now to make things “worse” DATCG appeared out of the blue sky and started to flood this thread with interesting comments also pointing to interesting papers.

  163. 163
    EugeneS says:

    OLV @162

    “It’s very difficult to catch up with the growing amount of interesting information”

    Yes, I agree. This is why I have been arguing for a long time for an index by author on this blog.

    Also, maybe it’s even worth coming back to comment rating because there are some really brilliant comments.

  164. 164
    gpuccio says:

    To all:

    Where do the strangest things happen?

    Of course, in the brain.

    This is about the mouse:

    The Evf2 Ultraconserved Enhancer lncRNA Functionally and Spatially Organizes Megabase Distant Genes in the Developing Forebrain

    Abstract:

    Gene regulation requires selective targeting of DNA regulatory enhancers over megabase distances. Here we show that Evf2, a cloud-forming Dlx5/6 ultraconserved enhancer (UCE) lncRNA, simultaneously localizes to activated (Umad1, 1.6 Mb distant) and repressed (Akr1b8, 27 Mb distant) chr6 target genes, precisely regulating UCE-gene distances and cohesin binding in mouse embryonic forebrain GABAergic interneurons (INs). Transgene expression of Evf2 activates Lsm8 (12 Mb distant) but fails to repress Akr1b8, supporting trans activation and long-range cis repression. Through both short-range (Dlx6 antisense) and long-range (Akr1b8) repression, the Evf2-5’UCE links homeodomain and mevalonate pathway-regulated enhancers to IN diversity. The Evf2-3′ end is required for long-range activation but dispensable for RNA cloud localization, functionally dividing the RNA into 3′-activator and 5’UCE repressor and targeting regions. Together, these results support that Evf2 selectively regulates UCE interactions with multi-megabase distant genes through complex effects on chromosome topology, linking lncRNA-dependent topological and transcriptional control with interneuron diversity and seizure susceptibility.

    The paper is paywalled.

    However, in brief:

    a) The actor is Evf2, a lncRNA transcribed from a non coding region between two homeobox TF genes, Dlx5 and Dlx6.

    b) The non coding region from which Evf2 is transcribed is an ultra-conserved enhancer (UCE). So, Evf2 is a UCE-lncRNA.

    c) This lncRNA is expressed at sites of sonic hedgehog-activated interneuron (IN) birth in the mouse embryonic forebrain, and has many other functions, acting in part together with other homeobox TFs.

    d) To make it simple, this lncRNA (Evf2) controls in trans the interactions between the original DNA sequence, the UCE, and many other important genes, in an unprecedented range of 31 Mb, and in very complex patterns, through complex effects on topology.

    e) The function of Evf2 is of paramount importance in the development of the mouse brain, as shown by multiple lines of evidence.

  165. 165
  166. 166
  167. 167

    Well, it appears my last post is stuck in the filter. Oops.

  168. 168
    gpuccio says:

    UB:

    Thank you so much for the very good work. It should have been me doing it! 🙂

  169. 169
    Eugene S says:

    Upright Biped,

    Thanks for the brilliant post on the history of the theoretical underpinnings of the biosemiotics argument for ID and for a list of OPs by GP. That is really great! In my blog, I have a special tag dedicated to GPuccio 😉 I have something to add to it.

    I have one comment that sort of coagulated only recently. A majority, if not all, of these people you mentioned who undoubtedly substantially contributed to an understanding of the semiotic core of life, were naturalists in the sense that they augmented the sign as an add-on to a description of living organisms as physical systems.

    We, supporters of ID, see the establishment of symbolic boundary conditions on the dynamics of matter in living systems, as a hallmark of conscious design.

    They remained naturalists. They believed that life could be modeled as a Turing machine, or equivalently, as a cellular automaton that can be described using something like:

    state(t+1) = F(state(t=0), state(t)).
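
    As a minimal sketch of what such an update rule looks like in practice, here is an elementary one-dimensional cellular automaton in Python. The rule number (110), the ring layout and the function names are illustrative assumptions, not anything taken from the authors discussed here. Note also that in this simple case the next state depends only on the current state; the initial state enters only as the starting point.

```python
def step(cells, rule=110):
    """Apply one synchronous update of an elementary cellular automaton.

    cells: list of 0/1 values arranged on a ring (the ends wrap around).
    rule:  Wolfram rule number (0-255); 110 is an arbitrary illustrative choice.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right  # value 0..7
        nxt.append((rule >> neighbourhood) & 1)  # look up that bit of the rule
    return nxt


def run(initial, steps, rule=110):
    """Return the list of states from t = 0 to t = steps."""
    history = [list(initial)]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    start = [0] * 15 + [1]  # a single live cell at t = 0
    for row in run(start, 8):
        print("".join("#" if c else "." for c in row))
```

    Everything the printed pattern will ever do is fixed by the eight bits of the rule and by the starting row, which is the sense in which such a model is fully determined by its update function and initial conditions.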

    I am not aware of anyone of them openly supporting ID in the strong form we are discussing here (maybe I am wrong). At best, they could probably subscribe to the weak ID positing that for life to appear it is enough to have serendipitous starting conditions.

    To my knowledge, they never supported ID. And some of them openly denounced ID. If I remember rightly from what I read, H.Pattee was one of them. As far as H.P is concerned, his latest addition to the list of his publications on academia.edu is an example. In the section where he gives examples of evolvability, he says:

    The most salient example is the evolution of symbolic languages at two very different scales: the genetic language to describe DNA sequences (and proteins) using nucleotides (and amino acids), and the symbolic language spoken by humans to express and communicate complex ideas. These languages are, to the best of our knowledge, the only languages that possess such great apparently open-ended descriptive power. But these languages must have evolved from much simpler, less open-ended languages. The evolution of such biosemiotic mechanisms must have played an essential role in enabling the open-endedness that followed.

    H. Pattee, H. Sayama: Evolved Open-Endedness, Not Open-Ended Evolution

    Emphasis mine. In these examples he is not convincing because it is his belief, not a clear objective demonstration.

    True, H.Pattee acknowledges elsewhere that for this to happen, life needs to start from an open-ended semiotic system. However, he never subscribed to an ID position (in what I read).

    David Abel emphasized it (personal communication). All these great minds were and are Darwinists in the wide sense as far as the origin of life is concerned.

    I totally agree with you regarding the implications of their reasoning in terms of ID but we need to bear this in mind.

    This is one of the reasons why I find what GP writes very important.

  170. 170
    Eugene S says:

    UB

    Just in addition to my latest post. Thankfully, these people’s contribution to science (philosophy aside) is objective 😉 And we can assess the power of their modeling by tangible advances of technology that followed.

  171. 171

    Evgeny,

    Thank you very much for your 169 and 170. It will be this evening (US) before I can respond.

  172. 172
    gpuccio says:

    EugeneS:

    Very good thoughts at #169.

    Yes, those people were not supporters of ID. But we must consider that the ID paradigm, even if always present in some form, has only recently gained more relevance and strength. 10, 20 or 30 years ago the power of materialistic naturalism in science was practically absolute. Now it is still the dominant religion, but there is some valid opposition, luckily.

    Regarding the passage from Pattee that you quote, it is a good example of how good ideas can be used badly.

    First there is the basic admission that only biological objects and humans implement complex symbolic languages. Which in itself should mean a lot. Then there is the false reasoning that they “must have evolved”.

    But again, there is no mention of the fact that human language, even if it evolved, is the result of consciousness, of its intuitions, of its ability to understand and to generate symbols and connections.

    While the “evolution” of biological objects, and therefore of the complex codes implemented in them, is supposed to be consciousness independent, mindless, purposeless, devoid of any understanding.

    There is a big, enormous difference, and yet the two examples (the only two examples in the known universe) are happily put together as though they were simply two different, and self-supporting, examples of the same process.

    This is what dogma and prejudice can do, even to the best minds.

    By the way, a tag?!!

    I am really honored! 🙂

  173. 173
    Eugene S says:

    GP

    “Which in itself should mean a lot.”

    I can relate to what you write in comment 172.

    “But again, there is no mention of the fact that human language, even if it evolved, is the result of consciousness, of its intuitions, of its ability to understand and to generate symbols and connections.”

    That is probably the greatest flaw in the whole edifice of contemporary scientific thought. In fairness, biosystems themselves are decision making systems (even if unconscious). But attributing all this starting complexity of life to non-telic (‘mindless’) factors is what escapes me. The complexity explosion(s) they keep talking about are inexplicable without conscious design.

    As you rightly pointed out in the discussion under one of your OPs, the seemingly simple rules of cellular automata already implicitly encode the resultant complex behaviour. Nothing comes out of nothing in reality, putting the illusory world of an evolutionist aside.

    “I am really honored!”

    Yes, you are now famous in the Russian blogosphere. Trouble is, I am not that famous myself 😉 Anyhow, whoever reads my posts can read yours!

  174. 174
    Eugene S says:

    Regarding the glaring gap between the castles of smoke and reality… Hawking was quoted as saying that all that is necessary to explain the world is gravity. And then, when asked where gravity came from, he seriously answered: “From the M-theory!”

  175. 175
    OLV says:

    Eugene S:

    Would you mind sharing a link to your blog here?

    Thanks.

  176. 176
    PaoloV says:

    Eugene S,

    Oxford University professor John Lennox has said that nonsense remains so regardless of who says it.

  177. 177
    gpuccio says:

    To all:

    One of the big unsolved questions in this huge issue of transcription regulation is: what controls cell differentiation?

    I think the only answer still is: we really don’t know.

    However, a lot of new precious information is available.

    While this issue probably deserves some future detailed discussion, maybe a new OP, I would like to gather here some interesting data that certainly open big possibilities.

    It is well known that a few special TFs can start important changes in the differentiation state of a cell. The best demonstration of that is the possibility of inducing a stem cell state from differentiated somatic cells, just by adding a few TFs, or similar molecules.

    The classic work by Yamanaka et al., in 2006, showed that mouse fibroblasts could be transformed into induced Pluripotent Stem Cells by adding 4 TFs, which started a process of dedifferentiation lasting a few weeks; the same was shown for human fibroblasts in 2007. The process has been improved in terms of efficiency, and replicated with different combinations of TFs, miRNAs and other factors.

    So, we can understand from this that dedifferentiation (and therefore, probably, differentiation) is a very complex process, but that it can be started by some relatively simple initial “switch”, generated by a few important molecules, in particular TFs, that in that sense act as “master regulators”.

    But how do these “top acting” TFs implement their function?

    This recent paper gives interesting details about that:

    GRHL2-Dependent Enhancer Switching Maintains a Pluripotent Stem Cell Transcriptional Subnetwork after Exit from Naive Pluripotency.

    https://www.cell.com/cell-stem-cell/fulltext/S1934-5909(18)30287-X?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS193459091830287X%3Fshowall%3Dtrue

    Highlights:

    GRHL2 binds and activates enhancers during the transition from ESCs to EpiLCs
    GRHL2 maintains rather than activates target gene expression in EpiLCs
    GRHL2 target genes are regulated by distinct ESC-specific enhancers in ESCs
    GRHL2 loss results in an epithelial to mesenchymal-like transition in EpiLCs

    Summary:

    The enhancer landscape of pluripotent stem cells undergoes extensive reorganization during early mammalian development. The functions and mechanisms behind such reorganization, however, are unclear. Here, we show that the transcription factor GRHL2 is necessary and sufficient to activate an epithelial subset of enhancers as naive embryonic stem cells (ESCs) transition into formative epiblast-like cells (EpiLCs). Surprisingly, many GRHL2 target genes do not change in expression during the ESC-EpiLC transition. Instead, enhancers regulating these genes in ESCs diminish in activity in EpiLCs while GRHL2-dependent alternative enhancers become activated to maintain transcription. GRHL2 therefore assumes control over a subset of the naive network via enhancer switching to maintain expression of epithelial genes upon exit from naive pluripotency. These data evoke a model where the naive pluripotency network becomes partitioned into smaller, independent networks regulated by EpiLC-specific transcription factors, thereby priming cells for lineage specification.

    The paper is paywalled, but you can look at a clear graphical summary at the link I gave for the abstract.

    To make it simple:

    a) They have studied, in mouse cells in vitro, the transition from a more embryonic state (ESCs in the graphical abstract) to a slightly more differentiated state, corresponding to the epiblast (EpiLCs in the graphical abstract), before the differentiation of the three primary germ layers.

    b) The interesting thing is that these two states, both of them very embryonic, have huge epigenetic differences: studying just one activation signal, cohesin binding, they found that more than 5000 genes were specifically active in each kind of cell, but only 2205 were common to the two states. That’s a big difference, for two cell types that are apparently very similar.

    c) A further analysis focused on GRHL2, a TF that seems to have an important role in this transition, for a specific subset of genes.

    d) The very interesting thing is that GRHL2 seems to be necessary and sufficient to induce a specific epigenetic transition for a very specific subset of genes.

    e) The effect of GRHL2 is to activate a whole set of enhancers that are inactive in ESCs and become active in EpiLCs.

    f) But the really surprising fact is that the genes regulated by this new set of enhancers were already expressed, at a similar level, in ESCs. IOWs, the activation of a completely new set of enhancers for those genes does not increase transcription of those genes in EpiLCs.

    g) The reason for that is that those genes were already transcribed in ESCs, but their transcription was regulated by a different set of enhancers.

    h) So, the amazing conclusion is that GRHL2 contributes to the transition from ESCs to EpiLCs, both of them very high level stem cells, by changing the set of enhancers that regulated the transcription of a specific subset of genes, without changing the level of transcription of those genes.

    i) The authors believe, and in part demonstrate, that the meaning of such a change in the set of active enhancers for the same subset of genes is to prepare the new cell (EpiLC) for further differentiation, that will take place only in the following phase, the differentiation of the three primary germ layers, specifically towards epithelial differentiation.
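
    The logic of points e) to h) can be sketched as a toy model in Python. This is only an illustration under strong assumptions: the enhancer names are hypothetical, enhancer activity is reduced to on/off, and a gene is taken to be transcribed whenever at least one of its enhancers is active. None of this is the authors' own model.

```python
# Toy model: a gene is transcribed if at least one of its enhancers is active.
# Enhancer names and the on/off rules below are illustrative assumptions.

def active_enhancers(cell_state, grhl2_present):
    """Return the set of active enhancers for a hypothetical GRHL2 target gene."""
    active = set()
    if cell_state == "ESC":
        active.add("ESC_specific_enhancer")      # used in naive ESCs
    if cell_state == "EpiLC" and grhl2_present:
        active.add("GRHL2_dependent_enhancer")   # takes over in EpiLCs
    return active


def is_transcribed(cell_state, grhl2_present):
    """The gene is on whenever any enhancer is active."""
    return bool(active_enhancers(cell_state, grhl2_present))


if __name__ == "__main__":
    for state, grhl2 in [("ESC", False), ("EpiLC", True), ("EpiLC", False)]:
        print(state, "GRHL2" if grhl2 else "no GRHL2",
              sorted(active_enhancers(state, grhl2)),
              "transcribed" if is_transcribed(state, grhl2) else "silent")
```

    In this sketch the gene is transcribed in both ESCs and EpiLCs, but through different enhancers, and it falls silent in EpiLCs only when GRHL2 is missing. The output (transcription) is the same in the two states while the regulatory configuration behind it is not.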

    That’s really hot stuff. This really shows that big epigenetic rearrangements precede visible differentiation and visible changes in transcription, and that specific discrete states with complex epigenetic rearrangements precede explicit transcription and differentiation states.

    This has the clear flavour of intelligent programming, of a process implemented in definite steps, each of them extremely purposeful and oriented to the final result.

  178. 178
    jawa says:

    Eugene S,

    I like your commentaries.

    Thanks.

  179. 179
    jawa says:

    gpuccio,
    Your post #177 is a real winner. Thanks.

  180. 180
    PeterA says:

    gpuccio’s post #177 is a real 3-pointer (basketball).

  181. 181
    OLV says:

    Ok, sticking to Peter’s basketball terminology, gpuccio’s post 177 is just another slam dunk among the many gpuccio has already done here.
    gpuccio’s team (DATCG, UB, Eugene S) is unbeatable.
    All their stubborn opponents can do safely is running for the hills.

  182. 182
    jawa says:

    OLV,

    I like the entire post 177, but especially where it hints at a future OP on the subject. Yeah!

  183. 183
    PeterA says:

    OLV and jawa,

    I agree with all you said, except putting pressure on gpuccio.
    We know he’ll write those excellent OPs when he can.
    We must respect that.

  184. 184
    OLV says:

    Peter,
    Don’t be so sensitive.
    Nobody is putting pressure on gpuccio to write the OP he has hinted at.
    jawa is simply highlighting gpuccio’s own hinting at it.
    Such a possibility is exciting indeed.

  185. 185
    jawa says:

    Peter,
    I agree with OLV.
    An OP by gpuccio on such a fascinating biology topic could be a paradisiacal meal for thoughts.

  186. 186
    jawa says:

    This paper says that “transcriptional regulation” is well studied?

    But according to gpuccio it hasn’t answered some important questions yet.

    Did they mean extensively studied?

    “Gene expression is determined through a combination of transcriptional and post-transcriptional regulation. While transcriptional regulation is well studied, less is known about how post-transcriptional events contribute to overall mRNA levels.”

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6086665/

  187. 187
    OLV says:

    jawa,
    Perhaps that’s what they meant.

  188. 188
    gpuccio says:

    jawa:

    What they probably mean is that we know a few things about transcriptional regulation, and much less about post-transcriptional regulation. Which is probably true.

    Both issues, however, are still black holes where we need to find a true direction.

    I would add translational regulation, and post translational regulation. And a few other things (some of them possibly not even imagined!). 🙂

  189. 189
    gpuccio says:

    OLV:

    It’s our team, and you are all part of it. 🙂

  190. 190
    gpuccio says:

    To all:

    OK, I agree: #177 is definitely interesting.

    But credit where credit is due: the paper is really very good! 🙂

  191. 191
    jawa says:

    gpuccio:
    Now I understand. Thanks.

  192. 192

    Hello ES,

    I have a special tag dedicated to GPuccio

    Who doesn’t?! 🙂

    A majority, if not all, of these people you mentioned who undoubtedly substantially contributed to an understanding of the semiotic core of life, were naturalists

    Yes, they were. I agree with GP’s sentiments in #172; he hit all the salient points and I don’t have much to add. The fact that the gene system requires a language structure is one of several critical observations.

    As for the question about naturalism, I would only add that what I’ve learned from these naturalists has nothing to do with their naturalism. First and foremost, I’ve learned important things about the gene system, which comes from good science and descriptions, not personal metaphysics. Their naturalism was irrelevant to the production of good descriptions, just as it should be, and I thank all of them.

  193. 193
    Eugene S says:

    UB

    “what I’ve learned from these naturalists has nothing to do with their naturalism”

    Exactly! What they produce is scientific models that are testable with an objective quality criterion named ‘practice’.

    Science is itself a product of design and stands on an assumption that, based on historical observations, an objective rational model can be constructed to predict future observations. And it really works, at least to a limit, and sometimes remarkably well. Why should science work at all? What is there in mathematics that enables it to describe the world so efficiently? Various prominent scientists did not feel ashamed to call this a miracle (Wigner, Feynman, Planck to name a few). The only rational answer to this can be that the world itself is a product of design.

  194. 194
    Eugene S says:

    OLV, jawa

    Thanks very much! I don’t think that giving links to Russian blogs will be of much interest for the English speaking audience. But anyway:

    mns2012.livejournal.com (personal blog)
    biosemiotics.livejournal.com (a series of notes and re-posts in support of the biosemiotic argument for ID)

  195. 195
    EugeneS says:

    Apologies for the broken links. One more time:

    one
    two

  196. 196
    OLV says:

    Eugene S:

    Thank you for the links!
    I know somebody who is fluent in Russian language and will like to read your blogs.

  197. 197
    gpuccio says:

    To all:

    The sequence suggested in Fig. 2 of the OP seems to be supported by recent papers, like the one quoted at comment #177.

    Here is another interesting one, which seems to show a similar connection between TFs, epigenetic marks and cell states.

    The Epigenetic Factor Landscape of Developing Neocortex Is Regulated by Transcription Factors Pax6 -> Tbr2 -> Tbr1

    https://www.frontiersin.org/articles/10.3389/fnins.2018.00571/full

    Epigenetic factors (EFs) regulate multiple aspects of cerebral cortex development, including proliferation, differentiation, laminar fate, and regional identity. The same neurodevelopmental processes are also regulated by transcription factors (TFs), notably the Pax6 → Tbr2 → Tbr1 cascade expressed sequentially in radial glial progenitors (RGPs), intermediate progenitors, and postmitotic projection neurons, respectively. Here, we studied the EF landscape and its regulation in embryonic mouse neocortex. Microarray and in situ hybridization assays revealed that many EF genes are expressed in specific cortical cell types, such as intermediate progenitors, or in rostrocaudal gradients. Furthermore, many EF genes are directly bound and transcriptionally regulated by Pax6, Tbr2, or Tbr1, as determined by chromatin immunoprecipitation-sequencing and gene expression analysis of TF mutant cortices. Our analysis demonstrated that Pax6, Tbr2, and Tbr1 form a direct feedforward genetic cascade, with direct feedback repression. Results also revealed that each TF regulates multiple EF genes that control DNA methylation, histone marks, chromatin remodeling, and non-coding RNA. For example, Tbr1 activates Rybp and Auts2 to promote the formation of non-canonical Polycomb repressive complex 1 (PRC1). Also, Pax6, Tbr2, and Tbr1 collectively drive massive changes in the subunit isoform composition of BAF chromatin remodeling complexes during differentiation: for example, a novel switch from Bcl7c (Baf40c) to Bcl7a (Baf40a), the latter directly activated by Tbr2. Of 11 subunits predominantly in neuronal BAF, 7 were transcriptionally activated by Pax6, Tbr2, or Tbr1. Using EFs, Pax6 → Tbr2 → Tbr1 effect persistent changes of gene expression in cell lineages, to propagate features such as regional and laminar identity from progenitors to neurons.

    We are here in the mouse embryonic neocortex, where 4 cell types can be defined, in order of differentiation:

    1) Radial glial progenitors (RGPs)

    2) Intermediate progenitors a (aIPs)

    3) Intermediate progenitors b (bIPs)

    4) Postmitotic projection neurons (PNs)

    Now, as can be seen in Fig. 1 of the paper, the three transitions that lead from the stem cell (RGP) to the differentiated neuron (PN) are controlled by the sequential expression (cascade) of three TFs, master regulators of the process:

    Pax6 -> Tbr2 -> Tbr1

    So again, we can see that the individual expression of one TF controls the transition from one state to another: master regulator TFs definitely act as powerful switches.

    But again, if you read the paper, you can see that each of the three sequential TFs acts in a very specific and complex way on the epigenetic landscape of the cell. To assess that, the authors considered a group of specific epigenetic factor genes (EFs), as described.

    Of more than 350 EF genes evaluated, 52 exhibited cell-type-specific expression: 14 in RGPs, 2 in aIPs, 6 in bIPs, 9 in aIPs and bIPs, 18 in general neurons or precursors, and 3 in PNs or precursors (Supplementary Table S2). In addition, 11 EF genes exhibited rostrocaudal gradients: 4 high rostral, 7 high caudal (Supplementary Table S3). Furthermore, 36 EF genes were bound and regulated by Pax6, Tbr2, and/or Tbr1 (Supplementary Table S4). Of these, 9 were regulated by two TFs independently, but always in the same direction; and 2 EF genes were regulated only synergistically by Tbr2 and Tbr1. The effects of TFs on target gene expression were mixed: Pax6 activated 5 EF genes, and repressed 5; Tbr2 activated 8, and repressed 10; Tbr1 activated 13, and repressed 2; Tbr1 and Tbr2 (Tbr1/2) coordinately activated 2 EF genes. In sum, 73 EF genes showed cell-type or regional specificity, or were directly regulated by at least one of the TFs (Pax6, Tbr2, and Tbr1).
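
    The counts in that passage can be checked against each other with a few lines of Python. This is just a consistency check on the quoted numbers; the variable names are mine, and the step that subtracts the 9 doubly regulated genes assumes that each of them is counted exactly twice in the per-TF figures.

```python
# Counts as quoted from the paper in the passage above.
cell_type_specific = {"RGPs": 14, "aIPs": 2, "bIPs": 6, "aIPs and bIPs": 9,
                      "neurons or precursors": 18, "PNs or precursors": 3}
gradients = {"high rostral": 4, "high caudal": 7}
# (activated, repressed) EF genes per transcription factor acting on its own
single_tf_effects = {"Pax6": (5, 5), "Tbr2": (8, 10), "Tbr1": (13, 2)}
synergistic_tbr1_tbr2 = 2      # regulated only by Tbr1 and Tbr2 together
regulated_by_two_tfs = 9       # genes counted under two different TFs

total_cell_type = sum(cell_type_specific.values())
total_gradient = sum(gradients.values())
tf_gene_pairs = sum(a + r for a, r in single_tf_effects.values())
# Assumption: each of the 9 doubly regulated genes appears exactly twice above.
unique_regulated = tf_gene_pairs - regulated_by_two_tfs + synergistic_tbr1_tbr2
# Surplus category memberships implied by the stated overall total of 73 genes
surplus = total_cell_type + total_gradient + unique_regulated - 73

print(total_cell_type, total_gradient, unique_regulated, surplus)
```

    The three subtotals come out as 52, 11 and 36, matching the figures in the quote. Since they add up to 99 while the stated total is 73, a fair number of EF genes must fall into more than one category (for example, both cell-type-specific and TF-regulated).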

    The three TFs acted on those EFs in many different ways:

    a) By acting on N-methyltransferases, and therefore on DNA methylation and demethylation:

    “Pax6, Tbr2, and Tbr1 regulate this system by repressing and activating key genes, including repression of the caudal marker (Gadd45g) by Pax6 and Tbr2 (Figure 2F). Thus, DNA methylation and demethylation may regulate not only neuron differentiation (Sharma et al., 2016) and astrogenesis (Fan et al., 2005), but also cortical regionalization under the control of Pax6 and Tbr2.”

    b) By acting on histone marks:

    – Acetylation and deacetylation:

    “The present analysis identified several HATs and HDACs with cell-type-specific expression, and extensive regulation by Pax6, Tbr2, and Tbr1 (Figure 3). ”

    – Methylation and demethylation, through:

    — Trithorax/COMPASS Activating Complexes:

    “These results indicate that deposition and removal of TrxG marks are actively regulated by Tbr2 and Tbr1 during neuronal differentiation (Figure 4F)”

    — Polycomb Repressive Complex 1:

    “These data suggest that canonical PRC1 complexes are present in all types of cortical cells (although most abundant in progenitors), and are minimally regulated by Pax6 → Tbr2 → Tbr1. In contrast, non-canonical PRC1 complexes exhibit differentiation-related changes, such as upregulation of Rybp in IPs and new PNs. Notably, Tbr1 directly activated two non-canonical PRC1 subunits (Rybp, Auts2) implicated in brain development”

    c) By acting on ATP-Dependent Chromatin Remodeling Complexes, especially BAF Chromatin Remodeling Complexes.

    d) By acting on Non-coding RNA-Mediated Epigenetic Regulation:

    “Together, these findings indicate that several lncRNAs are specifically expressed at high levels in IPs and new PNs, and that several miR genes are expressed with cellular or regional specificity. The gradient of Mir99ahg, and its possible targeting Fgfr3, suggest a new role for miR in cortical patterning. Finally, their direct regulation by Tbr2 and Tbr1 suggests that lncRNA and miR genes have significant functions in cortical development (Figure 12G).”

    Table 1 sums up many of these EFs, and how they are regulated by the three TFs.

    Another important point is that the TF cascade controls not only the differentiation of individual cells, but also their localization in the neocortex, their “regional identity”, which is of course as important to the generation of the final structure and function as the differentiation of individual cells. The paper gives interesting hints on that process too.

    So, there is this central cascade of three master proteins, and in many ways it controls the intermediate states of differentiation. But the effects of each of these master regulators are extremely complex, and work mainly through specific and delicate regulations of epigenetic factors. A complex network of feedforward and feedback regulation guarantees that the process proceeds through the various ordered steps.

    Again, the transcriptome/proteome (for example, the expression of each of the three TFs in cascade) modifies the chromatin configuration (the epigenetic landscape), determining some specific state. And the complex feedback of the epigenetic landscape to the transcriptome/proteome makes the process constantly dynamic, and determines the transition to a new state.

    The paper’s conclusions:

    Coordinate Regulation of Cortical Development by TFs and EFs
    The present study demonstrates that many types of EFs are direct targets of gene activation or repression by Pax6, Tbr2, or Tbr1 (Table 1). In many examples, the regulation of EFs by TFs was robust and affected multiple elements in an epigenetic system or signaling pathway. For example, Pax6, Tbr2, and Tbr1 activated multiple BAF subunit genes, to effect subunit switching and neuronal differentiation (Figure 10). In another example, Tbr1 activated non-canonical PRC1 subunits (Rybp, Auts2) in PNs (Figure 6). Also, many HATs and HDACs were regulated by this TF cascade (Figure 3). Overall, our results indicate that Pax6, Tbr2, and Tbr1 utilize EFs to modulate neurodevelopmental processes such as IP genesis, laminar fate acquisition, and regional identity (Figure 13). The Pax6 → Tbr2 → Tbr1 cascade itself emerges as a complex network with feedforward and feedback regulation (Figure 1B).

    Epigenetic mechanisms appear well-suited to regulation of regional and laminar identity, persistent phenotypes that are initially determined in progenitor cells, then propagated into IPs and finally, new PNs. For example, the cortical “protomap” is initially specified in RGPs, then propagated into IPs and PNs, where regional identity continues to be refined (Bedogni et al., 2010a; Elsen et al., 2013; Alfano et al., 2014).

    Besides EFs, other target genes regulated by Pax6, Tbr2, and Tbr1 can be identified using the same approach, and are currently under analysis. Through these studies, it will be possible to comprehensively profile gene expression by RGPs, IPs, and PNs; and to better understand how Pax6, Tbr2, and Tbr1 control the genesis of cortical PNs.

  198. 198
    OLV says:

    Where is George Castillo now?
    🙂

  199. 199
    OLV says:

    gpuccio:

    When they say:

    “Coordinate Regulation of Cortical Development by TFs and EFs”

    Does it mean that the coordinate regulation is done by TFs and EFs or by something else using TFs and EFs as important tools?

    For example, what determines how many TFs and EFs there are, and when and where they should be there?

    Or are they always available anyway?

    Thanks.

  200. 200
    gpuccio says:

    OLV:

    You ask difficult questions! 🙂

    I think that the concept of coordinate regulation means that, at all times, the different levels of regulation act one on another, and the state transition is the global result of all those levels and of all those interactions.

    However, the last papers I quoted seem to show that some specific TFs, those that act as master regulators of some vast differentiation scenario, are important general switches that can activate or deactivate whole general procedures.

    In a sense, they are a regulation backbone that can define the complex procedure that will be active in that cell at some time.

    So, the development process could be in a way modular: a higher-level thread would define the ordered expression of the master regulators, guided by specific information (more on that later). Then, at a lower (and more complex) level each specific “master regulator” scenario defines in detail what specific procedure will be implemented.

    How? We don’t really know, but as you have seen we know certainly more than, say, a few years ago.

    The most reasonable idea is that, while many of the details for each differentiation procedure are written at all regulation levels, certainly a very important role is reserved to the following interaction:

    TFs + enhancers

    We have seen at #177 that master regulators can act by changing the enhancer landscape. Indeed, those master regulators could be special TFs that can access specific chromatin sites even when they are not accessible, so that they can make them accessible to other TFs.

    Now, let’s reason a moment.

    If we really have about 1 million enhancers in the human genome (as many believe), even assuming that we want to choose a combination of 50 specific enhancers to define one high level differentiation landscape (which is a rather conservative hypothesis), the possible gross combinations are:

    3.283924e+235

    which is quite a number.

    It is quite reasonable that only very few of them make sense for an ordered cell development of some kind.

    So, again, we are in front of a huge problem of complex functional information, just to select a functional combination of elements from the search space of enhancers.

    Of course, the complexity increases exponentially if the number of requested enhancers is bigger: for 100 enhancers, the combinations are:

    1.066219E+442
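
    For anyone who wants to check the two figures above, here is a minimal sketch of the computation. It assumes that the “gross combinations” are simple unordered choices of k enhancers out of 1 million, i.e. the binomial coefficient C(1,000,000, k); the function name and the formatting choice are mine, just for illustration.

```python
import math

def combinations_sci(n, k, digits=6):
    """Return C(n, k) as a string in scientific notation.

    The mantissa is truncated (not rounded) to `digits` decimals,
    because the exact integer is far too large for a float when k = 100.
    """
    s = str(math.comb(n, k))            # exact integer, as decimal digits
    exponent = len(s) - 1               # power of ten of the leading digit
    mantissa = s[0] + "." + s[1:1 + digits]
    return f"{mantissa}e+{exponent}"

N_ENHANCERS = 10**6  # assumed number of enhancers, as in the comment above

for k in (50, 100):
    print(k, combinations_sci(N_ENHANCERS, k))
```

    The exponents should come out as e+235 for 50 enhancers and e+442 for 100, in line with the numbers quoted above. Note that this counts unordered subsets only; it says nothing about how many of those subsets would be functional.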

    So the idea is, if you have 1 million different enhancers available, and 2000 TFs, and 60000 lncRNAs, and many other variables, you can really write a lot of specific procedures by intelligently manipulating them.

    Each enhancer can be, on average, 1000 bp long or more. There is a lot of sequence space in which to individualize them, so that they can be important information tools.

    Of course, a lot of questions remain unsolved. For example:

    a) What guides the correct cascade of master regulators?

    b) What makes the procedures robust? They are very complex, so they are certainly subject to many possible errors.

    c) Are there other levels of control and regulation that we don’t know of, at present?

    The third question is the easiest to answer, so I will answer it first: definitely yes. I am sure that there are many levels of regulation and control that, at present, we cannot even imagine.

    However, as science requires, we must reason with what we already know, otherwise we could become like our neo-darwinist friends! 🙂

    To try to answer, very partially, the first two questions, we must remember a very important thing, that I have tried to emphasize just at the beginning of this OP:

    The working information that is available in a cell at each specific time and in each specific state is always the sum total of all the information that is active at genetic and epigenetic level, IOWs the sum total of all the active information in any part of the cell at that moment.

    A cell never exists in a generic state, it never starts from scratch. The genome is never completely available, never completely blocked. The transcriptome is ever changing. The non coding RNAs landscape is ever changing. The proteome is ever changing.

    So, there is no such thing as “a cell”. There is always “a cell in one specific informational state at one specific time”.

    Life is a continuum of states, never an object.

    So, if we conventionally put a start at some place, we are just defining a conventional start in a continuum.

    For example, let’s say that we start from the zygote, as soon as it becomes one cell with its new genetic information.

    And, of course, with its specific epigenome that reads in a specific way that genetic information.

    So, the program that is active in the zygote is engineered so that that particular cell can proceed to its following states.

    Each program written in the dynamic cell is a specific selection of information that can guide that specific cell in that specific state to some new specific state. And so on.

    Of course, much of that must be written in the DNA sequence: protein genes and promoters and enhancers and non coding genes are all written there. But they can only work in the appropriate dynamic context, and nowhere else.

    The complexity of all that is overwhelming. Add to that the fact that many factors come from outside the cell, from the “environment”.

    But that environment is, of course, part of the program, part of the engineering.

    It includes signals from other cells, or even signals from environmental niches. But those signals are functional, not random. They have been engineered, too.

    The program, with its complexity, could never work if the signal from the environment were random. That’s why the embryo requires a very protected and controlled environment.

    Of course, random noise can always happen: it does happen, and often it can destroy or deform the program. As we well know.

    But, in general, the program works very well. Because the procedures are robust. And the general control is robust.

    There is a lot of very, very good engineering there. A lot of extremely good Intelligent Design.

  201. 201
    DATCG says:

    Great comments, questions guys, papers and collection of information. Having a hard time keeping up, but continuing to follow for now.

    Upright @155, thanks for consolidated Gpuccio Post links 🙂

    GP @All – Duuuuuude 😉 EFs and TFs be BFFs – for Life! Haha 😉

  202. 202
    DATCG says:

    OLV @199,

    great questions 🙂 Who regulates the regulators?

    I see GPuccio has it covered at #200

  203. 203
    DATCG says:

    Gpuccio @200,

    What of an initial Zero State Zygote?

    Might we call it an Initialized State of Cell Being? Prior to launch so-to-speak of its free interaction with surrounding cells and environments?

    A Prescribed or Pre-loaded state? Influenced/Updated by ancestry and epigenetic factors, environment, etc., up the line prior to cell creation.

    So that each part of this cellular puzzle is prescribed to work together and vary according to environmental thresholds? Heat, cold, food limitations, rain, sun, disease, etc., etc.

    Essentially there are some well known thresholds to life. And none of them seem to be compatible with blind mutations, that may mutate any number of these thousands of epigenetic factors, TFs, PTMs, RNAs, etc., etc.

    And…

    But that environment is, of course, part of the program, part of the engineering.

    It includes signals from other cells, or even signals from environmental niches. But those signals are functional, not random. They have been engineered, too.

    But Gpuccio, playing Darwin’s advocate I thought all of this was a result of blind, unguided “processing” coming together while in the safe confines say, of duplicate genes?

    Where novel functions are built… to “coordinate” with other novel functions built by blind, unguided “processing” units like EFs and TFs 😉

    “Processing” – Can blind random events creating mutations blindly, be correctly termed a “process?”

    I guess, but isn’t that a bit of a stretch?

    A truly random, unguided process produces functional units that can interact with other functional units de novo?

    Thus when EvoLabs, set up by Robert Marks and Dembski, was created, they ran their experiments and found the concept of Conservation of Information was a key component of these so-called evolution models? The evolutionists were cheating, sneaking information into the models to watch for and save.

    Thus the Evolution programs were building in key recognition patterns and/or plateaus reached by the program of functional units to “evolve” from each point to eventually count as “blind evolution” when in fact they were sneaking in Intelligent selection.

    Showing that Guided, intelligent evolution is the only way they could “recreate” “blind” evolution.

    Kinda makes you laugh at the circle of logic by blind evolutionists.

    An exercise in futility. Nothing was evolving through a blind process, it was compared to preconceived functional points, saved and repeated.

    Bunch of money spent to essentially do what Dawkins did at a higher cost.

  204. 204
    DATCG says:

    Oh my! So, I forgot to include the link to EvoLabs, Robert Marks and crew. I do not think Dembski’s been involved recently.

    But I went to get the link and look what I found at top of their page!

    http://www.evoinfo.org/index/

    A paper on unbounded evolution. Ha!


    “OBSERVATION OF UNBOUNDED NOVELTY IN EVOLUTIONARY ALGORITHMS IS UNKNOWABLE!”

    http://robertmarks.org/REPRINT.....ovelty.pdf

    Really love the work that Marks, Dembski, Ewert(Awesome work at Discovery now), and others have done.

    They’re shining a good amount of light on many Darwinist assumptions that fail when closely examined.

  205. 205
    DATCG says:

    Oh fun! 🙂

    Read a bit on the PreInitiation Paper Gpuccio, fascinating!

    I always enjoy reading a paper where scientists use new words to convey what is happening in a programming environment of assembled protein complexes.

    “Preinitiation” is not listed in
    https://www.merriam-webster.com/dictionary/preinitiation

    Essentially a programmed Complex(“PreInitiation” stage) is assembled and stationed for engagement awaiting all systems Go for Transcription service.

    It’s a normalization setup step prior to Transcript Initialization phase.

    It’s a Pre-processor step. It locates, sets up(denatures) and preps the site or pre-processes – not “preinitiates” which is not a word.

    Darwinists make this too hard.

    From a Design or programming perspective, it’s normal pre-processing techniques that allow flexibility in coding for multiple transcriptions and purposes. I suspect it’s modular for efficiency and different processing requirements.

    I expect there are several versions of these Pre-processor Units for Transcription which should be named appropriately.

    I may be wrong. I’ve not taken time to look at any other papers or information on the “PreInitiation Complex” pre-processor.

    But that’s essentially what it is and we should have tons of pre-processors, just like we have post-processors(Ubiquitin Post Translation for example).

    Pre and Post processing are uniquely Design concepts, not blind, certainly not random and must be coordinated with the application – Transcription for example.

    Darwinists are in real trouble, they’re just too blind to see it.

  206. 206
    gpuccio says:

    DATCG at #205:

    “From a Design or programming perspective, it’s normal pre-processing techniques that allow flexibility in coding for multiple transcriptions and purposes. I suspect it’s modular for efficiency and different processing requirements.”

    Well said. It certainly is.

    The preinitiation complex at the promoter is a strange thing. In a sense, it is the “universal” part of the transcription process: it is more or less the same for all transcriptions. And yet, it is extremely complex and flexible.

    The complex that is created at the enhancer is certainly the part that confers most of the specificity for each different transcription process, modulating its rate, speed and so on. Enhancers and specific TFs, with everything that helps them, are certainly a forest of engineered specificity.

    And yet, the preinitiation complex at the promoter is the tool that receives and interprets that tidal wave of specificity and meaning.

    And the mediator complex seems to be the main interface between the two poles.

    Fascinating, indeed! 🙂

  207. 207
    DATCG says:

    Gpuccio, you’re having too much fun 🙂

    It is fascinating, incredible design fo sure 😉

    So, quickly, this paragraph from the PIP paper you posted above in the OP…

    “Recent landmark studies on human and yeast PIC formation provided more differentiated views of the first steps in the transcription initiation process, corroborating the concept of stepwise assembly while also hinting at significant differences that may be present between the species [18], [19] (reviewed in Ref. [20]).”

    I suspect the sub-units or sub modules(ie. subroutines) will vary across different species in the PIC-preprocessor? I’ve not had time to read the entire paper or review any others.

    Will try to check in later tonight or tomorrow. Great stuff again Gpuccio, thanks for all the work you do on these topics!

  208. 208
    DATCG says:

    One last comment…

    The PIP is “universal” but maybe a rough analogy is like a universal joint for cars and trucks. It comes in different flavors based upon different models(or in case of Humans and Yeast – species).

  209. 209
    OLV says:

    gpuccio:

    What you wrote in 200 is the material for a presentation at a conference. Very interesting.
    Thank you so much.

  210. 210
    gpuccio says:

    To all:

    Of course, a big question remains: how, in the course of evolution, are specific differentiation procedures acquired by organisms and genomes?

    One thing is certain: they are acquired by design, certainly not by the imaginary RV + NS mechanism postulated by neo-darwinists.

    And yet, the interesting question remains: how is design implemented? How are, for example, the complex and specific networks of enhancers that define different tissues and organisms written in the genomic sequence?

    Those who are familiar with what I have been writing here already know that I am strongly convinced that a special role in writing the procedures can be assigned to guided transposon activity. IOWs, transposons as design tools.

    So, I am happy to mention here a recent paper (June 2018) that is rather appropriate to support my old view:

    Transposable elements generate regulatory novelty in a tissue-specific fashion

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6006921/

    Abstract:

    Background:

    Transposable elements (TE) are an important source of evolutionary novelty in gene regulation. However, the mechanisms by which TEs contribute to gene expression are largely uncharacterized.

    Results:

    Here, we leverage Roadmap and GTEx data to investigate the association of TEs with active and repressed chromatin in 24 tissues. We find 112 human TE families enriched in active regions of the genome across tissues. Short Interspersed Nuclear Elements (SINEs) and DNA transposons are the most frequently enriched classes, while Long Terminal Repeat Retrotransposons (LTRs) are often enriched in a tissue-specific manner. We report across-tissue variability in TE enrichment in active regions. Genes with consistent expression across tissues are less likely to be associated with TE insertions. TE presence in repressed regions similarly follows tissue-specific patterns. Moreover, different TE classes correlate with different repressive marks: LTRs and Long Interspersed Nuclear Elements (LINEs) are overrepresented in regions marked by H3K9me3, while the other TEs are more likely to overlap regions with H3K27me3. Young TEs are typically enriched in repressed regions and depleted in active regions. We detect multiple instances of TEs that are enriched in tissue-specific active regulatory regions. Such TEs contain binding sites for transcription factors that are master regulators for the given tissue. These TEs are enriched in intronic enhancers, and their tissue-specific enrichment correlates with tissue-specific variations in the expression of the nearest genes.

    Conclusions:

    We provide an integrated overview of the contribution of TEs to human gene regulation. Expanding previous analyses, we demonstrate that TEs can potentially contribute to the turnover of regulatory sequences in a tissue-specific fashion.

    The paper is open access, so I invite anyone who is interested in this aspect to read it.

  211. 211
    jawa says:

    gpuccio,
    I like to see how you pull all these recent papers out of nowhere (like a magician) to support your ID concepts.
    The comment you just wrote in 210 makes me wonder in awe.
    Thanks.

  212. 212
    PeterA says:

    jawa,
    Yes, I agree with you on that. I enjoy seeing all these recent papers that gpuccio pulls out of the magician’s hat so easily and cites here for our delight.
    It’s interesting that leading-edge science research is providing all that material suitable for ID to be reaffirmed. It almost looks as if it were done on purpose.

  213. 213
    PeterA says:

    BTW, has somebody seen George Castillo around lately?

  214. 214
    DATCG says:

    Gpuccio @210,

    Nice lead-in paragraph of “Results.” Specifically my interest piqued with assignment of “intronic” regions.

    “We detect multiple instances of TEs that are enriched in tissue-specific active regulatory regions. Such TEs contain binding sites for transcription factors that are master regulators for the given tissue. These TEs are enriched in intronic enhancers, and their tissue-specific enrichment correlates with tissue-specific variations in the expression of the nearest genes.”

    These Introns and Intronic regions are turning out to be important components and not throw-away Junk as Darwinists once proposed.

    I’m also curious what your thoughts are on the “tissue-specific” connection of TEs/TFs: is this how you see Macro changes possibly being added by Design?

  215. 215
    DATCG says:

    Gpuccio,

    FYI, browsing one of your posts linked by Upright Biped, I found some broken links under the Graphs.

    https://uncommondescent.com/intelligent-design/the-amazing-level-of-engineering-in-the-transition-to-the-vertebrate-proteome-a-global-analysis/

  216. 216
    gpuccio says:

    DATCG:

    What links, exactly? They seem to be working to me…

  217. 217
    gpuccio says:

    DATCG at #214:

    Yes, the enrichment at introns seems especially interesting.

  218. 218
    gpuccio says:

    PeterA:

    “It’s interesting that leading-edge science research is providing all that material suitable for ID to be reaffirmed.”

    Yes. I do believe that ID will be vindicated, first of all, by mainstream research. IOWs, by facts.

  219. 219
    OLV says:

    Off-topic, but it still seems to confirm what gpuccio, UB and DATCG have been saying all along lately:

    Building the right centriole for each cell type

    The centriole is a multifunctional structure that organizes centrosomes and cilia and is important for cell signaling, cell cycle progression, polarity, and motility.

    The centriole is an evolutionarily conserved structure built from highly conserved proteins and is present in all branches of the eukaryotic tree of life. However, centriole number, size, and organization varies among different organisms and even cell types within a single organism, reflecting its cell type–specialized functions.

    It is gradually becoming clear that different modes of centriole formation observed across different organisms and different cell types of the same organism are controlled variations of the same centriole assembly blueprint. However, to fully understand centriole functions and how centrioles contribute to human diseases, it will be critical to understand the nuances of the molecular pathways that operate in physiological cellular contexts. Until then, because of the diversity in their number, structure, and function, centrioles will rightfully remain a central enigma in cell biology.

  220. 220
    gpuccio says:

    To all:

    Enhancers are still very elusive. The key to their specificity and specificity modulation is still quite a mystery, but some information has been gathered.

    It seems certain that the modulation of short motifs, usually consisting of a few nucleotides, can greatly influence the specificity and functions of enhancers in their relationships with TFs, even with the same TF.

    So, small sequence modifications can give outstanding regulation results, in a continuum of possibilities.

    Here are two good papers about that issue:

    Dissection of thousands of cell type-specific enhancers identifies dinucleotide repeat motifs as general enhancer features

    https://genome.cshlp.org/content/24/7/1147.full

    Abstract:

    Gene expression is determined by genomic elements called enhancers, which contain short motifs bound by different transcription factors (TFs). However, how enhancer sequences and TF motifs relate to enhancer activity is unknown, and general sequence requirements for enhancers or comprehensive sets of important enhancer sequence elements have remained elusive. Here, we computationally dissect thousands of functional enhancer sequences from three different Drosophila cell lines. We find that the enhancers display distinct cis-regulatory sequence signatures, which are predictive of the enhancers’ cell type-specific or broad activities. These signatures contain transcription factor motifs and a novel class of enhancer sequence elements, dinucleotide repeat motifs (DRMs). DRMs are highly enriched in enhancers, particularly in enhancers that are broadly active across different cell types. We experimentally validate the importance of the identified TF motifs and DRMs for enhancer function and show that they can be sufficient to create an active enhancer de novo from a nonfunctional sequence. The function of DRMs as a novel class of general enhancer features that are also enriched in human regulatory regions might explain their implication in several diseases and provides important insights into gene regulation.

    And this more recent one:

    A massively parallel reporter assay reveals context-dependent activity of homeodomain binding sites in vivo.

    https://www.ncbi.nlm.nih.gov/pubmed/30158147

    Abstract:

    Cone-rod homeobox (CRX) is a paired-like homeodomain transcription factor (TF) and a master regulator of photoreceptor development in vertebrates. The in vitro DNA binding preferences of CRX have been described in detail, but the degree to which in vitro binding affinity is correlated with in vivo enhancer activity is not known. In addition, paired-class homeodomain TFs can bind DNA cooperatively as both homodimers and heterodimers at inverted TAAT half-sites separated by two or three nucleotides. This dimeric configuration is thought to mediate target specificity, but whether monomeric and dimeric sites encode distinct levels of activity is not known. Here, we used a massively parallel reporter assay to determine how local sequence context shapes the regulatory activity of CRX binding sites in mouse photoreceptors. We assayed inactivating mutations in >1,700 TF binding sites and found that dimeric CRX binding sites act as stronger enhancers than monomeric CRX binding sites. Furthermore, the activity of dimeric half-sites is cooperative, dependent on a strict three-base-pair spacing, and tuned by the identity of the spacer nucleotides. Saturating single-nucleotide mutagenesis of 195 CRX binding sites showed that, on average, changes in TF binding site affinity are correlated with changes in regulatory activity, but this relationship is obscured when considering mutations across multiple cis-regulatory elements (CREs). Taken together, these results demonstrate that the activity of CRX binding sites is highly dependent on sequence context, providing insight into photoreceptor gene regulation and illustrating functional principles of homeodomain binding sites that may be conserved in other cell types.

    Note how cooperative binding and a specific three-nucleotide spacing can greatly influence the way in which one specific TF, a master regulator, works, and thereby regulate, as a whole, the very delicate process of photoreceptor development.
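
    Just to make the idea of “inverted TAAT half-sites separated by two or three nucleotides” concrete, here is a small sketch that scans a DNA string for candidate dimeric sites. It assumes that an inverted half-site is simply the reverse complement of TAAT (ATTA) on the same strand, and it ignores everything the paper says about spacer identity and sequence context, so it is only a toy motif finder, not the authors’ method. The function name and the example sequences are hypothetical.

```python
import re

# Candidate dimeric site: TAAT, a spacer of 2 or 3 nucleotides, then ATTA
# (ATTA is the reverse complement of TAAT, i.e. the inverted half-site).
# The lookahead lets overlapping candidates be reported as well.
DIMERIC_SITE = re.compile(r"(?=(TAAT([ACGT]{2,3})ATTA))")

def find_dimeric_sites(seq):
    """Return (start, site, spacer_length) for each candidate dimeric site."""
    seq = seq.upper()
    return [
        (m.start(), m.group(1), len(m.group(2)))
        for m in DIMERIC_SITE.finditer(seq)
    ]

# Hypothetical sequences, for illustration only
print(find_dimeric_sites("ggTAATccgATTAaa"))   # one site with a 3-nt spacer
print(find_dimeric_sites("TAATcgATTA"))        # one site with a 2-nt spacer
print(find_dimeric_sites("TAATATTA"))          # no spacer, so no candidate
```

    According to the abstract, only the three-base-pair spacing is cooperative, and activity is further tuned by the spacer nucleotides; a realistic scan would have to score those features too.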

  221. 221
    OLV says:

    gpuccio,
    the second paper about the CRX seems to add intrigue to the plot. Definitely another area to dig in.
    Thanks.

  222. 222
    OLV says:

    This one seems related:
    CRX directs photoreceptor differentiation by accelerating chromatin remodeling at specific target sites

    CRX acts only at select, uniquely-coded binding sites to accelerate chromatin remodeling during photoreceptor differentiation.

    I bolded some text.

    I’m trying to find the mechanism to select the binding sites they act on.

  223. 223
    OLV says:

    Never mind. I had misunderstood it. Aren’t the binding sites determined by the coding?
    How is that code represented? Is that the domain structure of the protein? Isn’t this the same known biochemistry rule of chemical binding? Is that what they refer to as uniquely-coded? IOW, simple biochemistry?
    Please, can you explain this? Thanks.

  224. 224
    OLV says:

    Is this related?

    Gene regulation underlies environmental adaptation in house mice

    Changes in cis-regulatory regions are thought to play a major role in the genetic basis of adaptation.

    cis-regulatory elements as essential loci of environmental adaptation in natural populations.

  225. 225
    gpuccio says:

    OLV:

    I don’t think that these issues are well understood.

    However, as can be seen in the papers I quoted, one of the components that determine the binding of some TF to specific enhancer sites is certainly the presence in those enhancer sites of specific motifs that are recognized by the TF.

    These motifs are usually short, a few nucleotides.

    However, the relationship between motif and binding is complex, and many other factors are involved. The paper quoted at #220 shows how small variations in the motif can cause differences in the binding. But motifs alone cannot explain all the specificity of the binding.

    With master regulators, the general idea seems to be that they are capable of recognizing their specific enhancer sites even if they are not accessible at the time. So, the master regulator can access those sites and make them active.

    So, just as an imaginary example, let’s say that at some time, as a result of something that happened before, a master regulator TF, let’s call it A, becomes highly expressed in a cell which is still scarcely differentiated.

    So, A can find a number of specific enhancers, let’s imagine there are 300 of them in the genome, whatever their state at that moment (accessible or not).

    So, A accesses those 300 enhancers, and binds to them. Probably, with different affinity and specificity, according to small differences in the sequences (motifs or else) in each enhancer, and maybe other factors. As we have seen for CRX, maybe one molecule of TF binds some enhancers, while a homodimer or a heterodimer binds to others, with different specificity.

    IOWs, A creates a specific map of activated enhancers in the cell. Because it is a master regulator.

    That map is the foundation for what happens after that. The activated enhancers start their specific work. Each of them binds other TFs, or other factors, and binds to some specific promoter, creating specific loops and reorganizing chromatin structure.

    So, transcription changes: new mRNAs and new proteins are synthesized. New non coding RNAs too. The transcriptome/proteome changes, radically. The cell differentiates. Maybe it has, now, photoreceptors that did not exist before.

    Probably, many master regulators must act in sequence to give a final differentiated state. Each of them working with a lot of different subordinate TFs.

    OK, this is just a tentative scenario, but it seems to be consistent with what we know.
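
    The scenario above can be sketched as a toy model. This is only an illustration, under loudly stated assumptions: the 300 target enhancers come from the imaginary example, while the 700 non-target enhancers, the 20% initial accessibility, and the function names build_enhancers and express_master_regulator are all hypothetical choices made for the sketch, not data from any paper.

```python
import random


def build_enhancers(n_targets=300, n_other=700, p_accessible=0.2, seed=0):
    """Build a toy genome: a list of enhancers, some carrying A's motif.

    All counts and probabilities are hypothetical illustration values.
    """
    rng = random.Random(seed)
    enhancers = []
    for i in range(n_targets + n_other):
        enhancers.append({
            "id": i,
            "has_motif_for_A": i < n_targets,       # first n_targets are A's sites
            "accessible": rng.random() < p_accessible,
            "active": False,
        })
    return enhancers


def express_master_regulator(enhancers, pioneer=True):
    """Let TF 'A' bind its sites and return the ids of the enhancers it bound.

    A pioneer factor binds its motif whatever the chromatin state and opens
    the site; a non-pioneer factor binds only sites that are already open.
    """
    bound = []
    for e in enhancers:
        if not e["has_motif_for_A"]:
            continue
        if pioneer or e["accessible"]:
            e["accessible"] = True
            e["active"] = True
            bound.append(e["id"])
    return bound


pioneer_map = express_master_regulator(build_enhancers(), pioneer=True)
ordinary_map = express_master_regulator(build_enhancers(), pioneer=False)
print(len(pioneer_map), "sites bound by a pioneer TF")
print(len(ordinary_map), "sites bound by a non-pioneer TF")
```

    In this sketch the pioneer version of A always ends up with the full map of 300 activated enhancers, while the non-pioneer version only reaches the subset that happened to be open already. Everything downstream (subordinate TFs, loops, chromatin reorganization) is left out.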

  226. 226
    OLV says:

    gpuccio,
    the tentative scenario you described seems quite complex.

  227. 227
    PeterA says:

    OLV,

    what you wrote is an understatement.

  228. 228
    jawa says:

    OLV,

    “seems quite complex”?

    are you kidding?

    It’s definitely very complex (functionally).

    No doubt about it.

    BTW, gpuccio’s description of such a tentative scenario is excellent as far as I can see.

  229. 229
  230. 230
  231. 231
    DATCG says:

    Gpuccio @216,

    Apologies, not links, but Graphs. It was just below Figure 1, with the sentence referencing the following Figure 2 graph.

    “Figure 2 shows a plot of the density distribution of human-conserved functional information in the various groups of organisms.”

    It’s resolved now as I clicked on the broken graph and it was giving me a Certificate error. I hit continue and now it’s resolved. It might be my browser. Both graphs work now.

    This is one of the graphs that initially was broken…

    https://www.uncommondescent.com/wp-content/uploads/2017/03/FigA.jpg

    I failed to copy down the “certificate error” but it’s fine now after I hit continue on it.

  232. 232
    DATCG says:

    Gpuccio @225,

    These are very interesting thoughts you listed in your example…

    So, just as an imaginary example, let’s say that at some time, as a result of something that happened before, a master regulator TF, let’s call it A, becomes highly expressed in a cell which is still scarcely differentiated.

    So, A can find a number of specific enhancers, let’s imagine there are 300 of them in the genome, whatever their state at that moment (accessible or not).

    So, A accesses those 300 enhancers, and binds to them. Probably, with different affinity and specificity, according to small differences in the sequences (motifs or else) in each enhancer, and maybe other factors. As we have seen for CRX, maybe one molecule of TF binds some enhancers, while a homodimer or a heterodimer binds to others, with different specificity.

    IOWs, A creates a specific map of activated enhancers in the cell. Because it is a master regulator.


    That map is the foundation for what happens after that. The activated enhancers start their specific work. Each of them binds other TFs, or other factors, and binds to some specific promoter, creating specific loops and reorganizing chromatin structure.

    OK, that gives me a good picture of how you see these possible scenarios. But how do you think it’s guided? By surrounding environmental cues?

    So, transcription changes: new mRNAs and new proteins are synthesized. New non coding RNAs too. The transcriptome/proteome changes, radically. The cell differentiates. Maybe it has, now, photoreceptors that did not exist before.

    Awesome, but are the new photoreceptors a result of guided/directed conditions?

    Probably, many master regulators must act in sequence to give a final differentiated state. Each of them working with a lot of different subordinate TFs.

    That interaction of “many master regulators” acting in sequence means an ability to recognize the new photoreceptors as legitimate additions and not foreign to the cell? Or, maybe a better way to say it: the overall system must recognize them as “safe” while they are being built as a novelty, and not as an attack or a mutation to repair. Which leads me to ask: how does the system know it’s valid?

    I’m trying to understand how these radical changes are permitted to survive.

    “OK, this is just a tentative scenario, but it seems to be consistent with what we know.”

    That’s a … for me that is, it’s a large area to cover. Do you mind giving a bit more detail on what is “consistent with what we know?” Are you describing fossil records, or molecular evolution, or both?

    Thanks Gpuccio! As always, your posts and comments give me new information and ideas to ponder.

  233. 233
    OLV says:

    DATCG (232):

    I agree that gpuccio’s comment @ 225 is thought provoking.

    Glad you raised those questions. Thanks.

    This widens the territory to explore.

  234. 234
    OLV says:

    Peter and jawa,

    Yes, you’re right. I was too cautious in my statement. Thanks for the correction.

  235. 235
    PeterA says:

    DATCG,
    It seems like gpuccio’s OP and comments opened a can of worms.
    Too late now. 🙂

  236. 236
    jawa says:

    Where’s George Castillo when we need him?

    I’m sure he could provide a much simpler explanation than gpuccio’s “tentative scenario”

    🙂

  237. 237
    gpuccio says:

    To all:

    Guys, I suspected that my comments at #225 could evoke a few reactions! 🙂

    OK, I will try to answer your comments as well as I can. Of course, always consider that a lot of things are not yet understood. But we like to deal with these difficult questions, so let’s have fun! 🙂

  238. 238
    gpuccio says:

    To all:

    OLV at #226:

    the tentative scenario you described seems quite complex.

    PeterA at #227:

    what you wrote is an understatement.

    jawa at #228:

    “seems quite complex”?

    are you kidding?

    It’s definitely very complex (functionally).

    No doubt about it.

    OLV at #234:

    Yes, you’re right. I was too cautious in my statement. Thanks for the correction.

    Guys, you are definitely right!

    It is not only functionally complex. It is mind-bogglingly functionally complex!

    Just think: my tentative scenario is an oversimplification of what could happen just for one process. And it is just the simple backbone of it.

    Each differentiation process involves probably a lot of different specific procedures (a cell does not need only photoreceptors to differentiate, but a lot of other things).

    And we have myriads of different differentiation pathways and states.

    And those differentiation pathways must be integrated in the general plan of tissues, organs, organisms.

    And the information for all that must be present, in some way, in the genome and epigenome of the original embryo.

    Now, let’s ignore for the moment the component in the dynamic epigenome. Let’s consider for a moment only the storage memory that is the DNA sequence. That information is potentially the same in all cells of an organism, with few exceptions.

    Now, our darwinist friends have lived more or less in the conviction that most of that information was stored in the protein coding genes. 20000 genes, 1.5% of the human genome.

    But we know very well, now, that that is not the case.

    Let’s consider just the enhancers.

    So, let’s say we have 1 million enhancers in the human genome.

    Each of them is a very specific depository of information. We have seen that even enhancers which bind the same TF can have important differences that will condition what they do after having bound the master TF.

    So, let’s say that each of those 1 million enhancers has very specific information about one or more possible downstream procedures.

    That’s a lot of information, indeed!

    But the really important point is that this information is combinatorial. Those 1 million enhancers are used in groups, and build different pathways. Each of them can contribute to different procedures. The possibilities are, really, mind-boggling.
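
    To get a feeling for the size of that combinatorial space, here is a small arithmetic sketch. It makes strong simplifying assumptions: it takes the round figure of 1 million enhancers used above, borrows the group size of 300 from the imaginary master regulator example, and simply counts unordered groups. The helper name log10_comb is hypothetical, and the count says nothing about which groups are biologically meaningful.

```python
import math


def log10_comb(n, k):
    """Return log10 of the binomial coefficient C(n, k), via log-gamma.

    Working with logarithms avoids building the huge integer itself.
    """
    ln_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return ln_comb / math.log(10)


N_ENHANCERS = 1_000_000   # round figure used in the comment
GROUP_SIZE = 300          # borrowed from the imaginary example; an assumption

exponent = log10_comb(N_ENHANCERS, GROUP_SIZE)
print(f"C({N_ENHANCERS}, {GROUP_SIZE}) is roughly 10^{exponent:.0f}")
```

    Under these assumptions the number of possible groups has well over a thousand digits. It is only an upper bound on the raw search space, not an estimate of how many combinations are actually used.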

    And, of course, enhancers are only one component. One important component, but far from the only one. Of course we have the genes themselves, both coding and non coding, and the promoters, the TFs, the non coding RNAs, the mediator complex, and so on.

    This is just to offer a few thoughts! 🙂

  239. 239
    gpuccio says:

    OLV at #229 and #230:

    Thank you for the links! 🙂

  240. 240
    gpuccio says:

    OLV at #229:

    Wow!

    Pioneering, chromatin remodeling, and epigenetic constraint in early T-cell gene regulation by SPI1 (PU.1)

    https://genome.cshlp.org/content/early/2018/08/31/gr.231423.117#aff-3

    Abstract:

    SPI1 (also known as PU.1) is a dominant but transient regulator in early T-cell precursors and a potent transcriptional controller of developmentally important pro-T cell genes. Before T-lineage commitment, open chromatin is frequently occupied by PU.1, and many PU.1 sites lose accessibility when PU.1 is later downregulated. Pioneering activity of PU.1 was tested in this developmentally dynamic context, by quantitating the relationships between PU.1 occupancy and site quality and accessibility as PU.1 levels naturally declined in pro-T cell development, and by using stage-specific gain and loss of function perturbations to relate binding to effects on target genes. PU.1 could bind closed genomic sites, but rapidly opened many of them, despite the absence of its frequent collaborator, CEBPA. RUNX motifs and RUNX1 binding were often linked to PU.1 at open sites, but highly expressed PU.1 could bind its sites without RUNX1. The dynamic properties of PU.1 engagements implied that PU.1 binding affinity and concentration determine its occupancy choices, but with quantitative tradeoffs for occupancy between site sequence quality and stage-dependent site accessibility in chromatin. At non-promoter sites PU.1 binding criteria were more stringent than at promoters, and PU.1 was also much more effective as a transcriptional regulator at non-promoter sites where local chromatin accessibility depended on the presence of PU.1. Notably, closed chromatin presented a qualitative barrier to occupancy by the PU.1 DNA binding domain alone. Thus, effective pioneering at closed chromatin sites also depends on requirements beyond site recognition served by non-DNA binding domains of PU.1.

    Emphasis mine.

    What can I say? Lots of confirmations here.

    Of course, T cell differentiation is one major scenario. We have to cover it as soon as possible.

    Again, the concept of a pioneering and transient master regulator is exceedingly intriguing!

  241. 241
    gpuccio says:

    DATCG at #231:

    I am happy you solved the problem! 🙂

  242. 242
    gpuccio says:

    DATCG at #232:

    Interesting questions, as usual!

    “OK, that gives me a good picture of how you see these possible scenarios. But how do you think it’s guided? By surrounding environmental cues?”

    In a sense, I believe that the scenario develops according to a pre-defined plan written in the global information present in the zygote (genetic and epigenetic, and whatever).

    We have evidence of that because a similar and functional scenario develops each time a new organism is born. Flies develop as flies, mice as mice, humans as humans. With all the possible individual variation, with all the possible errors, the program works very well. Each single multicellular being on our planet is evidence of that.

    So, the program has the information to develop itself. Part of it is in the genome, both coding and non coding. Part of it is in the constantly changing epigenome, starting (conventionally) at the zygote.

    What about environmental cues?

    Well, I believe that contingent environmental factors can only act as background noise, variables that have to be factored and controlled by the program. For example, the embryo implantation in mammals and humans is subject to many unpredictable contingent variables (position, local conditions, and so on). The program must be able to manage random contingency, but it is certainly not guided by it.

    But there are environmental cues that are, instead, created by the program itself. Signals from local environment (the environment created by the program), or from other cells that are part of the program.

    IOWs, the program generates different parts and components, and those parts and components not only differentiate following their own procedures, but also are constantly exchanging signals, cues, correction, information, guidelines.

    It’s overwhelming.

    Cell-cell communication (for example, cytokines) is another major issue that I would like to discuss, sooner or later.

    You ask:

    “Awesome, but are the new photoreceptors a result of guided/directed conditions?”

    Well, I would say that everything that is complex and functional is the result of guided/directed conditions.

    Just a clarification. When I said:

    “The cell differentiates. Maybe it has, now, photoreceptors that did not exist before.”

    I was not referring to the evolution of a new system, only to the development of the system in a new cell type. The differentiated cell expresses structures that did not exist in the stem cell. This is devo, not evo.

    You say:

    “That interaction of “many master regulators” acting in sequence means an ability to recognize the new photoreceptors as legitimate additions and not foreign to the cell? … I’m trying to understand how these radical changes are permitted to survive.”

    I may be wrong, but I think that you are discussing here the evolutionary origin of the changes. But that was not what I was saying.

    In the developmental program, of course, the new structures are “expected”, indeed, “wanted”. The program generates them, and creates the right environment for them.

    The evolutionary origin of all that is another matter altogether. Of course, it certainly requires a global re-engineering of everything, a wide-ranging implementation. Gradual it could be, but always global in its perspectives and results.

    Regarding the “consistent with what we know”: I was just referring to the information in the recent papers I quoted about transcription regulation, and in many others that I did not quote for brevity. While my scenario is certainly tentative and extremely simplistic, it is based, however, on facts described in those papers.

  243. 243
    OLV says:

    gpuccio:

    I highly appreciate that you wrote such a good explanation.

    The big picture is getting a little clearer.

    Thanks.

  244. 244
    jawa says:

    OLV and Peter,

    Gracious gpuccio said that we were right but really we were off by a lot. He claims that his “tentative scenario” is an oversimplification of the real deal, which is not well understood yet. Also he elevated the complexity level to the mind-boggling category.

    As gpuccio himself has said this is fascinating indeed.

    I’m enjoying every minute of it.

  245. 245
    PeterA says:

    jawa,
    I totally agree with your comment.

    gpuccio’s comments @225 and 238 are real gems that must be read carefully, along with all his OPs and posts on this website. They are thought-provoking.

    gpuccio’s contributions to explaining the intelligent design concept are very valuable.

    Note that his discussion threads seem to get many visits from readers who don’t identify themselves. That’s interesting to me. I suspect some of those visitors disagree with him but don’t dare to jump into the discussion for lack of solid arguments that could withstand gpuccio’s well detailed explanations.

  246. 246
    jawa says:

    DATCG, Peter, OLV, UB, all,

    did you notice this statement at the end of #240?

    “Of course, T cell differentiation is one major scenario. We have to cover it as soon as possible.”

    The hint of another potential topic in a new OP by gpuccio makes me feel like Pavlov’s drooling dog after hearing the sound of the coming food.

  247. 247
    PaoloV says:

    I see y’all are having a party here.
    Did you invite your opponent George Castillo to your party too?
    🙂

  248. 248
    PeterA says:

    Paolo,
    the invitation is open to all who take evidence-based science seriously. Some folks may not qualify though.

  249. 249
    jawa says:

    Peter @245:

    You missed listing 242 as a gem too.

  250. 250
    jawa says:

    Paolo,
    That person doesn’t have much to celebrate here. 🙂

  251. 251
    gpuccio says:

    To all:

    This not too recent (2014) paper gives us a clear definition of “pioneering TF”:

    Pioneer transcription factors in cell reprogramming

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4265672/

    Abstract:

    A subset of eukaryotic transcription factors possesses the remarkable ability to reprogram one type of cell into another. The transcription factors that reprogram cell fate are invariably those that are crucial for the initial cell programming in embryonic development. To elicit cell programming or reprogramming, transcription factors must be able to engage genes that are developmentally silenced and inappropriate for expression in the original cell. Developmentally silenced genes are typically embedded in “closed” chromatin that is covered by nucleosomes and not hypersensitive to nuclease probes such as DNase I. Biochemical and genomic studies have shown that transcription factors with the highest reprogramming activity often have the special ability to engage their target sites on nucleosomal DNA, thus behaving as “pioneer factors” to initiate events in closed chromatin. Other reprogramming factors appear dependent on pioneer factors for engaging nucleosomes and closed chromatin. However, certain genomic domains in which nucleosomes are occluded by higher-order chromatin structures, such as in heterochromatin, are resistant to pioneer factor binding. Understanding the means by which pioneer factors can engage closed chromatin and how heterochromatin can prevent such binding promises to advance our ability to reprogram cell fates at will and is the topic of this review.

    So, the idea is that pioneering TFs are capable of engaging enhancers even if in inactive state, when the target DNA sequence is still “masked” by the nucleosome structure.

    However, higher order chromatin states may still be inaccessible even to pioneering TFs.

    This adds a new possible layer of regulation (in case we needed more of them ! 🙂 ).

    This very recent paper:

    Cryo-EM structure of the nucleosome containing the ALB1 enhancer DNA sequence

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5881032/

    deals with the pioneering task of understanding how pioneering TFs work at the structure level.

    Abstract
    Pioneer transcription factors specifically target their recognition DNA sequences within nucleosomes. FoxA is the pioneer transcription factor that binds to the ALB1 gene enhancer in liver precursor cells, and is required for liver differentiation in embryos. The ALB1 enhancer DNA sequence is reportedly incorporated into nucleosomes in cells, although the nucleosome structure containing the targeting sites for FoxA has not been clarified yet. In this study, we determined the nucleosome structure containing the ALB1 enhancer (N1) sequence, by cryogenic electron microscopy at 4.0 Å resolution. The nucleosome structure with the ALB1 enhancer DNA is not significantly different from the previously reported nucleosome structure with the Widom 601 DNA. Interestingly, in the nucleosomes, the ALB1 enhancer DNA contains local flexible regions, as compared to the Widom 601 DNA. Consistently, DNaseI treatments revealed that, in the nucleosome, the ALB1 enhancer (N1) DNA is more accessible than the Widom 601 sequence. The histones also associated less strongly with the ALB1 enhancer (N1) DNA than the Widom 601 DNA in the nucleosome. Therefore, the local histone-DNA contacts may be responsible for the enhanced DNA accessibility in the nucleosome with the ALB1 enhancer DNA.

    The pioneering TF here is FoxA, the enhancer ALB1, and the scenario is liver precursor cells, and liver differentiation in embryos. Again, a major developmental pathway.

    From the final conclusions:

    We found that the ALB1 enhancer (N1) DNA exhibited higher accessibility to DNaseI than the Widom 601 sequence in the nucleosome (figure 6), although the DNA-binding paths of both nucleosomes are not significantly different (figures 3–5). This higher accessibility suggests that the ALB1 enhancer (N1) DNA in the nucleosome may allow the efficient binding of pioneer TFs, such as FoxA, to the nucleosomal DNA. Interestingly, in the nucleosome, the ALB1 enhancer (N1) sequence contains flexible regions, and the DNA regions are located near a putative FoxA-binding region (figures 4 and 6). We also found that the histones are more weakly associated in the ALB1 nucleosome than in the Widom 601 nucleosome (figure 6). Therefore, the enhanced DNA accessibility and the weaker histone association found in the ALB1 nucleosome may be induced by the reduced local histone–DNA contacts. Unfortunately, the resolution of our cryo-EM structures (4.0–4.5 Å) is not high enough to clarify the detailed histone–DNA interactions in the ALB1 nucleosome. Further structural studies will be required to reveal the mechanism by which the association of the histones is weakened and how the ALB1 enhancer (N1) DNA sequence becomes accessible to pioneer TFs in the nucleosome.

    IOWs, there may be local conditions, involving specific histone-DNA interactions, that make the nucleosome DNA “a little more accessible”, even if it is definitely inaccessible.

    Again, this adds a new possible layer of regulation (in case we needed more of them ! 🙂 ).

  252. 252
    gpuccio says:

    To all:

    A very recent review about the role of enhancers:

    Enhancer Logic and Mechanics in Development and Disease.

    https://www.ncbi.nlm.nih.gov/pubmed/29759817

    Abstract:

    Enhancers are distally located genomic cis-regulatory elements that integrate spatiotemporal cues to coordinate gene expression in a tissue-specific manner during metazoan development. Enhancer function depends on a combination of bound transcription factors and cofactors that regulate local chromatin structure, as well as on the topological interactions that are necessary for their activity. Numerous genome-wide studies concur that the vast majority of disease-associated variations occur within non-coding genomic sequences, in other words the ‘cis-regulome’, and this underscores their relevance for human health. Advances in DNA sequencing and genome-editing technologies have dramatically expanded our ability to identify enhancers and investigate their properties in vivo, revealing an extraordinary level of interconnectivity underlying cis-regulatory networks. We discuss here these recently developed methodologies, as well as emerging trends and remaining questions in the field of enhancer biology, and how perturbation of enhancer activities/functions results in enhanceropathies.

    Highlights:

    Metazoan development requires the orchestration of hundreds of thousands of enhancers to establish precise spatiotemporal gene expression patterns.
    Enhancers commonly exist in a ‘suboptimal’ state with respect to their transcription factor binding affinities, and this evolutionary ‘suboptimization’ of both the sequence and binding motif arrangement is key to encoding enhancer tissue-specificity.
    Accumulating evidence suggests that enhancers regulate gene transcription by stimulating release of promoter-paused RNA polymerase II into productive elongation.
    Bidirectional transcription of enhancer DNA is now appreciated to be a general characteristic of active enhancers, and recent reports document numerous examples of how promoters can function as enhancers to stimulate long-range gene activation. Thus, the distinction between enhancers and promoters is becoming less apparent.
    Clusters of cis-regulatory elements appear to be highly interconnected in the nucleus, and these complex regulatory ‘hubs’ are organized into topological domains along the linear chromosome.

    OK, emphasis is mine: but the rest was already there. 🙂

  253. 253

    #252

    Well that just screams “Randomness!” doesn’t it, GP.

  254. 254
    PeterA says:

    #251:

    “This adds a new possible layer of regulation (in case we needed more of them !)”

    “Again, this adds a new possible layer of regulation (in case we needed more of them !)”

    That’s the most concise bottom-line summary I’ve read in a long time. In this case with a refreshing hint of humor.

    Thanks.

  255. 255
    PeterA says:

    UB @ 252:

    Yeah, right. Unguided randomness, to be more precise.

    🙂

  256. 256
    jawa says:

    Peter,

    First, UB’s comment is @ 253 referring to gpuccio’s comment @ 252.

    Second, your statement is a tautology. Is there a guided randomness?

  257. 257
    gpuccio says:

    UB at #253:

    It certainly screams! 🙂

    It seems a little like the telephone game at parties. The original whispered message is: “Design!”. But, as the game goes on, the final words seem to constantly become: “Random variation + Natural selection, of course!”.

    So, everybody at the party is happy.

    But, as the whispers become loud screams, maybe someone is going to wonder what has been happening! 🙂

  258. 258
    gpuccio says:

    To all:

    Now, another important question is:

    How are all these complex structures really organized in the nucleus?

    This is pioneering work. Of course, we know much from techniques like Hi-C seq and similar, and we know about TADs, but the detailed topology in the nucleus? That is much more difficult.

    Let’s see. The whole human genome is linearly about 2-3 metres long. But it is packed and arranged in a space that is about 6 micrometres in diameter. That’s so amazing that we often forget it.
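
    Just to make the arithmetic explicit, here is a minimal sketch using only the two figures quoted above (2-3 metres of DNA, a nucleus about 6 micrometres across). The function name linear_compaction is hypothetical, and the result is only the ratio of linear length to nuclear diameter, not a volume or a true packing density.

```python
def linear_compaction(genome_length_m, nucleus_diameter_m=6e-6):
    """Ratio between the stretched-out DNA length and the nuclear diameter."""
    return genome_length_m / nucleus_diameter_m


# Both ends of the 2-3 metre range quoted in the text.
for length_m in (2.0, 3.0):
    ratio = linear_compaction(length_m)
    print(f"{length_m} m of DNA vs a 6 micrometre nucleus: about {ratio:,.0f}-fold")
```

    So the DNA is several hundred thousand times longer than the container it sits in.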

    But is the nucleus just an empty container for the genome?

    Not at all. It has definite structure, and a very complex one.

    Just have a quick look at the “cell nucleus” page on Wikipedia (is George Castillo around?). In particular, section 2.5: “Other subnuclear bodies”. We will come back to that.

    So, new techniques are being developed to study nuclear spatial organization in more detail.

    Here is one:

    Mapping 3D genome organization relative to nuclear compartments using TSA-Seq as a cytological ruler

    http://jcb.rupress.org/content.....07108.long

    Abstract
    While nuclear compartmentalization is an essential feature of three-dimensional genome organization, no genomic method exists for measuring chromosome distances to defined nuclear structures. In this study, we describe TSA-Seq, a new mapping method capable of providing a “cytological ruler” for estimating mean chromosomal distances from nuclear speckles genome-wide and for predicting several Mbp chromosome trajectories between nuclear compartments without sophisticated computational modeling. Ensemble-averaged results in K562 cells reveal a clear nuclear lamina to speckle axis correlated with a striking spatial gradient in genome activity. This gradient represents a convolution of multiple spatially separated nuclear domains including two types of transcription “hot zones.” Transcription hot zones protruding furthest into the nuclear interior and positioning deterministically very close to nuclear speckles have higher numbers of total genes, the most highly expressed genes, housekeeping genes, genes with low transcriptional pausing, and super-enhancers. Our results demonstrate the capability of TSA-Seq for genome-wide mapping of nuclear structure and suggest a new model for spatial organization of transcription and gene expression.

    And here is another one:

    Higher-Order Inter-chromosomal Hubs Shape 3D Genome Organization in the Nucleus

    https://www.cell.com/cell/fulltext/S0092-8674(18)30636-6?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867418306366%3Fshowall%3Dtrue

    Highlights:

    SPRITE enables genome-wide mapping of higher-order interactions in the nucleus
    SPRITE uncovers two major inter-chromosomal hubs arranged around nuclear bodies
    3D distance of DNA regions to these hubs is based on their functional properties
    This organization constrains the overall 3D packaging of genomic DNA in the nucleus

    Summary:

    Eukaryotic genomes are packaged into a 3-dimensional structure in the nucleus. Current methods for studying genome-wide structure are based on proximity ligation. However, this approach can fail to detect known structures, such as interactions with nuclear bodies, because these DNA regions can be too far apart to directly ligate. Accordingly, our overall understanding of genome organization remains incomplete. Here, we develop split-pool recognition of interactions by tag extension (SPRITE), a method that enables genome-wide detection of higher-order interactions within the nucleus. Using SPRITE, we recapitulate known structures identified by proximity ligation and identify additional interactions occurring across larger distances, including two hubs of inter-chromosomal interactions that are arranged around the nucleolus and nuclear speckles. We show that a substantial fraction of the genome exhibits preferential organization relative to these nuclear bodies. Our results generate a global model whereby nuclear bodies act as inter-chromosomal hubs that shape the overall packaging of DNA in the nucleus.

    The two papers use two completely different techniques (named TSA-seq and SPRITE, respectively) to explore nuclear organization of chromosomes in relation to nuclear bodies, but they reach remarkably similar conclusions.

    In particular, they both agree on the importance of the nucleolus, and even more of nuclear speckles, for genome organization and transcription activity.

    Now, everybody knows, more or less, what the nucleolus is.

    But what are nuclear speckles?

    Here is a paper of 2011 that gives some answers:

    Nuclear Speckles

    http://cshperspectives.cshlp.o.....00646.full

    Abstract
    Nuclear speckles, also known as interchromatin granule clusters, are nuclear domains enriched in pre-mRNA splicing factors, located in the interchromatin regions of the nucleoplasm of mammalian cells. When observed by immunofluorescence microscopy, they usually appear as 20–50 irregularly shaped structures that vary in size. Speckles are dynamic structures, and their constituents can exchange continuously with the nucleoplasm and other nuclear locations, including active transcription sites. Studies on the composition, structure, and dynamics of speckles have provided an important paradigm for understanding the functional organization of the nucleus and the dynamics of the gene expression machinery.

    And here is a more recent update:

    Nuclear speckles: molecular organization, biological function and role in disease

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737799/

    ABSTRACT:

    The nucleoplasm is not homogenous; it consists of many types of nuclear bodies, also known as nuclear domains or nuclear subcompartments. These self-organizing structures gather machinery involved in various nuclear activities. Nuclear speckles (NSs) or splicing speckles, also called interchromatin granule clusters, were discovered as sites for splicing factor storage and modification. Further studies on transcription and mRNA maturation and export revealed a more general role for splicing speckles in RNA metabolism. Here, we discuss the functional implications of the localization of numerous proteins crucial for epigenetic regulation, chromatin organization, DNA repair and RNA modification to nuclear speckles. We highlight recent advances suggesting that NSs facilitate integrated regulation of gene expression. In addition, we consider the influence of abundant regulatory and signaling proteins, i.e. protein kinases and proteins involved in protein ubiquitination, phosphoinositide signaling and nucleoskeletal organization, on pre-mRNA synthesis and maturation. While many of these regulatory proteins act within NSs, direct evidence for mRNA metabolism events occurring in NSs is still lacking. NSs contribute to numerous human diseases, including cancers and viral infections. In addition, recent data have demonstrated close relationships between these structures and the development of neurological disorders.

    In brief, these wonderfully complex nuclear structures, that contain almost no DNA, seem to be favourite sites for two old friends: the spliceosome and ubiquitination procedures! 🙂

    Now, if you look at the results section in the second paper quoted in this comment (the one using SPRITE technology), you can read:

    To explore these inter-chromosomal interactions, we built a graph connecting all 1-Mb regions in the mouse genome containing a significant pairwise interaction (p value < 10^-10) (Figure 3E). These interactions segregate into two discrete “hubs,” such that a large number of contacts occur within each hub, but no interactions occur between the two hubs. These hubs contain different functional properties: the first hub corresponds to gene-poor and therefore transcriptionally inactive regions, whereas the second hub corresponds to gene-dense regions that are highly transcribed by RNA polymerase II, enriched for active chromatin modifications, and contain other features of active transcription (Figures 3F and S3J; see the STAR Methods). Based on these properties, we refer to these hubs as the “inactive hub” and “active hub,” respectively.

    RNA-DNA SPRITE Reveals that the Inactive Interchromosomal Hub Is Organized around the Nucleolus.

    The Active Inter-chromosomal Hub Is Organized around Nuclear Speckles

    We considered that nuclear bodies might play an important role in defining the overall arrangement of genomic DNA in the nucleus because they organize large hubs of inter-chromosomal interactions.

    Regions that are closer to nuclear speckles are strongly associated with high levels of active Pol II transcription

    And so on…
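
    The graph construction described in the quoted passage can be illustrated with a toy sketch in Python. It assumes that the threshold reads p < 10^-10, that each interaction is a (region, region, p value) triple, and that a “hub” can be approximated as a connected component of the thresholded graph. The function name and the region labels are hypothetical, and this is not the authors' pipeline.

```python
from collections import defaultdict

P_CUTOFF = 1e-10  # assumed reading of the threshold quoted from the paper


def find_hubs(interactions, cutoff=P_CUTOFF):
    """Return the connected components of the graph whose edges are the
    pairwise interactions with a p value below the cutoff."""
    graph = defaultdict(set)
    for region_a, region_b, p_value in interactions:
        if p_value < cutoff:
            graph[region_a].add(region_b)
            graph[region_b].add(region_a)

    hubs, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting every region reachable from `start`.
        stack, hub = [start], set()
        while stack:
            node = stack.pop()
            if node in hub:
                continue
            hub.add(node)
            stack.extend(graph[node] - hub)
        seen |= hub
        hubs.append(hub)
    return hubs


# Hypothetical toy data: (region, region, p value).
toy_interactions = [
    ("chr1:10Mb", "chr2:40Mb", 1e-12),
    ("chr2:40Mb", "chr5:7Mb", 1e-15),
    ("chr3:90Mb", "chr8:12Mb", 1e-11),
    ("chr1:10Mb", "chr3:90Mb", 1e-3),  # not significant, so no edge is added
]
hubs = find_hubs(toy_interactions)
```

    With these toy values the regions fall into two groups that share no significant contact, which is the kind of segregation the paper reports for the inactive and active hubs.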

    So, the plot definitely thickens! 🙂

    Believe me, I am not making all this up!

  259. 259
    gpuccio says:

    To all:

    Let’s go deeper into nuclear topology.

    This recent paper is about interaction between different chromosomes, and their newly discovered roles in transcription regulation. With a touch of romanticism:

    Interchromosomal interactions: A genomic love story of kissing chromosomes

    http://jcb.rupress.org/content......201806052

    Abstract:

    Nuclei require a precise three- and four-dimensional organization of DNA to establish cell-specific gene-expression programs. Underscoring the importance of DNA topology, alterations to the nuclear architecture can perturb gene expression and result in disease states. More recently, it has become clear that not only intrachromosomal interactions, but also interchromosomal interactions, a less studied feature of chromosomes, are required for proper physiological gene-expression programs. Here, we review recent studies with emerging insights into where and why cross-chromosomal communication is relevant. Specifically, we discuss how long noncoding RNAs (lncRNAs) and three-dimensional gene positioning are involved in genome organization and how low-throughput (live-cell imaging) and high-throughput (Hi-C and SPRITE) techniques contribute to understand the fundamental properties of interchromosomal interactions.

    The paper is open access, and it well deserves to be read.

    Here are just the subtitles of different sections:

    Principles of chromosomal structure and nuclear organization

    Kissing chromosomes: NHCCs

    NHCCs affect distinct transcriptional programs of biological pathways

    NHCCs and nuclear bodies

    Interchromosomal contacts between homologous chromosomes (transvection)

    Toward identifying NHCCs with molecular techniques

    Location matters for NHCCs in health and in disease

    LncRNAs are involved in the 3D organization of NHCCs

    Watching kissing chromosomes in real time: live-cell imaging of NHCCs

    Perspective

    Nuclear speckles are also mentioned. Especially interesting is the information about the formation of two very peculiar nuclear bodies:

    a) The nucleolus:

    “In human nuclei, about 300 ribosomal genes located on five different acrocentric chromosomes (six in mouse) come into physical proximity to build the ribosomal preassembly in the nucleus (Fig. 1 B; Németh et al., 2010; Pliss et al., 2015; McStay, 2016). This spatial formation of the nucleolus is a conserved phenomenon and validates that nonhomologous chromosomes can intermingle in a nonrandom manner in all nuclei.”

    b) The olfactosome:

    “A structure equally as fascinating is the OR gene cluster, in which individual NHCCs allow the expression of single ORs in each cell to create a diverse repertoire of OR expression at the tissue level. At any given time, only a few of the about 1,400 OR genes located on 18 different chromosomes converge in the same interchromosomal space (Horta et al., 2018). The regulation of OR genes is orchestrated by binding of Ldb1, Lhx2, and Ebf transcription factors to highly similar transcription factor motifs of multiple enhancers on different chromosomes, thereby leading to nondeterministic mono-allelic OR gene expression (Lomvardas et al., 2006; Markenscoff-Papadimitriou et al., 2014; Monahan and Lomvardas, 2015; Monahan et al., 2017, 2018). Remarkably, the monogenic and mono-allelic gene expression of OR genes is explained by the spatial clustering of inactive genes to the same heterochromatic foci in the olfactosome (Fig. 1, C and D; Clowney et al., 2012). Recent in situ Hi-C experiments of FACS-sorted, differentiated olfactory sensory neurons determined that, at very large scales (i.e., 500-kb resolution), NHCCs between OR genes are highly specific and frequent, and that they consist of multiple different chromosomes to regulate selectively and specifically the transcription of each individual OR gene (Horta et al., 2018).”

  260. 260
    jawa says:

    Wow!
    How does gpuccio get all these interesting papers so easily?

  261. 261
    jawa says:

    “newly discovered roles” ???

  262. 262
    jawa says:

    “four-dimensional organization” ???

    huh?

  263. 263
    jawa says:

    DOI: 10.1083/jcb.201806052 | Published September 4, 2018

    Wow!

    This is as fresh from the oven as one can get.

  264. 264
    gpuccio says:

    jawa at #262:

    Four dimensional: including time.

    Human Genome’s Spirals, Loops and Globules Come into 4-D View

    https://www.scientificamerican.com/article/human-genome-s-spirals-loops-and-globules-come-into-4-d-view-video/

    Funny video here! 🙂

  265. 265
    OLV says:

    jawa,

    These papers are being published at a fast rate lately. Just look for them and you’ll find them. Obviously gpuccio knows what he’s looking for.

    As gpuccio has said in this discussion, there are many things still unknown or poorly understood at best. Therefore there’s plenty of room to still find newly discovered roles.

    4D organization perhaps refers to spatiotemporal synchronization or arrangement. “in the right place at the right time”

    Indeed this last paper gpuccio just posted is another gem in the growing collection within this thread.

  266. 266
    gpuccio says:

    To all:

    A collateral discussion strictly related to the issues debated here has started at this thread:

    https://uncommondescent.com/genetics/bee-genome-changes-dramatically-through-life/#comment-665055

    It could be interesting to consider the two discussions as part of a whole. 🙂

  267. 267
    gpuccio says:

    To all:

    There is another aspect we have yet to consider.

    Stem cells are known to exist in a delicate balance between two possible cell fates:

    a) Cell renewal

    b) Differentiation

    The ability to keep a balance between these two types of fate is the foundation to maintain stem cell compartments in the organism, while at the same time supporting the different differentiation cascades that derive from those compartments.

    Now, for each cell in the stem cell compartment the choice between the two fates, at each cell division, seems to be a remarkable balance of stochastic factors and control components. IOWs, the individual decision cannot easily be anticipated, but the behaviour of the whole compartment is strictly controlled.

    How is that obtained? I don’t think it is really understood. But, of course, different epigenetic transcription regulations contribute to that balance.

    That said, here is a very recent paper that connects, in that scenario, two important factors.

    One of them has been a major part of the discussions in this thread: histone modifications.

    But the second factor is more of a novelty in this context: alternative splicing.

    Here is the paper:

    Alternative splicing links histone modifications to stem cell fate decision.

    https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1512-3

    Abstract:

    BACKGROUND:
    Understanding the embryonic stem cell (ESC) fate decision between self-renewal and proper differentiation is important for developmental biology and regenerative medicine. Attention has focused on mechanisms involving histone modifications, alternative pre-messenger RNA splicing, and cell-cycle progression. However, their intricate interrelations and joint contributions to ESC fate decision remain unclear.

    RESULTS:
    We analyze the transcriptomes and epigenomes of human ESC and five types of differentiated cells. We identify thousands of alternatively spliced exons and reveal their development and lineage-dependent characterizations. Several histone modifications show dynamic changes in alternatively spliced exons and three are strongly associated with 52.8% of alternative splicing events upon hESC differentiation. The histone modification-associated alternatively spliced genes predominantly function in G2/M phases and ATM/ATR-mediated DNA damage response pathway for cell differentiation, whereas other alternatively spliced genes are enriched in the G1 phase and pathways for self-renewal. These results imply a potential epigenetic mechanism by which some histone modifications contribute to ESC fate decision through the regulation of alternative splicing in specific pathways and cell-cycle genes. Supported by experimental validations and extended datasets from Roadmap/ENCODE projects, we exemplify this mechanism by a cell-cycle-related transcription factor, PBX1, which regulates the pluripotency regulatory network by binding to NANOG. We suggest that the isoform switch from PBX1a to PBX1b links H3K36me3 to hESC fate determination through the PSIP1/SRSF1 adaptor, which results in the exon skipping of PBX1.

    CONCLUSION:
    We reveal the mechanism by which alternative splicing links histone modifications to stem cell fate decision.

    Well, I have blasted the two isoforms shown in Figure 5c of the paper, PBX 1a (430 AA) and PBX 1b (347 AA), that according to the paper seem to have such differentiated roles in stem cell fate.

    The N terminal part, the first 333 AAs, is completely identical.

    The C terminal part (97 AAs for PBX 1a, 14 AAs for PBX 1b) is completely different.

    That seems to make the whole difference in role.
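
    The comparison above (a shared N terminal part followed by divergent C terminal tails) can be sketched in a few lines of Python. The sequences below are short made-up strings, not the real PBX1 isoforms, and the function names are hypothetical. A real comparison would use an alignment tool such as BLAST, as was done here.

```python
def shared_prefix_length(seq_a, seq_b):
    """Number of leading residues that are identical in both sequences."""
    count = 0
    for residue_a, residue_b in zip(seq_a, seq_b):
        if residue_a != residue_b:
            break
        count += 1
    return count


def split_isoforms(seq_a, seq_b):
    """Return (shared prefix length, tail of seq_a, tail of seq_b)."""
    n = shared_prefix_length(seq_a, seq_b)
    return n, seq_a[n:], seq_b[n:]


# Hypothetical toy sequences standing in for two isoforms.
toy_a = "MDEQPRLMHSHAGV" + "GMAGHPGLSQHLQ"
toy_b = "MDEQPRLMHSHAGV" + "VSTAA"
shared, tail_a, tail_b = split_isoforms(toy_a, toy_b)

# Consistency check on the lengths reported in the comment:
# 333 shared residues plus each tail must give the isoform lengths.
assert 333 + 97 == 430  # PBX 1a
assert 333 + 14 == 347  # PBX 1b
```

    A common prefix only shows where the two sequences first diverge. It does not by itself prove that the tails share no similarity, which is what the alignment adds.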

  268. 268
    George Castillo says:

    Please excuse my absence, I completely forgot about our little conversation.

    Thank you for the plot, the correlation is higher than I expected but I’m not sure why you can’t use more of the 20 or so proteins in Durston’s paper.

    Anyways, have you plotted bits vs protein length as in Durston’s figure 2a for the human proteome using your method? You should do that with comparisons to each of the organisms from your evolutionary history plots.

    Also, for the evolutionary history plots, you should show shading or error bars that represent variation in the data rather than just single points. It would make things much more believable to see that.

    Another interesting plot would be the evolutionary history plot for the human proteome, but grouped by size. Have you made any of these very basic plots to look for any biases in your method?

  270. 270
    gpuccio says:

    George Castillo:

    Here are your answers:

    1) Durston’s paper lists 35 protein families. My method cannot be applied to many of them because they are not present in the human proteome. I use human proteins as probes to measure functional complexity, therefore I can only do that with proteins present in the human proteome. Moreover, my database is restricted to human verified proteins in Uniprot, that is, the about 20000 reliable reference sequences identified in humans. So, proteins like Vif (Virion infectivity factor), Viral helicase1, Bac luciferase, SecY, DctM and many others in the list have no clear homologues in the human proteome.

    2) The relationship with length is very strong in my data as in Durston’s, as expected. I am adding two scatterplots for deuterostomia – not vertebrates and for cartilaginous fish, the two groups that are important for the computation of the jump in vertebrates, at the end of the OP. Consider that my values are given for about 20000 proteins.

    3) My evolutionary history plots represent the individual value for individual proteins. So, I do not understand what error bars you are referring to. If you want the distribution of the reference values for organism groups, I can give you the standard deviation values, even if I don’t understand what their utility is in this context. However, here they are:

    Cnidaria: mean 0.5432765 baa; sd 0.4024939 baa

    Cephalopoda: mean 0.5302676 baa; sd 0.3949502 baa

    Deuterostomia (not vertebrates): mean 0.6705278 baa; sd 0.4280898 baa

    Cartilaginous fish: mean 0.9491001 baa; sd 0.5180335 baa

    Bony fish: mean 1.06373 baa; sd 0.4992876 baa

    Amphibians: mean 1.106878 baa; sd 0.509575 baa

    Crocodiles: mean 1.2175 baa; sd 0.5166932 baa

    Marsupialia: mean 1.354032 baa; sd 0.5016414 baa

    Afrotheria: mean 1.628872 baa; sd 0.43412 baa

    However, as explained, these are just standard deviations of the values for the whole human proteome as compared to each group of organisms. In no way are they “error bars”. Moreover, as you can certainly understand from the values of the standard deviations, the distributions here are certainly not normal.

    When comparing values for different groups of proteins, indeed, I always use non parametric methods, such as the Wilcoxon test for independent groups. For example, I have identified a group of 144 human proteins which are involved, according to GO functions, in neuronal differentiation. You may wonder if the jump from prevertebrate to vertebrate human conserved information is significantly higher in this group, as compared to all other human proteins.

    And it is. The median value in the neuronal differentiation group is 0.4534413 baa, as compared to the median value of 0.2629808 baa in the rest of human proteins. The difference is highly significant: the p value, as computed by the Wilcoxon test, is 1.202e-12. I am adding the boxplot for that comparison at the end of the OP.

    This is just an example of how a correct analysis can be done using my values as applied to different protein groups.

    4) Not sure what you mean. I have already given the plots by size at point 2.
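
    The non parametric comparison described at point 3 can be sketched as follows. This is a pure-Python Wilcoxon rank-sum (Mann-Whitney) test using the normal approximation, given only as an illustration: the two samples are made-up toy values, not the real protein data, the function name is hypothetical, and a statistics package would normally be used instead.

```python
import math
from statistics import median


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test for two independent samples.

    Uses midranks for tied values and the normal approximation without
    a tie correction of the variance. Returns (U, z, p)."""
    values = sorted(list(x) + list(y))
    ranks = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        ranks[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(ranks[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2          # Mann-Whitney U for x
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided p value
    return u, z, p


# Hypothetical toy values (baa) for a protein group and for the rest.
group = [0.41, 0.52, 0.47, 0.60, 0.39, 0.55, 0.44, 0.50]
rest = [0.20, 0.31, 0.26, 0.18, 0.35, 0.29, 0.24, 0.33]

u, z, p = rank_sum_test(group, rest)
print(median(group), median(rest), p < 0.01)
```

    The test works on ranks only, so it makes no assumption that the values are normally distributed, which is why it fits distributions like those listed above.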

  271. 271
    gpuccio says:

    To all:

    Here is something more about pioneer TFs:

    Pioneer transcription factors shape the epigenetic landscape

    https://www.ncbi.nlm.nih.gov/pubmed/29507097

    Abstract:

    Pioneer transcription factors have the unique and important role of unmasking chromatin domains during development to allow the implementation of new cellular programs. Compared with those of other transcription factors, this activity implies that pioneer factors can recognize their target DNA sequences in so-called compacted or “closed” heterochromatin and can trigger remodeling of the adjoining chromatin landscape to provide accessibility to nonpioneer transcription factors. Recent studies identified several steps of pioneer action, namely rapid but weak initial binding to heterochromatin and stabilization of binding followed by chromatin opening and loss of cytosine-phosphate-guanine (CpG) methylation that provides epigenetic memory. Whereas CpG demethylation depends on replication, chromatin opening does not. In this Minireview, we highlight the unique properties of this transcription factor class and the challenges of understanding their mechanism of action.

    The first step in pioneer action is the initial binding (Fig. 2C) to permissive heterochromatin (Fig. 2B) and it appears to be rapid (eg less than 30 min. for Pax7, (45)). This is followed by a phase of binding stabilization (within 24h for Pax7) that may or may not be paralleled by nucleosomal changes that increase accessibility (31) and to appearance of low levels of the H3K4me1 mark in the center of target enhancers (Fig. 2D). These “Accessible” or “Primed” enhancers can undergo the final step of enhancer activation that involves the binding of other nonpioneer TFs, nucleosome depletion and deposition of the active enhancer mark H3K27ac that is associated with the histone acetylase activity of the general coactivator p300.

    Perspective:

    As exemplified in this review, the critical aspects of pioneer action are still the least understood. First and foremost, the molecular basis for pioneer access to their target DNA sequences in closed chromatin remains obscure. There may be more than one underlying mechanism as the mechanism proposed for FoxA interaction with nucleosomal DNA, namely its putative linker H1 mimicry binding interactions, does not seem to apply to other pioneers. The question thus remains whole for other pioneers and this highlights the fact that different pioneers may not only use different mechanisms but also may differ in their accessibility to various “flavors” of heterochromatin.

    Emphasis mine.

    Again, we see here highest specificity, probably through different functional mechanisms.

    And even heterochromatin, once believed to be functionally inert, seems to come in many different “flavors”. 🙂

  272. 272
    OLV says:

    gpuccio (266):

    Good suggestion. Thanks.

  273. 273
    OLV says:

    gpuccio (267):

    “The N terminal part, the first 333 AAs, is completely identical.
    The C terminal part (97 AAs for PBX 1a, 14 AAs for PBX 1b) is completely different.
    That seems to make the whole difference”

    Very convincing evidence. Thanks.

  274. 274
    OLV says:

    gpuccio (267):

    “The N terminal part, the first 333 AAs, is completely identical.
    The C terminal part (97 AAs for PBX 1a, 14 AAs for PBX 1b) is completely different.
    That seems to make the whole difference in role.”

    How many bits of new information is in that difference ?

  275. 275
    OLV says:

    gpuccio (267):

    “But the second factor is more of a novelty in this context: alternative splicing.”

    Indeed the plot continues to thicken. 🙂

    Interesting paper.

  276. 276
    OLV says:

    Glad to see George Castillo back in the discussion.

  277. 277
    OLV says:

    gpuccio (270):

    Though you’ve explained your analysis methodology several times, it’s always refreshing to see it again.

    Thanks

  278. 278
    OLV says:

    gpuccio (271):

    Really fascinating topic on pioneer TFs associated with epigenetic mechanisms that lead to high specificity.
    This screams conscious design, doesn’t it?
    Thanks.

  279. 279
    OLV says:

    DNA Methylation and Regulatory Elements during Chicken Germline Stem Cell Differentiation

    multiple epigenetic events, including DNA methylation, histone modifications, and non-coding RNAs, may act synergistically instead of single regulation mode during embryonic development, and this kind of regulation mode owns typical cell lineage specification.

  280. 280
    gpuccio says:

    To all:

    Of course we know that epigenetic regulations of transcription happen in time, but probably that specific aspect is often underemphasized. While many studies have, understandably, focused on transcriptional landscapes during differentiation, little is known of how epigenetic regulations of transcription change in differentiated cells in response to outer stimuli.

    IOWs, what I have called, in Fig. 2 in the OP, “Dynamic adaptation of cell in stable state”.

    But new facts are accumulating about that aspect too.

    The following paper is about the transcriptional response at the level of histone modifications in mouse dendritic cells after lipopolysaccharide (LPS) stimulation.

    Dendritic cells are important cells in the immune response, and LPS is a standard stimulator of the immune system.

    Here is the paper:

    Waves of chromatin modifications in mouse dendritic cells in response to LPS stimulation

    https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1524-z

    Abstract:

    Background
    The importance of transcription factors (TFs) and epigenetic modifications in the control of gene expression is widely accepted. However, causal relationships between changes in TF binding, histone modifications, and gene expression during the response to extracellular stimuli are not well understood. Here, we analyze the ordering of these events on a genome-wide scale in dendritic cells in response to lipopolysaccharide (LPS) stimulation.

    Results
    Using a ChIP-seq time series dataset, we find that the LPS-induced accumulation of different histone modifications follows clearly distinct patterns. Increases in H3K4me3 appear to coincide with transcriptional activation. In contrast, H3K9K14ac accumulates early after stimulation, and H3K36me3 at later time points. Integrative analysis with TF binding data reveals potential links between TF activation and dynamics in histone modifications. Especially, LPS-induced increases in H3K9K14ac and H3K4me3 are associated with binding by STAT1/2 and were severely impaired in Stat1−/− cells.

    Conclusions
    While the timing of short-term changes of some histone modifications coincides with changes in transcriptional activity, this is not the case for others. In the latter case, dynamics in modifications more likely reflect strict regulation by stimulus-induced TFs and their interactions with chromatin modifiers.

    Waves of chromatin modifications? Sounds interesting, doesn’t it?

  281. 281
    gpuccio says:

    To all:

    Speaking of cross talk. Both transcription regulation and ubiquitination have been the subject of my OPs. So, it’s beautiful to see them together:

    TRIM59 regulates autophagy through modulating both the transcription and the ubiquitination of BECN1

    https://www.tandfonline.com/doi/abs/10.1080/15548627.2018.1491493?journalCode=kaup20

    ABSTRACT:

    Macroautophagy/autophagy is a multistep cellular process that sequesters cytoplasmic components for lysosomal degradation. BECN1/Beclin1 is a central protein that assembles cofactors for the formation of a BECN1-PIK3C3-PIK3R4 complex to trigger the autophagy protein cascade. Discovering the regulators of BECN1 is important for understanding the mechanism of autophagy induction. Here, we demonstrate that TRIM59, a tripartite motif protein, plays an important role in autophagy regulation in non-small cell lung cancer (NSCLC). On the one hand, TRIM59 regulates the transcription of BECN1 through negatively modulating the NFKB pathway. On the other hand, TRIM59 regulates TRAF6 induced K63-linked ubiquitination of BECN1, thus affecting the formation of the BECN1-PIK3C3 complex. We further demonstrate that TRIM59 can mediate K48-linked ubiquitination of TRAF6 and promote the proteasomal degradation of TRAF6. Taken together, our findings reveal novel dual roles for TRIM59 in autophagy regulation by affecting both the transcription and the ubiquitination of BECN1.

    So, the same molecule does both things: it regulates transcription and it regulates ubiquitination.

    The function for TRIM59 at Uniprot is as follows:

    “May serve as a multifunctional regulator for innate immune signaling pathways.”

    Consistently, TRIM59 (as seen in humans) is almost absent in pre-vertebrates (0.258 baa) and has a definite, important jump in cartilaginous fish (+0.695 baa).
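
    The arithmetic behind those two values can be made explicit. A minimal sketch, assuming that baa stands for bits per amino acid (best BLAST bitscore divided by the length of the human protein) and that the “jump” is the difference in baa between two groups of organisms. The function names are hypothetical.

```python
def baa(bitscore, protein_length):
    """Bits per amino acid (assumed definition): bitscore / length."""
    return bitscore / protein_length


def information_jump(baa_older_group, baa_newer_group):
    """Difference in conserved information between two organism groups."""
    return baa_newer_group - baa_older_group


# Values quoted in the comment for TRIM59.
pre_vertebrates = 0.258             # baa in pre-vertebrates
jump_to_cartilaginous_fish = 0.695  # baa gained in cartilaginous fish

# Implied value in cartilaginous fish.
cartilaginous_fish = pre_vertebrates + jump_to_cartilaginous_fish
print(round(cartilaginous_fish, 3))  # 0.953
```

    For comparison, the mean listed at #270 for the whole human proteome against cartilaginous fish is 0.9491001 baa, so the implied value for TRIM59 is close to that mean.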

  282. 282
    OLV says:

    gpuccio @280:

    “Waves of chromatin modifications? Sounds interesting, doesn’t it?”

    Intriguing. The plot continues to thicken. For how long? 🙂

  283. 283
    OLV says:

    gpuccio @281:

    More plot thickening.

    Those multitasking proteins humble me.
    I hardly can do one task at a time.
    🙂

  284. 284
    OLV says:

    All these interesting papers gpuccio has been pulling out of his magic hat here lately add more encouraging news for the neo-Darwinian folks, for they seem to show the amazing capacity of RV+NS to produce all that marvelous machinery. 🙂

  285. 285
    DATCG says:

    Excellent paper references guys 🙂

    Gpuccio @281,

    the TRIM59 Dual purpose usage increases the chances of degradation and disease I assume if wrongly mutated?

    I know there’s much to know regarding all the sequences, but hmmmm, seems delicate from a regulatory view and must therefore be tightly controlled to reduce errors, to protect it.

    Am I on the correct track? The wrong mutation might have multiple dire consequences for regulatory requirements.

    The other question is, how many “multi-function” regulators exist?

    And how does a blind, unguided series of chances create a multi-functional protein that creates two transcript variations by alternative splicing?

    The coordination and timing must be precise, correct? If the alternative splice is wrong, if any of the domains incur mutations, then all could be shut down, irrelevant, and marked for degradation by the UPS (Ubiquitin Proteasome System) or other degrading mechanisms.

    I’m uncertain if combinatorial factors increase for such a sharing of specificity and therefore limit unguided mutations as a necessary protection against change.

  286. 286
    OLV says:

    DATCG,

    Intriguing issues indeed.

  287. 287
    gpuccio says:

    To all:

    Nuclear receptors (NRs) are a special type of TF. They are triggered by hormones, or hormone-like molecules, that reach the nucleus. Binding of the ligand to its specific NR activates the TF activity, and therefore the related cascade of transcription modifications. NRs include the receptors for thyroid hormones, steroid hormones and many other ligands.

    NRs are interesting because they have a rather constant structure, made essentially of three domains (and a few hinge regions):

    1) A ligand binding domain, usually C terminal.

    2) A DNA binding domain, like all TFs.

    3) An intrinsically disordered N terminal domain (NTD)

    The intrinsically disordered region in the NTD is, of course, especially interesting. There is much evidence of its important functional role in many NRs.

    The following paper is a good example:

    Intrinsically disordered N-terminal domain of the Helicoverpa armigera Ultraspiracle stabilizes the dimeric form via a scorpion-like structure

    https://www.sciencedirect.com/science/article/pii/S0960076018301924?via%3Dihub

    Abstract
    Nuclear receptors (NRs) are a family of ligand-dependent transcription factors activated by lipophilic compounds. NRs share a common structure comprising three domains: a variable N-terminal domain (NTD), a highly conserved globular DNA-binding domain and a ligand-binding domain. There are numerous papers describing the molecular details of the latter two globular domains. However, very little is known about the structure-function relationship of the NTD, especially as an intrinsically disordered fragment of NRs that may influence the molecular properties and, in turn, the function of globular domains. Here, we investigated whether and how an intrinsically disordered NTD consisting of 58 amino acid residues affects the functions of the globular domains of the Ultraspiracle protein from Helicoverpa armigera (HaUsp). The role of the NTD was examined for two well-known and easily testable NR functions, i.e., interactions with specific DNA sequences and dimerization. Electrophoretic mobility shift assays showed that the intrinsically disordered NTD influences the interaction of HaUsp with specific DNA sequences, apparently by destabilization of HaUsp-DNA complexes. On the other hand, multi-angle light scattering and sedimentation velocity analytical ultracentrifugation revealed that the NTD acts as a structural element that stabilizes HaUsp homodimers. Molecular models based on small-angle X-ray scattering indicate that the intrinsically disordered NTD may exert its effects on the tested HaUsp functions by forming an unexpected scorpion-like structure, in which the NTD bends towards the ligand-binding domain in each subunit of the HaUsp homodimer. This structure may be crucial for specific NTD-dependent regulation of the functions of globular domains in NRs.

    I am sure that intrinsically disordered regions will be a source of great surprise, as research goes on.

    I am not sure of what “an unexpected scorpion-like structure” really is or looks like, but it definitely sounds intriguing! 🙂

  288. 288
    gpuccio says:

    To all:

    One of the points that I made just from the beginning of this OP is that the working information in any cell, including the zygote, is always the sum total of genetic + specific epigenetic information.

    The zygote inherits most of its epigenetic information from the oocyte. IOWs, from the mother, another organism.

    Here is an interesting paper about the role that non coding RNAs, and in particular intron-derived non coding RNAs, derived from the maternal oocyte, can have in embryo development in Drosophila:

    Generation of Drosophila sisRNAs by Independent Transcription from Cognate Introns

    https://www.sciencedirect.com/science/article/pii/S2589004218300658?via%3Dihub

    Summary:

    Although stable intronic sequence RNAs (sisRNAs) are conserved in plants and animals, their functional significance is still unclear. We identify a pool of polyadenylated maternally deposited sisRNAs in Drosophila melanogaster. These sisRNAs can be generated by independent transcription from the cognate introns. The ovary-specific poly(A) polymerase Wispy mediates the polyadenylation of maternal sisRNAs and confers their stability as maternal transcripts. A developmentally regulated sisRNA sisR-3 represses the expression of a long noncoding RNA CR44148 and is required during development. Our results expand the pool of sisRNAs and suggest that sisRNAs perform regulatory functions during development in Drosophila.

  289. 289
    OLV says:

    gpuccio (288):

    There you go again with another interesting recent paper that seems pulled out of a magician’s hat. 🙂
    Thanks.

    “The identification of abundant polyadenylated maternal sisRNAs in Drosophila suggests that this paradigm may be more widely conserved than previously thought.”

  290. 290
    OLV says:

    gpuccio (287):

    “I am sure that intrinsically disordered regions will be a source of great surprise, as research goes on.”

    Well, the paper you cite is itself a source of jaw-dropping, eyebrow-raising information:

    “As a result of alternative splicing, different isoforms of NRs are generated, which are often characterized by different spatial and temporal distributions within various cells.”

    Indeed the plot continues to thicken at a fast pace.

    Thanks

  291. 291
    OLV says:

    Regarding the plot thickening, how long could it take?
    Shouldn’t there be a time when the plot thickening process should start to slow down until it eventually stops? Are we approaching that point yet?

  292. 292
    gpuccio says:

    OLV:

    “shouldn’t there be a time when the plot thickening process should start to slow down until it eventually stops?”

    Maybe. Maybe not.

    “Are we approaching that point yet?”

    No.

  293. 293
    gpuccio says:

    To all:

    About OLV’s comment at #291, I want to point (again) to an old OP by our friend GilDodgen, a brief and clear argument that expresses a very deep truth:

    ID and the Trajectory of Observational Resolution

    https://uncommondescent.com/intelligent-design/id-and-the-trajectory-of-observational-resolution/

    If you have time, read it. Definitely. 🙂

  294. 294
    PeterA says:

    gpuccio,

    That old OP you suggested is a real gem, which remains as valid (or more so) as when it was prophetically written.

    I read it all and wanted to thank you for calling my attention to it.

  295. 295
    PavelU says:

    gpuccio @292:

    Your answers don’t seem encouraging, except for those looking for long term job security in biology research.

  296. 296
    jawa says:

    PavelU,

    What’s discouraging in gpuccio’s comment?

  297. 297
    gpuccio says:

    To all:

    This open access paper, while not extremely recent (2015), is a very clear review about the role of enhancers.

    The selection and function of cell type-specific enhancers

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4517609/

    I quote a section that is very relevant to the discussions we had here about the role of pioneer TFs:

    Enhancer selection:

    The vast number of potential cis-regulatory elements in the genome and the cell-type selectivity with which they are utilized raises the question as to the series of events whereby unique enhancer repertoires are selected. Many lines of evidence indicate that enhancer selection is initially driven by so-called pioneer factors, exemplified by FOXA1, that are able to bind to their recognition motifs within the context of compacted chromatin [31]. By opening the conformation of the chromatin and initiating the process of enhancer selection, such pioneering factors can function as key cell lineage-determining transcription factors (LDTFs) to drive lineage-specific transcription programs. However, most sequence-specific transcription factors, including those that function as pioneer factors, recognize relatively short DNA sequences (of about 6 to 12 base pairs), and their typical DNA recognition motifs exhibit varying levels of degeneracy. This means that most sequence-specific transcription factors have millions of potential binding sites in the mammalian genome. Yet, chromatin immunoprecipitation followed by sequencing (ChIP-Seq) experiments have indicated that they bind only a small subset of all potential sites, and that a large fraction of the observed binding is associated with cell type-specific enhancers [32]. Cell type-specific binding sites often harbor motifs for additional pioneer factors, and experimental data strongly suggest that pioneer factors act in concert to jointly displace nucleosomes [33, 34]. Here, we review evidence supporting a model in which pioneer factors, or LDTFs, prime cell type-specific enhancers through collaborative interactions

    The idea that the function of pioneer TFs is based on collaborative interactions is very interesting, because we have seen that those TFs are the main “switches” that select a specific line of differentiation.

    That means that even pioneer TFs are not simple switches: they are, themselves, a network, a collaborative network, and therefore a level of important multiple regulation.
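
    The quoted point about short motifs and "millions of potential binding sites" can be checked with a back-of-the-envelope estimate. The sketch below is not from the paper: it assumes a genome of about 3.1 billion base pairs with uniformly random bases, counts both strands, and treats degeneracy crudely as positions where any base is accepted. The function name and parameters are purely illustrative.

```python
# Rough estimate of how many times a short motif is expected to occur by chance.
# Assumptions (not from the quoted paper): uniform random base composition,
# a genome of ~3.1e9 bp, and both DNA strands scanned.
GENOME_SIZE_BP = 3.1e9


def expected_random_matches(motif_length, degenerate_positions=0,
                            genome_size=GENOME_SIZE_BP):
    """Expected chance matches to a motif, counting both strands.

    degenerate_positions is the number of motif positions where any base
    is accepted, a crude stand-in for motif degeneracy.
    """
    if not 0 <= degenerate_positions <= motif_length:
        raise ValueError("degenerate_positions must be between 0 and motif_length")
    specific_positions = motif_length - degenerate_positions
    # Each fully specified position matches a random base with probability 1/4.
    match_probability = 0.25 ** specific_positions
    return 2 * genome_size * match_probability


if __name__ == "__main__":
    for length in (6, 8, 10, 12):
        print(length, round(expected_random_matches(length)))
```

    Under these assumptions a fully specified 6 bp motif is expected about 1.5 million times, while a fully specified 12 bp motif is expected only a few hundred times. An 8 bp motif with two freely varying positions behaves like the 6 bp case, which is why degeneracy pushes typical TF motifs toward the "millions" range. Real genomes are not random and palindromic motifs are double-counted here, so this is only an order-of-magnitude illustration.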

  298. 298
    OLV says:

    gpuccio (292,293):
    Thanks for answering my questions.

  299. 299
    OLV says:

    gpuccio (297):

    Very interesting paper. Thanks.

  300. 300
    gpuccio says:

    To all:

    Transcription factors and nucleosomes seem to be two major actors in the drama of transcription regulation, often competing for control of precious DNA motifs.

    The following recent paper gives us a glimpse of the complex dance between these two complex components:

    The interaction landscape between transcription factors and the nucleosome.

    https://www.ncbi.nlm.nih.gov/pubmed/30250250

    Abstract:

    Nucleosomes cover most of the genome and are thought to be displaced by transcription factors in regions that direct gene expression. However, the modes of interaction between transcription factors and nucleosomal DNA remain largely unknown. Here we systematically explore interactions between the nucleosome and 220 transcription factors representing diverse structural families. Consistent with earlier observations, we find that the majority of the studied transcription factors have less access to nucleosomal DNA than to free DNA. The motifs recovered from transcription factors bound to nucleosomal and free DNA are generally similar. However, steric hindrance and scaffolding by the nucleosome result in specific positioning and orientation of the motifs. Many transcription factors preferentially bind close to the end of nucleosomal DNA, or to periodic positions on the solvent-exposed side of the DNA. In addition, several transcription factors usually bind to nucleosomal DNA in a particular orientation. Some transcription factors specifically interact with DNA located at the dyad position at which only one DNA gyre is wound, whereas other transcription factors prefer sites spanning two DNA gyres and bind specifically to each of them. Our work reveals notable differences in the binding of transcription factors to free and nucleosomal DNA, and uncovers a diverse interaction landscape between transcription factors and the nucleosome.

  301. 301
    PavelU says:

    jawa @296:

    Did you understand the scientific implication of gpuccio’s short answers to OLV’s questions?

  302. 302
    gpuccio says:

    To all:

    For fans (like me) of transposons and of enhancers:

    Systematic perturbation of retroviral LTRs reveals widespread long-range effects on human gene regulation

    https://elifesciences.org/articles/35989

    Abstract
    Recent work suggests extensive adaptation of transposable elements (TEs) for host gene regulation. However, high numbers of integrations typical of TEs, coupled with sequence divergence within families, have made systematic interrogation of the regulatory contributions of TEs challenging. Here, we employ CARGO, our recent method for CRISPR gRNA multiplexing, to facilitate targeting of LTR5HS, an ape-specific class of HERVK (HML-2) LTRs that is active during early development and present in ~700 copies throughout the human genome. We combine CARGO with CRISPR activation or interference to, respectively, induce or silence LTR5HS en masse, and demonstrate that this system robustly targets the vast majority of LTR5HS insertions. Remarkably, activation/silencing of LTR5HS is associated with reciprocal up- and down-regulation of hundreds of human genes. These effects require the presence of retroviral sequences, but occur over long genomic distances, consistent with a pervasive function of LTR5HS elements as early embryonic enhancers in apes.

    Emphasis mine.

    And here is an article that comments on the previous one:

    Gene Expression: Transposons take remote control

    https://elifesciences.org/articles/40921

    Abstract:

    A family of retroviral-like elements in the human genome has a pervasive influence on gene expression.

    Barbara McClintock provided evidence of a potent mechanism in her seminal discovery of what she presciently dubbed ‘controlling elements’ – sequences of DNA that can move across the genome. Building on this, in the late 1960s Roy Britten and Eric Davidson proposed a model in which these elements – subsequently renamed transposons – could provide the raw material for complex regulatory networks (Britten and Davidson, 1969).

    Evidence in support of the Britten–Davidson model has grown steadily over the last decade (reviewed in Chuong et al., 2017). First, numerous examples of regulatory sequences derived from individual transposons have been documented in a variety of organisms. Furthermore, genomics has made it apparent that distinct suites of regulatory proteins bind to different transposon families. This binding allows groups of transposons to be activated en masse in certain cell types and during certain developmental stages. Now, in eLife, Daniel Fuentes, Tomek Swigut and Joanna Wysocka of Stanford University report that simultaneous perturbation of a family of retroviral-like transposons called LTR5HS produces profound transcriptional changes in human embryonic-like cells (Fuentes et al., 2018). These findings provide the strongest evidence thus far in support of the Britten–Davidson model as a genome-wide paradigm.

    Together, the results of Fuentes et al. suggest that in human embryonic-like cells, a potentially large subset of LTR5HS elements work as enhancers to control the activity of remote genes. However, it remains to be seen whether any of these regulatory activities have provided adaptive benefits during primate evolution. Intriguingly, many of the LTR5HS elements with enhancer activity are human-specific and some are not even fixed in the human population (Wildschutte et al., 2016). This raises the possibility that they contributed to recent adaptations. With CARGO in hand, the answers to these and other outstanding questions shall be delivered.

    This is very interesting indeed! 🙂

  303. 303
    OLV says:

    gpuccio (302):

    “For fans (like me) of transposons and of enhancers”

    You have effectively persuaded me to join this fans’ club too. 🙂

    The two papers you cited are very interesting.

    Here are a few quotes from the first paper in 302:

    …the transcriptional effects we observe upon deletion of single LTR5HS elements are surprisingly potent, suggesting that these elements indeed function as strong and/or relatively non-redundant enhancers of their target genes.

    Considering that other classes of TEs beyond LTR5HS are likely contributing to gene regulation in the early human embryo, these observations are consistent with a pervasive, rather than occasional, role of TEs in transcriptional control.

    …diverse mechanisms that may underlie regulatory functions in the early embryo.

    LTRs of retroviruses that successfully endogenized might have been optimized to begin with for directing expression in early embryo/germ cells.

    Interestingly, LTR5HS elements (but not related LTR5A/B elements) contain a consensus motif and are bound by the pluripotent stem cell/primordial germ cell/reprogramming factor and master regulator OCT4, which may have contributed both to their endogenization and cooption for enhancer function…

    OCT4 plays a central role in activating pluripotency network enhancers…

    It is intriguing to consider whether regulatory repurposing of LTR5HS elements for enhancer function may have contributed to human-specific transcriptome divergence and endowed the early developmental stages of the human embryo with species-specific attributes.

     —

    …a recent burst of HERVK endogenization supplied humans and other apes with new early embryonic enhancers, leading to a shift in preimplantation gene expression programs.

     
    emphasis added

    I would like to read your comments on the highlighted text, at your convenience. Especially the terms ‘cooption’, ‘repurposing’, ‘species-specific attributes’ and ‘recent burst’ that are used in some of the quoted statements. Thanks.

    BTW, the second paper in the same comment makes the plot even thicker. 🙂

  304. 304
    OLV says:

    gpuccio (300):

    The TF/nucleosome abstract gives a preview of what should be a very interesting article. Too bad it’s paywalled.

  305. 305
    gpuccio says:

    OLV at #303:

    It’s very simple: such functional and complex and specific results of transposon activity in a very short evolutionary time are obviously explained only by design, IOWs guided transposon activity.

    As the true explanation cannot be accepted, our neo-darwinist friends must believe that unguided, random transposon activity can generate specific regulation networks in a few million years. Hence the various “cooptions”, “repurposings”, and similar.

    There is only one word for that: fairy tales. 🙂

  306. 306
    PeterA says:

    gpuccio,

    I totally agree with what you wrote @305. But it implies what we should humbly admit: that our neo-Darwinist friends have a much more prolific imagination than we could ever have. They have proven it beyond any doubt.
    🙂

  307. 307
    gpuccio says:

    To all:

    As said, higher order chromatin compartments are apparently somewhat stable.

    But it seems that this is not really the case.

    The following recent paper shows how even higher order levels of organization, like the A and B compartments and Topologically Associating Domains (TADs), are extremely dynamic and change a lot during specific pathways of differentiation:

    Genome-Wide Chromatin Structure Changes During Adipogenesis and Myogenesis

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6158721/

    Abstract:

    The recently developed high-throughput chromatin conformation capture (Hi-C) technology enables us to explore the spatial architecture of genomes, which is increasingly considered an important regulator of gene expression. To investigate the changes in three-dimensional (3D) chromatin structure and its mediated gene expression during adipogenesis and myogenesis, we comprehensively mapped 3D chromatin organization for four cell types (3T3-L1 pre-adipocytes, 3T3-L1-D adipocytes, C2C12 myoblasts, and C2C12-D myotubes). We demonstrate that the dynamic spatial genome architecture affected gene expression during cell differentiation. A considerable proportion (~22%) of the mouse genome underwent compartment A/B rearrangement during adipogenic and myogenic differentiation, and most (~80%) upregulated marker genes exhibited an active chromatin state with B to A switch or stable A compartment. More than half (65.4%-73.2%) of the topologically associating domains (TADs) are dynamic. The newly formed TAD and intensified local interactions in the Fabp gene cluster indicated more precise structural regulation of the expression of pro-differentiation genes during adipogenesis. About half (32.39%-59.04%) of the differential chromatin interactions (DCIs) during differentiation are promoter interactions, although these DCIs only account for a small proportion of genome-wide interactions (~9.67% in adipogenesis and ~4.24% in myogenesis). These differential promoter interactions were enriched with promoter-enhancer interactions (PEIs), which were mediated by typical adipogenic and myogenic transcription factors. Differential promoter interactions also included more differentially expressed genes than nonpromoter interactions. Our results provide a global view of dynamic chromatin interactions during adipogenesis and myogenesis and are a resource for studying long-range chromatin interactions mediating the expression of pro-differentiation genes.

  308. 308
    OLV says:

    gpuccio (305):

    Thanks for providing insightful comments on various “novel” terms encountered in biology research publications.

    gpuccio (307):

    Another very interesting recent paper. Thanks for citing it here and commenting on it too!

    The plot continues to thicken:

    Genome-Wide Chromatin Structure Changes During Adipogenesis and Myogenesis
     

    Self-organized TADs are basic units of chromatin folding, and tissue-invariant to a large extent, but can also be distinct in structure, function, and cell-to-cell variability in different organisms…

    …the spatial chromatin structure requires more elaborate TADs to achieve more precise gene regulation and avoid disordered transcription during cell differentiation.

    Chromatin loops formed by promoter interactions are the finest and most direct spatial structures that regulate gene expression.

    …the promoter interactions can partially induce cell differentiation by the activation of differentiation-related genes.

    Further study of promoter interactions (especially promoter-enhancer interactions) at higher resolution combined with epigenetic modification at various stages of differentiation can provide more accurate information about the mechanisms of adipogenesis and myogenesis.

    The dynamic spatial reorganization is consistent with gene expression modulation.

    Putative promoter interactions will provide evidence for further study of promoter-anchored chromatin loops to obtain a comprehensive understanding of the molecular regulatory mechanisms of adipogenesis and myogenesis.

    Emphasis added.

     No wonder that the anti-ID folks are so conspicuously absent from this discussion.

  309. 309
    OLV says:

    Dynamic regulation of transcription factors by nucleosome remodeling

    SWI/SNF can slide nucleosomes to displace neighboring TFs around the promoter region, providing a mechanistic basis for dynamically clearing both nucleosomes and other bound factors upon SWI/SNF recruitment

    Cis-regulatory determinants of MyoD function

    …numbers and the sequences of E-box motifs within MyoD-bound CREs have a significant functional effect on the enhancer chromatin.

    …three variables, namely motif sequences, their numbers and their spatial location within the CREs act as major determinants of the myogenic cis-regulatory code.

    …many transcription factors bind E-boxes specifically. For any such transcription factors, the question arises as to how motif sequence, their number and location within CREs result in binding specificity and affinity.

    Helicase promotes replication re-initiation from an RNA transcript

    To ensure accurate DNA replication, a replisome must effectively overcome numerous obstacles on its DNA substrate. After encountering an obstacle, a progressing replisome often aborts DNA synthesis but continues to unwind. However, little is known about how DNA synthesis is resumed downstream of an obstacle.

    Although the long-established role of replicative helicases is to catalyze strand separation, emerging evidence now supports the notion that their functions in replication are much broader…

    …T7 helicase participates in the assembly of the replication machinery at the fork and helps resolve replication conflicts with roadblocks on the DNA. Thus helicase has broad functionalities and unexpected roles in assuring processive DNA replication.

    Emphasis added.

    replisome?   huh?

    how many of these “*somes” are there in biology?

  310. 310
    OLV says:

    Nascent chromatin occupancy profiling reveals locus and factor specific chromatin maturation dynamics behind the DNA replication fork

    Proper regulation and maintenance of the epigenome is necessary to preserve genome function. However, in every cell division, the epigenetic state is disassembled and then re-assembled in the wake of the DNA replication fork. Chromatin restoration on nascent DNA is a complex and regulated process that includes nucleosome assembly and remodeling, deposition of histone variants, and the re-establishment of transcription factor binding.

     

    During development in higher eukaryotes, the DNA replication program is characterized by changes in the number of firing origins, the length of S-phase, and the timing of replication…

     

    The developmental plasticity in the DNA replication program may lead to promiscuous binding of regulatory factors through the chromatin changes that occur throughout this process, and thus contribute to epigenetic regulation and cell-type specific gene expression programs.

     
    Could it be that this promiscuity allows the powerful RV+NS to create novel complex functional specified information through the years?
    Any reasonable objection to this possibility?

  311. 311
    OLV says:

    The Chd1 chromatin remodeler can sense both entry and exit sides of the nucleosome

    Chromatin remodelers are essential for establishing and maintaining the placement of nucleosomes along genomic DNA. Yet how chromatin remodelers recognize and respond to distinct chromatin environments surrounding nucleosomes is poorly understood.

    …it is becoming increasingly clear that Chd1 and ISWI remodelers play specialized roles in reorganization of chromatin in vivo: yeast Isw1 and Chd1 generate nucleosome arrays with longer or shorter repeats, respectively (50), whereas Isw2 positions ‘founder nucleosomes’ against which these arrays pack (52). We look forward to future work that reveals how basic characteristics like the ones described here have been adapted for specialized cellular roles.

     
    The ATPase motor of the Chd1 chromatin remodeler stimulates DNA unwrapping from the nucleosome

    Chromatin remodelers are ATP-dependent motors that reorganize DNA packaging by disrupting canonical histone–DNA contacts within the nucleosome.

    …the Chd1 chromatin remodeler stimulates DNA unwrapping from the edge of the nucleosome in a nucleotide-dependent and DNA sequence-sensitive fashion.

    Determining whether DNA unwrapping by Chd1 contributes to transcription-related processes is an exciting new direction for future studies.

     
    Structure of the chromatin remodelling enzyme Chd1 bound to a ubiquitinylated nucleosome

    ATP-dependent chromatin remodelling proteins represent a diverse family of proteins that share ATPase domains that are adapted to regulate protein-DNA interactions.

    The path of DNA strands through the ATPase domains indicates the presence of contacts conserved with single strand translocases and additional contacts with both strands that are unique to Snf2 related proteins. The structure provides connectivity between rearrangement of ATPase lobes to a closed, nucleotide bound state and the sensing of linker DNA. Two turns of linker DNA are prised off the surface of the histone octamer as a result of Chd1 binding, and both the histone H3 tail and ubiquitin conjugated to lysine 120 are re-orientated towards the unravelled DNA. This indicates how changes to nucleosome structure can alter the way in which histone epitopes are presented.

    The Sequence of Nucleosomal DNA Modulates Sliding by the Chd1 Chromatin Remodeler

    Chromatin remodelers are ATP-dependent enzymes that are critical for reorganizing and repositioning nucleosomes in concert with many basic cellular processes. For the chromodomain helicase DNA-binding protein 1 (Chd1) remodeler, nucleosome sliding has been shown to depend on the DNA flanking the nucleosome, transcription factor binding at the nucleosome edge, and the presence of the histone H2A/H2B dimer on the entry side.

     

    …the sequence sensitivity of histones and remodelers occur at unique segments of DNA on the nucleosome, allowing them to work together or in opposition to determine nucleosome positions throughout the genome.

     
    The Latest Twists in Chromatin Remodeling

    In its most restrictive interpretation, the notion of chromatin remodeling refers to the action of chromatin-remodeling enzymes on nucleosomes with the aim of displacing and removing them from the chromatin fiber (the effective polymer formed by a DNA molecule and proteins). This local modification of the fiber structure can have consequences for the initiation and repression of the transcription process, and when the remodeling process spreads along the fiber, it also results in long-range effects essential for fiber condensation. There are three regulatory levels of relevance that can be distinguished for this process: the intrinsic sequence preference of the histone octamer, which rules the positioning of the nucleosome along the DNA, notably in relation to the genetic information coded in DNA; the recognition or selection of nucleosomal substrates by remodeling complexes; and, finally, the motor action on the nucleosome exerted by the chromatin remodeler. Recent work has been able to provide crucial insights at each of these three levels that add new twists to this exciting and unfinished story, which we highlight in this perspective.

     

    Emphasis added.

  312. 312
    jawa says:

    Perhaps gpuccio has clearly answered this question before, but it’s still not quite clear to me:

    We see a number of proteins involved in the transcription regulation. However, aren’t they synthesized by the same machinery they form part of? Perhaps they aren’t. Maybe they come from a simpler process that eventually evolved into this more sophisticated stuff we see now? What prompted such a change? How did that happen? Can somebody explain this? Am I missing something in the picture? Thanks.

  313. 313
    PeterA says:

    jawa,

    aren’t those questions a little off-topic here?

  314. 314
    ET says:

    jawa- It is all a catch-22 for the anti-IDists. It is an unbreakable loop. But our opponents still feel confident they can find a way to break the loop and find the origin of the cycle.

  315. 315
    PavelU says:

    ET,

    there is abundant literature explaining jawa’s questions.

    You should look for it yourself. Nobody has time to do it for you.

    Basically the RNA world, which has been proven beyond any doubt, provides the main answers to your question.

    You may want to start learning from this detailed explanation. It may get too technical for your level at some point, though.

  316. 316
    ET says:

    Hi PavelU- The RNA world is imaginary and most likely >99% chance of being pure BS.

    There isn’t any evidence for the RNA world, just a dire need.

  317. 317
    jawa says:

    PavelU, are you serious?

    Didn’t you present the same boring argument here?

    Did you understand what Dr. Eugene S commented here about your irrelevant contribution?

  318. 318
    PavelU says:

    ET,

    did you watch the video? Did you understand the detailed explanation? Isn’t that sufficient to explain it all? Do you need more? What else?

  319. 319
    PavelU says:

    jawa,

    yes, I presented that argument there. It may be boring to you because you don’t understand it. You should start from Biology 101 before you engage in a discussion here.

    I did not know Eugene S is a Doctor. But anyway his comment was not accurate, because he claimed that the argument I presented is old, but that’s incorrect, because the given video was published very recently:

    David Baker
    Published on Jun 1, 2018
    The nature of complexity and what makes life different…

  320. 320
    ET says:

    Yes, I watched as much of that crap as I could stand. It was lacking science and evidence. The guy thinks that RNA self-replicates- it doesn’t.

    Two words refute that video and the RNA world- Spiegelman’s Monster

  321. 321
    PeterA says:

    PavelU,

    Are you an unconscious robot?

    How would you react to this simple statement:

    Wake up and smell the coffee!

    🙂

  322. 322
    es58 says:

    PavelU at 318: “did you watch the video?”
    Do you really consider hand waving evidence? or are you just mocking the video? It’s hard to tell.

  323. 323
    es58 says:

    OT, to someone who can post (e.g. News): please add a post about the Nobel Prize in Chemistry, awarded for work on evolution, e.g.:

    https://www.nytimes.com/2018/10/03/science/chemistry-nobel-prize.html

    Thanks

  324. 324
    PavelU says:

    PeterA,

    You’re not funny. Don’t quit your day job yet. SNL won’t hire you.

    I prefer the version about smelling flowers, rather than burnt coffee beans, which add more pollution to the environment, thus increasing the greenhouse effect that is causing the man-made global warming that is raising the sea level and will soon flood all the coastal populations.

  325. 325
    R J Sawyer says:

    es58@323. I agree. And being Canadian, and female, News should also post something about the Nobel prize for Physics, shared by a female researcher from Waterloo. She is only the third woman to win the Nobel for physics. Certainly something to celebrate.

  326. 326
    PavelU says:

    es58,

    By “hand waving” do you mean what gpuccio does when he hides his convoluted arguments behind a mysterious concept of a “conscious” agent as the only possible source of what he calls “complex functionality” or something like that?

    Perhaps he still relies on David Chalmers’s outdated concept of “the hard problem of consciousness” which lately has been shown to be not so hard after all?

  327. 327
    jawa says:

    PavelU,

    I think you went too far this time. You wrote so much nonsense in one single comment that it’s a new record.

    gpuccio’s arguments are far from being convoluted. Many students would dream of having a professor who explains difficult concepts with such clarity as gpuccio does here.

    Definitely you should wake up and smell the flowers in the garden, if you prefer that to smelling freshly brewed coffee in the morning.

  328. 328
    PaoloV says:

    PavelU @326:

     Perhaps he still relies on David Chalmers’s outdated concept of “the hard problem of consciousness” which lately has been shown to be not so hard after all?

    Please, can you post links to support your claims? Thanks.

  329. 329
    PeterA says:

    es58,

    Thanks for posting that information about the Nobel Prize.

    Use of Evolution to Design Molecules Nets Nobel Prize in Chemistry for 3 Scientists

    for the directed evolution of enzymes

    “I always wanted to be a protein engineer,”

    “Proteins are marvelous molecular machines, tremendously complex… I wanted to be an engineer of the biological world.”

     
    Emphasis added

  330. 330
    OLV says:

    Wow! What has happened here?
    The discussion thread has gone completely off topic!
    Let’s take it back to serious stuff.

  331. 331
    OLV says:

    Major Determinants of Nucleosome Positioning

    The compact structure of the nucleosome limits DNA accessibility and inhibits the binding of most sequence-specific proteins.

    Nucleosomes are not randomly located on the DNA but positioned with respect to the DNA sequence, suggesting models in which critical binding sites are either exposed in the linker, resulting in activation, or buried inside a nucleosome, resulting in repression.

    The mechanisms determining nucleosome positioning are therefore of paramount importance for understanding gene regulation and other events that occur in chromatin, such as transcription, replication, and repair.

    Here, we review our current understanding of the major determinants of nucleosome positioning: DNA sequence, nonhistone DNA-binding proteins, chromatin-remodeling enzymes, and transcription.

    We outline the major challenges for the future: elucidating the precise mechanisms of chromatin opening and promoter activation, identifying the complexes that occupy promoters, and understanding the multiscale problem of chromatin fiber organization.

     

    Interdomain Communication of the Chd1 Chromatin Remodeler across the DNA Gyres of the Nucleosome

    Chromatin remodelers use a helicase-like ATPase motor to reposition and reorganize nucleosomes along genomic DNA. Yet, how the ATPase motor communicates with other remodeler domains in the context of the nucleosome has so far been elusive.

    Nucleosomes, the fundamental packaging unit of eukaryotic genomes, inherently restrict access to DNA…

    Nucleosome reorganization, which is tightly coupled to transcriptional regulation, is driven by a diverse class of ATP-utilizing enzymes called chromatin remodelers…

    Chromatin remodelers are complex multidomain machines that specifically reorganize nucleosome substrates, and a key question is how the core ATPase motor integrates information from auxiliary domains.

    …the chromo-ATPase and DBD from the same Chd1 molecule can bind to either the same or opposite DNA gyres, and thus different arrangements of a monomeric enzyme on the nucleosome likely explain the nucleosome sliding characteristics of Chd1.

    …the chromo-ATPase was shown to directly communicate with a DBD added in trans, indicating that interdomain interactions can occur between molecules that are not covalently linked …

     

  332. 332
    OLV says:

    Structural rearrangements of the histone octamer translocate DNA

    Nucleosomes, the basic unit of chromatin, package and regulate expression of eukaryotic genomes. Nucleosomes are highly dynamic and are remodeled with the help of ATP-dependent remodeling factors. Yet, the mechanism of DNA translocation around the histone octamer is poorly understood.

    …the nucleosome has considerable structural plasticity at its disposal and can adopt multiple conformations.

    Probably, other chromatin-modifying machineries might also exploit the intrinsic plasticity of the nucleosome for their functions.

  333. 333
    OLV says:

    A twist defect mechanism for ATP-dependent translocation of nucleosomal DNA

    As superfamily 2 (SF2)-type translocases, chromatin remodelers are expected to use an inchworm-type mechanism to walk along DNA. Yet how they move DNA around the histone core has not been clear.

    …such formation and elimination of twist defects underlie the mechanism of nucleosome sliding by CHD-, ISWI-, and SWI/SNF-type remodelers.

    DNA is shaped like a spiral staircase, twisting around itself to create a double helix. This results in a long string-like molecule that needs to be carefully packaged to fit inside the cells of organisms as diverse as fungi or humans. This packaging process starts when a portion of DNA tightly wraps around a spool-like core of proteins called histones. The resulting structure is known as a nucleosome. Like the beads on a necklace, nucleosomes exist at regular intervals along DNA.

    …the core mechanism for translocating nucleosomal DNA past the histone core arises from the ATPase motor enforcing unique geometries of DNA at its binding site…

    DNA translocation is initiated by the open form of the remodeler ATPase motor binding at SHL2, which draws ~1 bp into a bulge and results in DNA on the entry gyre of the nucleosome being shifted toward the remodeler by ~1 bp. Subsequent closure of the ATPase cleft upon ATP binding and hydrolysis forces a redistribution of the bulged DNA, with collapse of the bulge pushing ~1 bp onto the DNA segment toward the dyad, on the other side of the motor. In this manner, the segments of DNA on either side of SHL2 alternately shift toward and away from the motor, with transitions between open and closed forms of the ATPase effectively transferring ~1 bp from one DNA segment to the other in the process.
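
    The cycle quoted above can be followed as a small bookkeeping exercise. What follows is only a toy sketch, not anything taken from the paper: the class name, the field names and the strict one-bp-per-cycle rule are assumptions made for illustration, and all structural and kinetic detail is ignored. It simply counts base pairs as the motor alternates between its open and closed forms at SHL2.

```python
from dataclasses import dataclass


@dataclass
class NucleosomeToy:
    """Hypothetical bookkeeping model of the quoted twist-defect cycle."""
    entry_shift_bp: int = 0  # bp drawn from the entry-side DNA toward the motor
    dyad_shift_bp: int = 0   # bp pushed from the motor toward the dyad
    bulge_bp: int = 0        # bp currently held as a bulge (twist defect) at SHL2

    def bind_open(self) -> None:
        # Open form of the ATPase binds at SHL2 and draws ~1 bp into a bulge.
        if self.bulge_bp == 0:
            self.entry_shift_bp += 1
            self.bulge_bp = 1

    def close_on_atp(self) -> None:
        # Closure of the ATPase cleft collapses the bulge toward the dyad.
        if self.bulge_bp == 1:
            self.bulge_bp = 0
            self.dyad_shift_bp += 1


def run_cycles(n: int) -> NucleosomeToy:
    """Run n open/closed cycles and return the final state."""
    toy = NucleosomeToy()
    for _ in range(n):
        toy.bind_open()
        toy.close_on_atp()
    return toy


if __name__ == "__main__":
    print(run_cycles(3))
```

    In this sketch each full cycle transfers one bp from the entry-side segment to the dyad-side segment, and the bulge exists only between the two half-steps, which is the alternation the quoted passage describes.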

  334. 334
    OLV says:

    Cryo-EM of nucleosome core particle interactions in trans

    Nucleosomes, the basic unit of chromatin, are repetitively spaced along DNA and regulate genome expression and maintenance. The long linear chromatin molecule is extensively condensed to fit DNA inside the nucleus. How distant nucleosomes interact to build tertiary chromatin structure remains elusive.

    …chromatin tertiary structure is dynamic and allows access of various chromatin modifying machineries to nucleosomes.

    Although it was suggested that chromatin folds into the 30 nm fiber, recent data indicate that defined 30 nm structure might not exist in vivo.
    …nucleosomes form a less defined interdigitated polymer like structure in which nucleosomes have many different arrangements and conformations…

    How two distant nucleosomes interact in this polymer-like structure is not well understood.

  335. 335
    DATCG says:

    Gee… I have some catching up to do! Great stuff Gpuccio, once again! 🙂
