Developmentally regulated genes break the rules

A new study published online this week in Nature Genetics reports that a certain class of genes, those whose expression is restricted to a specific developmental time point, follows a different set of rules from the rest of the genome.

The modifications to histones in promoter and enhancer regions are generally predictive of gene expression. For example, when a promoter is highly methylated at lysine 4 on histone H3 (H3K4me3), its associated gene is generally highly transcribed. Other marks may also be associated with activation, while different marks are associated with gene repression.

Developmentally regulated genes show similar H3K4me3 levels to silent genes, even though they are highly expressed during development.{credit}Pérez-Lluch et al. Nat. Genet. doi: 10.1038/ng.3381{/credit}

Sílvia Pérez-Lluch et al. examined the expression levels and histone modifications of all genes in the Drosophila modENCODE data set and identified a surprising pattern. Genes whose expression was restricted to a specific developmental time point (called “developmentally regulated genes”) lacked epigenetic marks of active transcription, even when they were highly expressed. The authors confirmed the same pattern using modENCODE data for the nematode C. elegans.

During their actively transcribed period, developmentally regulated genes showed expression levels similar to those of genes that are expressed stably throughout development. The authors also found that strong histone marking is associated with transcriptional stability. Expression and chromatin modification data comparable to those for the fly and worm are not yet available for mammals across multiple developmental time points. However, using data from ENCODE, the authors were able to show that mammalian cells follow a similar trend with regard to transcriptional stability.

We asked the lead authors of the study, Sílvia Pérez-Lluch, Montserrat Corominas and Roderic Guigó, to give us a little insight into the history of this study and where they see this research going in the future:

When you began this study, what were your expectations? Did you expect to find that active chromatin marks were missing from so many actively transcribed genes?

We did not. Actually, our initial aim was not to investigate the relationship between chromatin marking and transcription, but the role of histone modifications in the regulation of splicing. We designed our initial experiments to compare levels of histone modifications in exons that were differentially included between eye-antenna and wing imaginal discs (EID and WID). Our hypothesis at that time was that the levels of some histone modifications would correlate with differential exon inclusion between these two tissues. But the results were quite frustrating, since we found, in general, very low levels of marking in exons that were differentially included between WID and EID. This was initially very disappointing to us. However, we also found, more generally, that many genes that were differentially expressed between WID and EID also had very low levels of a number of histone modifications typically associated with active transcription, even genes with very high expression levels. Since many such genes are likely to be regulated during development, this led us to hypothesize that lack of active histone modifications could be a general feature of developmentally regulated genes. This seemed an implausible hypothesis, going against the current models of the relationship between chromatin marking and transcription. Nevertheless, we turned to modENCODE data to test it further. The results were so strikingly consistent with our model that we “forgot” about our initial aim and focused our efforts instead on gathering additional supporting evidence. Understandably, our results were initially met with skepticism. The concern was that the lack of chromatin marking could be a technical artifact: developmentally regulated genes have restricted expression patterns, which could make histone modifications difficult to detect using current technologies. Thus, a substantial amount of our work has been directed at addressing this concern.

Why do you think this pattern had not been observed before?

We are actually not the first to observe transcription with an apparent lack of histone modifications. There have been a few reports of genes being transcribed in the absence of some histone modifications. Our main contribution is to show that this phenomenon is more widespread than generally assumed, and that it specifically characterizes genes that are regulated during development (at least in fly and worm). Why has this not been observed before? Mostly because data containing estimates of gene expression and histone modification across a sufficiently large number of developmental time points were not available before the modENCODE project. In addition, we used a very simple but effective measure to identify genes regulated during development: the coefficient of variation of gene expression. In summary, to make this observation you need both the data and the right approach to look at it.
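
For readers unfamiliar with the measure mentioned in this answer, the coefficient of variation is the standard deviation of a gene's expression across time points divided by its mean. The short Python sketch below illustrates the idea only. The gene names and expression values are made up, and the use of the population standard deviation and the handling of never-expressed genes are assumptions of this sketch, not details taken from the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(expression):
    """Return standard deviation / mean for one gene's expression profile."""
    mu = mean(expression)
    if mu == 0:
        # Assumption of this sketch: a gene that is never expressed
        # is treated as non-variable rather than raising an error.
        return 0.0
    return pstdev(expression) / mu


# Hypothetical expression values (arbitrary units) at five time points.
profiles = {
    "stable_gene": [10.0, 10.0, 10.0, 10.0, 10.0],
    "stage_specific_gene": [0.0, 0.0, 50.0, 0.0, 0.0],
}

# One coefficient of variation per gene.
cv = {gene: coefficient_of_variation(values) for gene, values in profiles.items()}

# Genes ranked from most to least variable across time points.
ranked = sorted(cv, key=cv.get, reverse=True)
print(ranked)
```

Both toy genes have the same mean expression, but only the one expressed at a single time point has a high coefficient of variation, which is the kind of gene the measure is meant to surface.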

Your study showed that the link with transcriptional stability is also present in mammalian cells. If the association between chromatin marks and developmental regulation also holds in mammals, what, if any, do you think are the implications for biomedical research?

This is difficult to answer. Our initial results suggest that the model could also be applicable to mammals, but the data to test it are not yet available. Here we need to emphasize the importance of well-designed large-scale data production projects that monitor genome activity (transcription, chromatin structure, three-dimensional genome organization, transcription factor binding, etc.) in a systematic and consistent way. We also want to emphasize that, at this point, our research is very basic. However, one could speculate that if our model holds in mammals, it could contribute to designing better-informed approaches to manipulate or modulate the expression levels of genes. Extrapolated to mammals, our results suggest that transcription factors play a comparatively more important role than histone modifications in the regulation of tissue-specific genes. It has been shown that, in humans, tissue-specific genes are more likely to be involved in diseases.

Are you able to speculate as to why developmentally regulated genes use a different epigenetic program compared to other genes?

What we call developmentally regulated genes are genes with variable expression over time, which are often expressed only at a particular time point. Since development is a continuous process, one could speculate that rapid activation and deactivation of genes that are specific to a particular time point is more likely to occur without the need to modify histone residues in chromatin.

What do you see as the most important next steps in this area?

Maybe the most important issue is to further challenge the model by investigating additional systems (in particular, mammalian systems), including differentiation processes, and additional histone modifications. The ultimate test of the model would come, however, from single-cell analysis, that is, from monitoring whether gene transcription does occur without histone modifications within the same cell. This is currently not possible given available technologies, but it may be feasible in the near future. It would also be important to investigate the role of distal enhancers, and of 3D chromatin structure, in the expression of developmentally regulated genes. Furthermore, we need to dig into the mechanism by analyzing, for instance, how different classes of genes respond to perturbations of histone modification systems.

Epigenetic convergence in intellectual disability and cancer?

One of the many remarkable findings of the cancer genome sequencing projects that have been published in this and other journals is the repeated discovery of somatic driver mutations in genes that encode chromatin remodeling factors, which regulate the epigenome. De novo mutations in this same family of genes also cause several developmental syndromes, whose various features all include intellectual disability. Surprisingly, a few de novo mutations in these genes have recently been reported in autism. How do these mutations (which, at least in some cases, appear to be loss-of-function in both cancer and the developmental syndromes) in the same class of genes, and in some cases the very same genes, cause these different diseases?

Much effort and attention are being devoted to developing drugs that target the epigenome and the proteins that regulate it. The $95M partnership between Constellation Pharmaceuticals and Genentech and the recent symposium at MIT’s Koch Institute, “Epigenetics, plasticity and cancer,” are just two of many examples. But most intriguingly, can understanding the molecular mechanisms and risk factors of one of these diseases inform the biology and treatment of the others?

The epigenetic state of a genome has a vast influence over gene regulation, cellular activity and cell fate. Protein families that “write,” “erase,” and “read” the major histone marks (i.e. acetyl and methyl groups) ultimately regulate accessibility of the chromatin to transcriptional machinery. (A useful review on epigenetic protein families and recent progress in pharmacologically modulating these proteins was published recently in Nature Reviews Drug Discovery.)

New driver mutations in cancer

In the past few years, cancer sequencing efforts have identified driver mutations in genes that regulate histone and DNA modification in various types of cancer. In 2012, this journal published many cancer sequencing papers, four of which reported driver mutations in chromatin remodelers. ARID1A mutations had already been found in other cancers (e.g. ovarian carcinoma, in Science and the New England Journal of Medicine). In April, Patrick Tan and colleagues reported here that ARID1A is mutated in 8% of gastric adenocarcinomas, and that mutations in chromatin remodeling genes (ARID1A, MLL3 and MLL) were found in 47% of the gastric cancers they screened.

In May, Bin Tean Teh and colleagues reported inactivating mutations in MLL3 in 14.8% of cases of liver fluke-associated cholangiocarcinoma, a fatal cancer of the liver bile ducts that is common in parts of Southeast Asia where the liver fluke O. viverrini is endemic. MLL3 encodes a histone-lysine N-methyltransferase and was already known to be mutated in several other cancer types. Teh and colleagues found that 75% of MLL3 mutations were likely loss-of-function, with mutation patterns reminiscent of tumor suppressor genes.

In June, Jessica Zucman-Rossi and colleagues reported that inactivating mutations in ARID1A and ARID2 were found in 16.8% and 5.6%, respectively, of 125 cases of hepatocellular carcinoma (HCC). ARID1A and ARID2 are SWI/SNF-related chromatin remodelers that control the accessibility of promoter regions of DNA to transcriptional machinery. Overall, 24% of the HCCs that they screened had at least one mutation in a chromatin remodeling gene (2 cases with SMARCA4 mutations, and single cases of many genes that encode chromatin remodelers, including SMARCA2, SMARCB1, SMARCA1, ARID4A, PBRM1, CHD3 and CHD4). Interestingly, ARID1A mutations were found more often in tumors related to alcohol intake than in tumors with other etiologies (e.g. hepatitis B or C virus).

In July, Hidewaki Nakagawa and colleagues published another hepatocellular carcinoma sequencing study, which analyzed HCC tumors associated with hepatitis B or C virus infections. They initially sequenced 27 tumors and identified two frameshift mutations and one missense mutation in ARID1A. Analysis of 120 more tumors identified 12 more mutations in ARID1A. Altogether, the authors found that 50% (14/27) of the tumors had recurrent mutations in genes that encode chromatin regulators. They also knocked down these chromatin regulators in a panel of 5 HCC cell lines, and found that knockdown of MLL3 led to increased cell proliferation. Knockdown of 11 genes, many of which are chromatin regulators, increased cell proliferation in at least one cell line. These results support the hypothesis that loss-of-function mutations in chromatin regulators promote cell growth in hepatocellular carcinoma.

De novo mutations in intellectual disability and autism

De novo mutations in this family of genes were recently reported in Nature in autism, a pervasive developmental disorder that sometimes includes intellectual disability and typically presents with cognitive and social dysfunction (see the last paragraph of the previous post on this blog, ‘Autism exomes arrive’). In addition to the two cases of CHD8 mutations, single cases of de novo nonsense, missense and frameshift mutations were reported in those autism sequencing papers in ARID1B, CHD3, CHD7, MLL3, SETBP1 and SETD2.

As noted in that post, de novo mutations in this class of genes have also been found in recent years to cause various developmental syndromes, including mutations in MLL2 in Kabuki syndrome. This discovery was published in the landmark paper that was the first application of exome sequencing to define the cause of an autosomal dominant disorder. This year, this journal published five other papers that report de novo mutations in genes that encode chromatin remodelers in developmental syndromes that vary in presentation but all include intellectual disability.

In April, Naomichi Matsumoto and colleagues reported that 20 of 23 individuals with Coffin-Siris syndrome (CSS) carried missense or truncating mutations in one of six genes that encode SWI/SNF subunits: SMARCB1, SMARCA4, SMARCA2, SMARCE1, ARID1A and ARID1B. CSS is a rare congenital syndrome (MIM 135900) that includes growth deficiency, intellectual disability, severe speech impairment, microcephaly, and coarse facial features. In all cases where parental samples were available, mutations occurred de novo. Notably, only one of the 23 cases presented with a hepatoblastoma. All of the mutations in ARID1A and ARID1B were truncating, and the authors suggest that haploinsufficiency causes CSS. At the same time, Gijs Santen and colleagues also reported truncating de novo mutations in ARID1B in CSS.

In the same issue, Joris Vermeesch and colleagues reported heterozygous mutations in SMARCA2 in Nicolaides-Baraitser syndrome (NBS, MIM 601358). The features of NBS include sparse hair, distinctive facial features, microcephaly, epilepsy and intellectual disability with marked language impairment. Altogether, the authors identified missense mutations in SMARCA2 in 36 of 44 individuals analyzed. In 15 of these patients, parental DNA was available and the mutations in each case were verified to occur de novo. None of the mutations were truncating, and the authors suggest that these mutations act in a dominant-negative or gain-of-function manner.

In June, Marcella Zollino and Bert de Vries and their respective colleagues independently reported mutations in the chromatin regulator KANSL1 in 17q21.31 microdeletion syndrome. KANSL1 is a subunit of a histone acetyltransferase (HAT) protein complex, and is required for its HAT activity. The mutations identified in both papers occur de novo and include nonsense and frameshift mutations. Interestingly, a common feature of this microdeletion syndrome is a “happy, friendly disposition.”

Biological meaning?

What is the biological meaning of these findings? Many of the somatic mutations in these chromatin remodeler proteins in cancer appear to be loss-of-function, but further in vivo and in vitro experiments will be needed to determine whether these proteins act as tumor suppressors or oncogenes. It is also possible that different proteins can have either effect, depending on the type of cancer. Mouse models of the developmental disorders noted here should bring further insights into how this class of proteins leads to these particular diseases.

Why does a loss-of-function somatic mutation in ARID1A cause so many types of cancer, and how can loss-of-function in this same gene also cause Coffin-Siris syndrome? How does loss of this class of proteins cause cancer in one context and intellectual disability, language impairment or autism in another? Is the sole case of a de novo frameshift indel mutation in ARID1B in autism relevant? If so, how does loss-of-function of ARID1B cause autism? It is clear that much is still to be learned about how the epigenome regulates gene activity and how its misregulation at particular times and in particular places can cause radically different severe diseases. Nevertheless, the most intriguing questions are these: 1) Where, and how, do these chromatin remodelers functionally overlap in these disparate diseases? And 2) Would a treatment for one disease be effective for another?