Modifications to histones, including methylation and acetylation, are used by cells to regulate gene expression. Though a lot is now known about how different histone marks correlate with transcriptional activation or repression, the “histone code” has not yet been fully elucidated. As we discussed last week, a recent study found that, contrary to expectation, genes that are dynamically regulated during development do not display histone modifications normally associated with active transcription.
A new study published this week in Nature Genetics reports another unexpected epigenetic pattern. Tri-methylation of histone H3 at lysine 4 (H3K4me3), a mark associated with active transcription, is usually present as a sharp, narrow peak at the gene promoter. The authors of the study observed that some genes show a different pattern of H3K4me3: broad, low-density methylation spanning up to 10 kb along the gene body.
The broad H3K4me3 mark was associated with high gene expression levels and transcriptional stability in this study. The authors also found that cell identity genes and, interestingly, tumor suppressor genes, were enriched for the broad H3K4me3 mark.
Though it is unclear why tumor suppressors specifically would be associated with this mark, a comparison between normal and tumor cells showed that H3K4me3 peaks at tumor suppressor genes became narrower in cancer cells and that this was associated with transcriptional repression. Finally, the authors showed that candidate tumor suppressor genes could be identified by the broad H3K4me3 mark.
We asked one of the study’s lead authors, Wei Li, to tell us a little more about the study:
What was the motivation for your studies?
The general motivation was to make novel discoveries based on existing ‘big data’ in epigenomics. In order to do so, we have had to develop novel bioinformatic tools that enable us to look at the data from a completely different angle. In this study in particular, we developed a new tool to quantify the H3K4me3 signal based on its width only. Most previous studies have focused only on its height or total signal, because the majority of genes (>95%) only have narrow (<1 kb) and high H3K4me3 peaks. This simple method has never been used in epigenomic data analysis before. We further proved that this computer-derived broad H3K4me3 signal alone is sufficient to define both known and novel tumor suppressors, and its performance is even better than the human-curated KEGG pathway in cancer (a collection of well-curated signaling networks involved in cancer development).
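To make the idea of ranking genes by peak width rather than peak height concrete, here is a minimal Python sketch. It is not the authors' tool: the `Peak` record, the function names, and the 4 kb cutoff are all assumptions chosen for illustration, and the input is assumed to be a list of already-called H3K4me3 peaks, each assigned to a gene.

```python
from typing import Dict, Iterable, List, NamedTuple


class Peak(NamedTuple):
    """A called H3K4me3 peak assigned to a gene (hypothetical record layout)."""
    gene: str
    start: int    # genomic start coordinate, in bp
    end: int      # genomic end coordinate, in bp
    height: float # peak signal; deliberately ignored by the width-based score


def peak_width(peak: Peak) -> int:
    """Width of a peak in base pairs."""
    return peak.end - peak.start


def widest_peak_per_gene(peaks: Iterable[Peak]) -> Dict[str, int]:
    """Score each gene by the width of its widest peak, ignoring height."""
    widths: Dict[str, int] = {}
    for peak in peaks:
        widths[peak.gene] = max(widths.get(peak.gene, 0), peak_width(peak))
    return widths


def broad_genes(peaks: Iterable[Peak], min_width: int = 4000) -> List[str]:
    """Genes whose widest peak is at least min_width bp.

    The 4000 bp default is an illustrative assumption, not a value
    taken from the study.
    """
    scores = widest_peak_per_gene(peaks)
    return sorted(gene for gene, width in scores.items() if width >= min_width)
```

With made-up input such as `[Peak("A", 100, 900, 50.0), Peak("B", 1000, 9000, 5.0)]`, gene A has the taller peak but only gene B would be reported as broad, which is the distinction a height-based score would miss.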
When you first observed broad H3K4me3 peaks, did you expect that it would be such a widespread feature of tumor suppressor genes?
No, it was totally unexpected. Many people in the field (including ourselves) observed broad H3K4me3 peaks a long time ago (even in the first histone mark ChIP-seq paper published in 2007), but all ignored them and treated them as potential sequencing artifacts. My lab used the UCSC Genome Browser to check epigenetic patterns gene by gene on a daily basis, and we gradually noticed that broad H3K4me3 peaks are consistently observed in different datasets and are specific to a small group of genes. To test whether it is an artifact or not, we decided to perform a functional enrichment analysis of genes marked with broad H3K4me3. If nothing were enriched, it would have to be a sequencing artifact. Interestingly, we found an unexpectedly strong enrichment in tumor suppressor genes.
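A functional enrichment analysis of this kind is commonly framed as a one-sided hypergeometric test: given a background of genes, how surprising is the overlap between the marked genes and an annotated gene class? The sketch below shows that generic test in Python. The interview does not say which statistic the authors used, so the choice of test and the function name are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(total: int, annotated: int,
                           selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= overlap).

    total     -- number of genes in the background
    annotated -- background genes belonging to the class of interest
    selected  -- genes in the marked set (e.g. genes with a broad peak)
    overlap   -- marked genes that also belong to the class
    """
    upper = min(annotated, selected)
    # Count the ways to draw `selected` genes containing exactly k annotated ones,
    # summed over every k at least as large as the observed overlap.
    favourable = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total, selected)
```

A small p-value means the overlap is larger than chance would explain, which argues against the marked set being a random artifact. For example, with a toy background of 10 genes, 5 of them annotated, and 5 selected, an overlap of all 5 gives a probability of 1/252.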
Did you consider whether any other classes of genes were enriched in this histone mark?
We used an unbiased, data-driven approach (rather than a hypothesis-driven one) to study the genes marked with broad H3K4me3 peaks. It turns out that only cell identity genes and tumor suppressors are enriched. When we removed cell-type-specific broad H3K4me3 peaks by epigenomic conservation analysis, tumor suppressors were the only class of genes enriched in the conserved broad H3K4me3 peaks.
Tumor suppressors are defined by their role in cancer. Why do you think they show a similar pattern of H3K4me3 in normal cells?
A common feature of tumor suppressors is that they are usually highly expressed in normal cells to prevent tumor formation. This is likely why they show a similar pattern of H3K4me3: broad H3K4me3 is associated with increased transcription elongation and enhancer activity, which together lead to exceptionally high gene expression in normal cells.
Not all tumor suppressors show the broad H3K4me3 mark. Why do you think this is?
Cancer is always heterogeneous. To my knowledge, there is no single mechanism in the literature that can specifically explain all tumor suppressors. Broad H3K4me3 is not an exception.