A piggyBac ride to pancreatic cancer genes

A cluster of pancreatic cancer cells. Scanning electron micrograph. {credit}Anne Weston, LRI, CRUK. https://wellcomeimages.org{/credit}

Pancreatic cancer is a highly heterogeneous disease that often has a poor prognosis. Development of drugs or treatment strategies to target cancers, including pancreatic cancer, depends on identifying the drivers of disease. These are the genes that promote carcinogenesis and coordinate development of the cancer. But by the time a patient is diagnosed, it can often be very difficult to tell which of the many mutations present in the tumor are actually disease drivers, and which are just along for the ride.

A new paper published in Nature Genetics describes a strategy for finding the genetic drivers in pancreatic cancer. The authors used a forward genetic screen in mice that targets a particular transposable element, the piggyBac transposon, to the pancreas. When the transposon inserts itself into the genome, it disrupts genes, causing mutations that may then lead to cancer. By using the screen in “sensitized” mice (i.e., mice with particular mutations that will accelerate disease progression), the authors were able to cause pancreatic tumors to form in the mice. The genetic changes in these tumors were then examined to identify which genes are most often targeted by the transposon.
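The final step described above, finding the genes most often hit by the transposon, amounts to counting in how many independent tumors each gene carries an insertion. The Python sketch below illustrates only that idea. The tumor IDs, gene names and the `min_tumors` threshold are hypothetical placeholders, and the published screen relied on its own statistical analysis rather than this simple tally.

```python
from collections import Counter


def recurrent_genes(insertions, min_tumors=2):
    """Return (gene, tumor_count) pairs for genes hit in at least min_tumors tumors.

    insertions: iterable of (tumor_id, gene) pairs, one per mapped insertion.
    """
    # De-duplicate (tumor, gene) pairs so that several insertions in the
    # same gene within one tumor count only once.
    hits_per_gene = Counter(gene for _, gene in set(insertions))
    # Sort by descending tumor count, then alphabetically for stable output.
    ranked = sorted(hits_per_gene.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(gene, n) for gene, n in ranked if n >= min_tumors]


# Hypothetical toy data, not results from the study.
example = [
    ("tumor1", "GeneA"), ("tumor1", "GeneA"), ("tumor1", "GeneB"),
    ("tumor2", "GeneA"), ("tumor2", "GeneC"),
    ("tumor3", "GeneA"), ("tumor3", "GeneC"),
]
print(recurrent_genes(example))
```

Counting tumors rather than raw insertions keeps a single heavily mutagenized tumor from dominating the ranking.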

Other studies have been published recently that use a similar approach to find drivers of other types of cancer. Neal Copeland, Nancy Jenkins and colleagues pioneered the use of Sleeping Beauty transposon mutagenesis to screen for genes important in cancer, including a recent study in liver cancer associated with hepatitis B. Rama Khokha and colleagues recently used the Sleeping Beauty mutagenesis method to identify driver genes responsible for the formation of sarcomas.

These screens have been very successful; there have even been Sleeping Beauty screens for pancreatic cancer driver genes (here and here). However, Roland Rad and colleagues found that a Sleeping Beauty transposon screen was not ideal for studying certain types of pancreatic cancer. In addition, Sleeping Beauty and piggyBac have different insertion preferences, so the tools complement one another. This means that, while some sets of genes identified with the two methods do overlap, there are other genes that can only be found by using one or the other method. Importantly, Dr. Rad and colleagues observed different histological subtypes of pancreatic cancer in mice when using piggyBac, which were not observed using Sleeping Beauty.

We asked Dr. Rad, one of the lead authors of the study, to tell us a little more about the paper.

For readers unfamiliar with insertional mutagenesis screens, could you tell us what a piggyBac transposon is and how it was discovered?

Transposons are mobile DNA segments that can move around the genome. They were first discovered by Barbara McClintock more than 50 years ago. The DNA transposon piggyBac encodes a transposase, which moves the transposon from one genomic locus to another by a cut-and-paste mechanism. Transposable elements, which have been widely used for genetic screening in bacteria, yeast, arthropods and nematodes, had been inactivated during vertebrate evolution and were hence not available as genetic tools in higher organisms until recently. Successful efforts over the past ten years to make piggyBac work in mammalian cells motivated us to target it to the mouse genome and test its applicability for somatic mutagenesis in mice.

The PB transposase recognizes the specific inverted terminal repeats (ITRs) at each end of the transposon. PB then “cuts” the transposon out of its original location and moves it to a new, random location in the genome with a TTAA sequence. {credit}Transposagenbio via Wikimedia Commons{/credit}
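Because piggyBac integrates at TTAA sequences, the candidate insertion sites in any stretch of DNA can be listed with a simple scan. The sketch below is only an illustration: the function name and the example sequence are made up for this purpose. TTAA reads the same on both strands (it is its own reverse complement), so scanning one strand is enough.

```python
def ttaa_sites(sequence):
    """Return the 0-based start positions of every TTAA in a DNA sequence."""
    seq = sequence.upper()
    sites = []
    pos = seq.find("TTAA")
    while pos != -1:
        sites.append(pos)
        # Resume one base further along so that no occurrence is skipped.
        pos = seq.find("TTAA", pos + 1)
    return sites


# Made-up example sequence with three TTAA sites.
print(ttaa_sites("GGTTAACCTTAATTAAG"))  # [2, 8, 12]
```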

How do screens like this (performed in mice) inform us about human cancer? What is the advantage of this approach over direct sequencing of patient tumors?

Genetic screening and cancer genome sequencing are highly complementary approaches. Sequencing and array-based analyses of patient tumors can very accurately identify all classes of somatic alterations in cancer. However, many of these changes are difficult to interpret. For example, hundreds or even thousands of genes are found to be transcriptionally or epigenetically dysregulated within a single patient's tumor, meaning that they are not mutated but just being turned on or off. Pinpointing the few cancer-causing events among these large gene sets is extremely difficult. Likewise, copy number variation in cancer often affects large chromosomal segments, and for 75% of commonly amplified or deleted regions in human cancer, the cancer-causing genes have not yet been identified.

PiggyBac screening can tremendously facilitate this “search for the needle in the haystack” because transposons jump directly into the relevant genes. Even if a cancer gene is unequivocally identified through sequencing (for example based on its mutation), understanding downstream complexity can be difficult. Many cancer genes (e.g. methyltransferases, histone modifying enzymes, DNA repair genes) have large numbers of targets. Others (e.g. Ras) have many effector pathways that are used differently in various cancer types or have numerous interaction partners. Here again, unbiased genetic screening can identify ‘players’ at all levels of these cascades and can directly pinpoint important downstream effectors. Moreover, genetic screening provides a first level of biological validation of cancer genes and functional insights at an organismal level. These are some examples, which show that transposon-based screening can answer biological questions that cannot be systematically addressed by other approaches to cancer genome analysis.

What was the most surprising aspect of this study?

The screen produced numerous unexpected results. This is the beauty of a hypothesis-free forward genetic approach. We have discovered a large set of novel transcription factors involved in pancreatic cancer and shown that transposons can be used to identify cancer-relevant non-coding regulatory regions in the genome. The study also showed that insertional mutagenesis can induce different subtypes of pancreatic cancer and can dissect underlying genetic causes.

What was the biggest challenge your group faced during the course of the study?

The biggest challenge was to make the system work in mice. PiggyBac originates from Trichoplusia ni, the cabbage looper moth. We modified PiggyBac and introduced it into the mouse genome. Naturally, we did not have a priori knowledge as to how the system would behave in the mouse. Will it be efficient enough to achieve transposition? How many transposons per cell will we need to achieve tumor induction in individual tissues? Do high transposon copies induce toxicity? How will the genetic elements (enhancers, gene trapping elements etc.) affect the phenotype? We addressed these questions by developing many different transposon mouse lines and systematically exploring PiggyBac’s characteristics in vivo.

How do you see your results being used in the future by other researchers or clinicians working with pancreatic cancer?

The study has produced rich biological insights and large sets of putative novel “players” in pancreatic cancer. Researchers will use this knowledge and take individual aspects further, e.g. perform in-depth analysis of individual genes discovered in our screen or test whether they are targetable. Our genome-wide screen adds further pieces to pancreatic cancer's “puzzle” in order to better understand the complexity of the biological processes driving tumorigenesis. We hope that this will ultimately help guide the development of novel therapeutic strategies.

You can find the paper describing this study here. More information about Dr. Rad and the piggyBac transposon system can be found here.

Focus on TCGA Pan-Cancer Analysis

Nature Genetics is pleased to present today the first installment of our Focus on TCGA Pan-Cancer Analysis.

The Cancer Genome Atlas (TCGA) has analyzed over 8,000 cancer cases across 27 tumor types to date, and aims to have over 100,000 specimens analyzed by the end of 2015. TCGA has commendably made both data and exploration tools publicly available at https://www.cancergenome.nih.gov. The project has previously published 8 papers reporting in-depth genomic characterization of individual tumor types.

The TCGA Pan-Cancer initiative, launched in October 2012 at a meeting in Santa Cruz, California, seeks to combine analysis across tumor types in order to identify both similarities and differences in genomic alterations. The work presented in this collection of Pan-Cancer publications includes analysis of the first 12 TCGA tumor types. This includes over 3,000 cancer patients profiled with 6 different platforms to assess genomic, transcriptional, epigenetic and proteomic alterations, combined with clinical data. The authors demonstrate that, while a majority of the tumor samples show unique genomic alterations, combining analyses across tumor types allows them both to increase statistical power for the detection of molecular drivers and to identify common pathways that are altered across tumor types.

The Pan-Cancer initiative provides a model for large-scale collaborative analysis as well as data sharing, bringing together over 250 collaborators from ~30 institutions on over 60 projects analyzing the same dataset. These efforts required a strong collaborative framework, a commitment to rapid distribution of data, and means to facilitate shared analysis. Josh Stuart and colleagues provide an overview of this project in an accompanying Commentary.

This work also relied on the development of new bioinformatics tools and platforms, providing a foundation that should prove useful in future large-scale analysis projects. A Commentary by Larsson Omberg and colleagues highlights these approaches and the use of the Synapse software platform to share and evolve data, analysis and results among the Pan-Cancer Working Group. The Synapse platform was developed by Sage Bionetworks to facilitate open and data-driven collaborative research efforts, and is also widely used in DREAM challenges. The use of this platform supported the discovery efforts reported in this collection of Pan-Cancer papers, which also provide a public resource of highly curated and standardized data sets across a series of data freezes along with automated analysis systems.

In the first of two Analysis papers published today in Nature Genetics, Chris Sander and colleagues provide a hierarchical classification of 3,299 tumors from 12 cancer types from the Pan-Cancer dataset, using a newly developed algorithmic approach. Their analysis separates tumors into those with primarily somatic mutations and those with primarily copy number alterations. They also identify oncogenic signatures that characterize ~30 tumor subclasses, which may suggest therapeutic targets of relevance across tumor types.

In a second Analysis published in Nature Genetics, Rameen Beroukhim and colleagues characterized somatic copy number alterations (SCNAs) in 4,934 primary cancer specimens across 11 cancer types from the Pan-Cancer dataset. They observed whole-genome doubling in 37% of cancers, associated with higher rates of all SCNAs.

We are pleased to support the TCGA Pan-Cancer efforts as a model for large-scale collaborative genomics projects combined with open data sharing, and as a demonstration of the ready benefits this can bring to our understanding of the molecular drivers of cancer. The TCGA Pan-Cancer project continues to develop, and so will this Focus, so please get primed with this selection of publications and stay tuned. In the meantime, here is a selection of social media and press stories: https://storify.com/obahcall/nature-genetics-pan-cancer-focus.

Epigenetic convergence in intellectual disability and cancer?

One of the many remarkable findings of the cancer genome sequencing projects that have been published in this and other journals is the repeated discovery of somatic driver mutations in genes that encode chromatin remodeling factors, which regulate the epigenome. De novo mutations in this same family of genes also cause several developmental syndromes, whose various features all include intellectual disability. Surprisingly, a few de novo mutations in these genes have recently been reported in autism. How do mutations in the same class of genes, and in some cases in the very same genes, cause such different diseases, given that at least some of them appear to be loss-of-function in both cancer and the developmental syndromes?

Much effort and attention are being devoted to developing drugs that target the epigenome and the proteins that regulate it. The $95M partnership between Constellation Pharmaceuticals and Genentech and the recent symposium at MIT’s Koch Institute, “Epigenetics, plasticity and cancer,” are just two of many examples. But most intriguingly, can understanding the molecular mechanisms and risk factors of one of these diseases inform the biology and treatment of the others?


The epigenetic state of a genome has a vast influence over gene regulation, cellular activity and cell fate. Protein families that “write,” “erase,” and “read” the major histone marks (i.e. acetyl and methyl groups) ultimately regulate accessibility of the chromatin to transcriptional machinery. (A useful review on epigenetic protein families and recent progress to pharmacologically modulate these proteins was published recently in Nature Reviews Drug Discovery here.)


New driver mutations in cancer

In the past few years, cancer sequencing efforts have identified driver mutations in genes that regulate histone and DNA modification in various types of cancers. In 2012, this journal published many cancer sequencing papers, four of which reported driver mutations in chromatin remodelers. While ARID1A mutations have been found in other cancers (e.g. ovarian carcinoma, in Science and the New England Journal of Medicine), Patrick Tan and colleagues reported here in April that ARID1A is mutated in 8% of gastric adenocarcinomas, and that mutations in chromatin remodeling genes (ARID1A, MLL3 and MLL) were found in 47% of the gastric cancers they screened.


In May, Bin Tean Teh and colleagues reported inactivating mutations in MLL3 in 14.8% of cases of liver fluke-associated cholangiocarcinoma, a fatal cancer of the bile ducts in the liver that is common in parts of Southeast Asia where the liver fluke Opisthorchis viverrini is endemic. MLL3 encodes a histone-lysine N-methyltransferase and was already known to be mutated in several other cancer types. Teh and colleagues found that 75% of MLL3 mutations were likely loss-of-function, with mutation patterns reminiscent of tumor suppressor genes.


In June, Jessica Zucman-Rossi and colleagues reported inactivating mutations in ARID1A and ARID2 in 16.8% and 5.6%, respectively, of 125 cases of hepatocellular carcinoma (HCC). ARID1A and ARID2 are SWI/SNF-related chromatin remodelers that control the accessibility of transcriptional machinery to promoter regions of DNA. Overall, 24% of the HCCs that they screened had at least one mutation in a chromatin remodeling gene (2 cases with SMARCA4 mutations, and single cases of many genes that encode chromatin remodelers, including SMARCA2, SMARCB1, SMARCA1, ARID4A, PBRM1, CHD3 and CHD4). Interestingly, ARID1A mutations were more often found in tumors related to alcohol intake than in tumors related to other etiological sources (e.g. hepatitis B or C virus).


In July, Hidewaki Nakagawa and colleagues published another hepatocellular carcinoma sequencing study, which analyzed HCC tumors associated with hepatitis B or C virus infections. They initially sequenced 27 tumors and identified 2 frameshift mutations and one missense mutation in ARID1A. Analysis of 120 more tumors identified 12 more mutations in ARID1A. Altogether, the authors found that about half (14/27) of the tumors had recurrent mutations in genes that encode chromatin regulators. They also knocked down these chromatin regulators in a panel of 5 HCC cell lines, and found that knockdown of MLL3 led to increased cell proliferation. Knockdown of 11 genes, many of which are chromatin regulators, increased cell proliferation in at least one cell line. These results support the hypothesis that loss-of-function mutations in chromatin regulators promote cell growth in hepatocellular carcinoma.


De novo mutations in intellectual disability and autism


De novo mutations in this family of genes were recently reported in Nature in autism, a pervasive developmental disorder that sometimes includes intellectual disability and typically presents with cognitive and social dysfunction [see the last paragraph of the previous post on this blog, ‘Autism exomes arrive’]. In addition to the two cases of CHD8 mutations, single cases of de novo nonsense, missense and frameshift mutations in ARID1B, CHD3, CHD7, MLL3, SETBP1 and SETD2 were reported in those autism sequencing papers.


As noted in that post, de novo mutations in this class of genes have also been found in recent years to cause various developmental syndromes, including mutations in MLL2 in Kabuki syndrome. This discovery was published in the landmark paper that was the first application of exome sequencing to define the cause of an autosomal dominant disorder. This year, this journal published 5 other papers that report de novo mutations in genes that encode chromatin remodelers in developmental syndromes that vary in presentation but all include intellectual disability.


In April, Naomichi Matsumoto and colleagues reported that 20 of 23 individuals with Coffin-Siris syndrome (CSS) carried missense and truncating mutations in one of the six genes that encode SWI/SNF subunits, including SMARCB1, SMARCA4, SMARCA2, SMARCE1, ARID1A and ARID1B. CSS is a rare congenital syndrome (MIM 135900) that includes growth deficiency, intellectual disability, severe speech impairment, microcephaly, and coarse facial features. In all cases where parental samples were available, mutations occurred de novo. Notably, only one of the 23 cases presented with a hepatoblastoma. All of the mutations in ARID1A and ARID1B were truncating, and the authors suggest that haploinsufficiency causes CSS. At the same time, Gijs Santen and colleagues also reported truncating de novo mutations in ARID1B in CSS.


In the same issue, Joris Vermeesch and colleagues reported heterozygous mutations in SMARCA2 in Nicolaides-Baraitser syndrome (NBS, MIM 601358). The features of NBS include sparse hair, distinctive facial features, microcephaly, epilepsy and intellectual disability with marked language impairment. Altogether, the authors identified missense mutations in SMARCA2 in 36 of 44 individuals analyzed. In 15 of these patients, parental DNA was available and the mutations in each case were verified to occur de novo. None of the mutations were truncating, and the authors suggest that these mutations act in a dominant-negative or gain-of-function manner.

In June, Marcella Zollino and Bert de Vries and their respective colleagues independently reported mutations in the chromatin regulator KANSL1 in 17q21.31 microdeletion syndrome. KANSL1 is a subunit of a histone acetyltransferase (HAT) protein complex, and is required for its HAT activity. The mutations identified by both papers occur de novo and include nonsense and frameshift mutations. Interestingly, a common feature of this microdeletion syndrome is a “happy, friendly disposition.”


Biological meaning?


What is the biological meaning of these findings? Many of the somatic mutations in these chromatin remodeler proteins in cancer appear to be loss-of-function, but further in vivo and in vitro experiments will be needed to determine whether these proteins act as tumor suppressors or oncogenes. It is also possible that different proteins can have either effect, depending on the type of cancer. Mouse models of the developmental disorders noted here should bring further insights into how this class of proteins leads to these particular diseases.


Why does a loss-of-function somatic mutation in ARID1A cause so many types of cancer, and how can loss-of-function in this same gene also cause Coffin-Siris syndrome? How does loss of this class of proteins cause cancer in one context and intellectual disability, language impairment or autism in another? Is the sole case of a de novo frameshift indel mutation in ARID1B in autism relevant? If so, how does loss-of-function of ARID1B cause autism? It is clear that much is still to be learned about how the epigenome regulates gene activity and how its misregulation at particular times and in particular places can cause radically different severe diseases. Nevertheless, the most intriguing questions are these: 1) Where, and how, do these chromatin remodelers functionally overlap in these disparate diseases? And 2) Would a treatment for one disease be effective for another?

Somatic mutations in histone H3 in pediatric brain tumors

Glioblastoma multiforme (GBM) and diffuse intrinsic pontine glioma (DIPG) are aggressive subtypes of brain tumors that both have a very poor prognosis and are almost always lethal. Two new studies in Nature and this journal today identify the same recurrent mutations in H3F3A in pediatric cases of GBM and DIPG. These are the first reports of human disease associated with mutations in histones, which play an extraordinarily important and conserved role in chromatin structure and gene regulation. With the recent spate of papers reporting somatic mutations in chromatin remodelers in various types of cancer (examples from this journal alone include the histone H3K27 demethylase UTX, transitional cell carcinoma of the bladder, histone methyltransferase EZH2 in follicular and diffuse large B-cell lymphoma and myeloid disorders, DNMT3A in AML, ARID2 in hepatocellular carcinoma, MLL2 in DLBCL, and ARID1A in gastric cancer), it is clear that targeting the chromatin remodeling machinery will be an important area in the development of new cancer drugs.


Around 3,000 children are diagnosed with brain tumors each year in the United States. DIPG is a type of aggressive brainstem tumor that occurs almost only in children. The Nature study reports somatic mutations in the H3.3-ATRX-DAXX chromatin remodeling pathway in 44% (21/48) of tumors, while the Nature Genetics study reports mutations in either H3.1 or H3.3 in 60% (52/86) of tumors. Remarkably, the Nature Genetics study finds that 78% (39/50) of DIPG tumors display a p.Lys27Met change in either histone H3.1 or H3.3.


Suzanne Baker and colleagues report in this journal today whole-genome sequences of 7 DIPGs and matched normal tissue. Four of the tumors harbored a p.Lys27Met change in H3.3 and one of the tumors showed a p.Lys27Met change in the related histone variant H3.1. The authors subsequently sequenced the genes encoding H3.3 (H3F3A) and H3.1 (HIST1H3B) in 43 more DIPGs and 36 non-brainstem pediatric glioblastomas. In total, 39/50 DIPGs carried a p.Lys27Met change in H3F3A or HIST1H3B, and 13/36 non-brainstem pediatric glioblastomas harbored a p.Lys27Met change in H3F3A or HIST1H3B or a p.Gly34Arg change in H3F3A. The authors also sequenced the 16 histone H3 genes in other types of pediatric brain tumors but found no other histone H3 mutations in these lower-grade tumor subtypes.
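The tallies quoted for the Nature Genetics study can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text (39/50 DIPGs and 13/36 non-brainstem pediatric glioblastomas). The variable names and the rounding to whole percentages are choices made for this illustration.

```python
# Counts as quoted in the text for the Nature Genetics study:
# (tumors with a histone H3 mutation, tumors sequenced).
counts = {
    "DIPG": (39, 50),
    "non-brainstem pediatric glioblastoma": (13, 36),
}


def percent(mutated, total):
    """Percentage of mutated tumors, rounded to the nearest whole number."""
    return round(100 * mutated / total)


combined_mutated = sum(m for m, _ in counts.values())
combined_total = sum(t for _, t in counts.values())

for name, (m, t) in counts.items():
    print(f"{name}: {m}/{t} = {percent(m, t)}%")
print(f"combined: {combined_mutated}/{combined_total} = "
      f"{percent(combined_mutated, combined_total)}%")
```

The combined figure reproduces the 60% (52/86) reported for the study as a whole, and the DIPG figure reproduces the 78% (39/50).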

Nada Jabado and colleagues report in Nature today 48 whole exomes of pediatric GBMs, as well as matched normal tissue for 6 of those samples. Two of these samples harbored the p.Lys27Met change in H3.3 and 2 samples harbored a p.Gly34Arg change in H3.3. After extending the analysis to 48 whole exomes, the authors found that 44% (21/48) of samples harbored mutations in H3F3A, ATRX or DAXX. It is particularly notable that the two amino acids affected in H3.3 (K27 and G34) are at or in close proximity to sites of important post-translational modifications. Trimethylation of K27 (H3K27me3) is associated with silencing of genes, whereas methylation of K36 is associated with transcriptional activation.


Mutations in histones had not previously been reported in cancer (or in any other human disease), although somatic mutations in genes regulating histone modifications have been reported in cancer. It is clear that different histone variants are associated with different chromatin and transcriptional states. In particular, H3.3 is enriched at sites of active gene transcription and regulatory elements. Jabado and colleagues speculate that the finding of the same mutations in different tumors, plus the lack of truncating mutations, suggests that these mutations are gain-of-function. However, the precise mechanism of action is hard to predict. Analysis of gene expression in 27 of the whole-exome samples shows that tumors with the K27 or G34 mutations have distinct profiles, suggesting that each mutation leads to tumors in different ways. Regardless, both papers show a central role for the chromatin remodeling machinery in pediatric gliomagenesis and point to mutations in histones as another way that epigenetic events drive cancer.