Cancer clones: mixing and spreading

McPherson et al., Nature Genetics 2016

The trajectory of tumor cells during metastasis can be influenced by many factors, including the physical environment and the genetic makeup of metastatic clones. In high-grade serous ovarian cancer, there are limited barriers in the intraperitoneal space, allowing for extensive spreading and mixing of tumor cells. A recent article published in Nature Genetics explores these different patterns of clonal evolution in metastatic ovarian cancer using a combination of bulk and single cell sequencing.

The authors characterized the mutation landscapes of different metastatic tumors and found both monophyletic and polyphyletic clones. While in most patients there was unidirectional seeding from the original ovarian tumor, two patients exhibited polyclonal spread and reseeding. Therefore, high-grade serous ovarian cancer cells can migrate through and establish metastasis within the intraperitoneal space via different evolutionary routes.

We spoke to lead author, Sohrab Shah, to get some background on this research.

What features of this particular cancer made you want to study its metastasis? Were you surprised by your findings?

High-grade serous ovarian cancers are often widespread through the peritoneal cavity at diagnosis. We wanted to ask what are the characteristics of cells that spread and what is the distribution of these cells throughout the abdominal lesions. The focus was to study the disease state prior to any treatment to characterize the diversity and take an inventory of the ‘substrate’ of clones upon which treatment selective pressures may be acting. Many patients experience relapse after initial response to treatment. Mapping which clones lead to relapse remains a key question in the field. This was borne out in one patient in our study where specific clones that led to relapses were already present at diagnosis but only represented a minority of branches in the clonal phylogeny.

It is important to note that the mode of spread in this disease differs from most solid cancers, where spread is achieved through the bloodstream or lymphatics.  Ovarian cancer represents a unique opportunity to study disease spread through a relatively physically unencumbered anatomic space.  One might expect that in such an environment the potential for clonal intermixing is high.   This might lead to many clones co-existing at many sites.  But the majority of intraperitoneal samples were clonally pure, suggesting unidirectional spreading from ovary sites with diverse clonal repertoires, and a lack of clonal intermixing.

You provide evidence that the microenvironment influences the metastatic success of tumors. What does this say about in vitro cancer models that don’t account for tissue context?

One of the intriguing findings suggested that specific clones were present in specific sites.  This may indicate that particular microenvironments are differently suited to particular clones. Another surprising finding was that every patient harbored at least one lesion that was very diverse in its clonal make-up (typically within primary ovary sites).  This leads to the natural question of whether properties of specific microenvironments in some way promote or ‘tolerate’ clonal diversity.  If this were the case, then both in vitro and in vivo model systems such as cell lines, organoids and mouse xenografts may not adequately represent the natural disease state we find in patients prior to treatment.

How did you choose your sampling strategy?

The study results are naturally biased by the sampling strategy.  The study design was subject to what material could be obtained during the provision of care.  In our setup, we consented patients for collection and study of all material removed at primary debulking surgery.  Wherever possible tissue was cryopreserved, but inevitably many deposits were preserved in formalin.   Our strategy led to acquisition of a median n=10 samples per patient.  The nature of the samples and their locations are presented in Figure 4 and are also available in interactive web-form at:

https://compbio.bccrc.ca/research/tumour-evolution/

Users can click on the links for each patient and explore the clonal maps.

You utilize both bulk and single cell sequencing as complementary approaches to elucidating tumor evolution. Can you comment on the trade-offs between cost and throughput and how you chose your sample sizes?

The field is entering an interesting time. There are several limitations to both bulk and single cell sequencing strategies to define the clonal constituents of a tumor sample. Most single cell techniques suffer from vast under-sampling of the clonal repertoire since they are limited in throughput and may only practically yield data from 100s of cells. Furthermore, single cell techniques are prone to two key experimental sources of noise: missing data and allele-dropout. We used targeted, multiplexed single cell sequencing as a form of validation of inferences made from the bulk sampling, including validating co-occurrence of point mutations and structural variations in the same cells. Hypotheses were generated from multi-site bulk analysis and were then tested using orthogonal single cell approaches. Accordingly, the sample sizes in single cell were chosen to identify clones that were detected in bulk samples – in the range of 5% prevalence. Notably, the noise properties of targeted multiplexed single cell data required some careful statistical treatment, the results of which were published as a standalone contribution in Nature Methods simultaneously with this publication. As the field moves forward, it may become practical to sequence the whole genomes of 1000s of cells per sample. I look forward to the day when a single experimental design would be sufficient to dissect the important clones present in a cancer. This would enable studying evolutionary properties at scale, leveraging richly defined principles and statistical models from the field of population genetics.
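
Editor's illustration: to give a feel for how a prevalence threshold such as 5% relates to the number of cells sequenced, the sketch below computes the chance of sampling at least one cell from a clone. It assumes cells are drawn independently and ignores allele dropout and missing data, so it is a lower bound on what a real experiment would need. The function names are hypothetical, and this is not the calculation used in the study.

```python
import math


def prob_clone_sampled(n_cells: int, prevalence: float) -> float:
    """Probability that at least one of n_cells comes from a clone at the
    given prevalence, assuming independent sampling (an idealization)."""
    return 1.0 - (1.0 - prevalence) ** n_cells


def cells_needed(prevalence: float, confidence: float) -> int:
    """Smallest number of cells for which the clone is sampled at least
    once with probability >= confidence."""
    if not (0.0 < prevalence < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("prevalence and confidence must lie strictly between 0 and 1")
    # Solve 1 - (1 - f)^n >= c for n.
    n = math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))
    # Guard against floating-point rounding at the boundary.
    while prob_clone_sampled(n, prevalence) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    print(cells_needed(0.05, 0.95))
```

Under these idealized assumptions, `cells_needed(0.05, 0.95)` returns 59. Seeing a clone in several cells, or allowing for dropout, would require more.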

You find that there are differences in the potential for migration and metastasis across the tumors from your patients. What clinical implications might this have?

Our study is underpowered to provide a clear answer on this.  Our results hint anecdotally that cases with strong patterns of unidirectional spread fared poorly in their treatment trajectories.  Whether cancers harboring clones with strong potential to invade new micro-environments and dominate their local landscapes indicates potential to evade chemotherapy remains an important question to consider.  As we take this study forward in model systems derived from spatially distinct sites, reproducible treatment selection experiments can be carried out to robustly address this question.

 

APOBEC3A takes the lead

A3A and A3B mutagenesis signatures (credit: Dmitry Gordenin)

A paper published online today in Nature Genetics reports that the DNA-specific cytidine deaminase APOBEC3A (or A3A) is likely to be the major driver of APOBEC-mediated mutagenesis in human cancer. This finding is somewhat surprising because another deaminase, APOBEC3B (or A3B), has been considered the more likely mutator based on previous studies. Gene expression levels of APOBEC3B as well as mutagenic signatures in certain cancer types, such as breast cancer, have been consistent with a primary role for A3B in cancer-related mutagenesis. However, results of a recent paper by Serena Nik-Zainal et al. called this into question by showing that breast cancer samples from individuals with germline APOBEC3B deletions showed high levels of mutations consistent with APOBEC-dependent mutagenesis.

Now, Dmitry Gordenin and colleagues expressed either A3A or A3B in a yeast reporter strain that allowed them to collect large numbers of mutations induced by these enzymes. Mutations were identified using whole genome sequencing and compared between the two enzymes. They were able to demonstrate that A3A and A3B induce mutations at specific genomic sequence motifs that could be reliably differentiated. Surprisingly, A3A tended to induce many more mutations than A3B, approximately 10-fold more. With the mutagenic signatures of the two enzymes at hand, they were able to show that A3A contributes to APOBEC-dependent mutagenesis in human cancers and may in fact be the primary driver of these mutations.

Click the link below for a video summary of the paper (created in collaboration with the authors):

An APOBEC3A hypermutation signature is distinguishable from the signature of background mutagenesis by APOBEC3B in human cancers from Research Square on Vimeo.

We asked two authors of the paper, Kin Chan and Dmitry Gordenin, to give us a little more background about this exciting new research:

Given that APOBEC3A is expressed at relatively low levels in cancer samples (compared to APOBEC3B), what motivated you to study the potential role of APOBEC3A in cancer rather than any of the other APOBECs?

From the very beginning, we did not have very much hope that the level of mRNA in tumors at the time of surgical excision would correlate strongly with the detected number of mutations induced by APOBECs in these tumors, because mutations detectable by sequencing would have formed much earlier. We showed that mutation load was only weakly correlated with transcript abundances of both APOBEC3A and APOBEC3B. In fact, we did not particularly favor the APOBEC3A versus APOBEC3B dichotomy model with respect to the identity of the major mutator in cancers when we started our yeast experiments. We just wanted to get more precise estimates of their signatures in our yeast system, which was designed to enrich for accumulation of multiple APOBEC-induced mutation clusters as well as detecting scattered mutations.

Why do you think the distinct signature of APOBEC3A was not identified in previous studies, for example the study by Taylor et al.?

Yeast system reporting mutagenesis in ssDNA identified common and specific components of A3A and A3B mutation signatures (credit: Dmitry Gordenin)

In fact, Taylor et al. did notice differences between mutation signatures of single-strand (ss) DNA-specific APOBEC3A and APOBEC3B cytidine deaminases separately expressed in yeast.  However, they had significantly fewer mutations caused by APOBEC3A, which is less of a mutator as compared to APOBEC3B in the proliferating yeast used in that study. Our yeast system was devised to enable the facile study of mutations induced by APOBECs in stretches of ssDNA formed during growth of yeast cultures, along with mutations caused in long persistent stretches of subtelomeric ssDNA formed in response to regulated telomere uncapping.  The latter form of ssDNA is hypermutable by APOBECs, which results in formation of mutation clusters (also called kataegis by other groups) that are so characteristic of hypermutation caused by APOBECs in human cancers.  It is worth noting that Taylor et al. noticed that some samples of breast cancer had mutation spectra resembling that induced by APOBEC3A, while other spectra were more similar to APOBEC3B’s.  However, the statistical approach they used did not provide sufficient power to highlight individual samples with statistically significant enrichment for certain mutation signatures.

A significant factor in our success was the use of an analytical design described in our previous papers (Roberts et al. 2012 and Roberts et al. 2013). The essential idea of this design is that it uses all available mechanistic knowledge emerging from our yeast experiments and from studies of other labs to formulate a stringent statistical hypothesis, which is then used to interrogate cancer datasets. This approach allowed us to compute robust sample-specific p-values even for exome mutation catalogues, which contain around 1% of mutation numbers characteristic of the whole genome mutation load.
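
Editor's illustration: one common way to obtain a sample-specific p-value for motif enrichment is a one-sided Fisher's exact test on a 2x2 table of counts. The sketch below is an assumption about what such a test could look like, not the authors' published procedure; the table layout, the function name and the counts are all hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout for one tumor sample:
      a = mutated cytosines inside the motif of interest
      b = mutated cytosines outside the motif
      c = unmutated cytosines inside the motif (in the surveyed context)
      d = unmutated cytosines outside the motif
    Returns the probability of observing a or more in the top-left cell
    if mutations fell on cytosines irrespective of motif.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1 = a + b          # all mutated cytosines
    row2 = c + d          # all unmutated cytosines
    col1 = a + c          # all cytosines inside the motif
    total = row1 + row2
    # Sum the hypergeometric tail from the observed value upward.
    tail = sum(
        comb(row1, x) * comb(row2, col1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return tail / comb(total, col1)


if __name__ == "__main__":
    # Made-up counts purely for illustration.
    print(fisher_exact_greater(30, 20, 200, 750))
```

A small p-value from a test of this kind would suggest that mutations in that sample fall in the motif more often than expected by chance. Correcting for multiple samples is a separate step not shown here.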

Were you surprised by the result that APOBEC3A may be responsible for ten times more mutations in cancer than APOBEC3B?

We certainly were, because when we made this discovery we were thinking that APOBEC3B was more likely to be the major mutator in cancers. But upon re-reading the literature, the finding that APOBEC3A is actually the culprit makes sense: three groups had independently shown that ectopic overexpression of APOBEC3A causes many DNA breaks while similar overexpression of APOBEC3B caused far fewer breaks. We think that an important reason for APOBEC3A’s mutagenic prevalence in cancers is that some of these breaks are repaired by mechanisms generating long ssDNA intermediates, in other words, APOBEC3A substrates. This would also be consistent with previous observations that APOBEC-signature mutation clusters frequently co-localized with chromosomal rearrangement breakpoints in cancers.

What are your biggest unanswered questions related to this study?

It is clearly the question about what molecular mechanisms underlie this strong bias towards APOBEC3A in cancer hypermutation.  However, this may require years of studies by many excellent labs that have already developed and continue to productively explore this field. Our work not only highlighted the strong influence of APOBEC3A in cancer mutagenesis, but also confirmed that APOBEC3B makes its own contribution, perhaps in even more cancers than APOBEC3A.  We are interested to explore new larger data sets of cancer mutations becoming available through the recently announced Pan-cancer Analysis of Whole Genomes  project to elucidate the roles of these APOBECs in different cancer types, stages of cancer development and regions of cancer genomes.

How do you see others using these results, either in research or in the clinic?

We hope very much that our findings will stimulate development of new assays to measure protein levels of individual APOBECs in cancers, which may turn out to be a better predictor of hypermutation and of clinically important tumor features.  APOBEC3A- and APOBEC3B-specific antibodies required for such assays are still to be developed.  Another important area is biochemical studies of both enzymes, which may clarify why one of them can cause DNA breakage, while the other does so only inefficiently.  It will also be interesting to identify the interacting proteins that keep APOBEC3A in the cytosol of healthy cells, as this could lead directly to the reasons for APOBEC3A essentially going rogue and entering the nucleus to hypermutate genomic DNA in cancers.

As for clinical applications, determining the APOBEC mutagenesis signature of a tumor could inform decision making on personalized medicine:  a tumor where APOBEC3A is actively causing hypermutation might have to be treated very differently from a tumor where there is only APOBEC3B background levels of mutagenesis.  Screening for APOBEC signature mutagenesis in cell-free DNA for individuals at high risk (for example, patients with germline deletion of APOBEC3B) might be a useful early warning diagnostic in the near future.  Also, it’s straightforward to propose that a specific APOBEC3A inhibitor might be of value for personalized medicine, more so than a broad-spectrum APOBEC inhibitor, which would likely severely compromise innate immune function. In a more speculative sense, the idea of overexpressing an APOBEC in order to kill cancer by hypermutation catastrophe has been around for a while in the field.  The latest news in cancer research is that some hypermutated cancers are more susceptible to immune treatment than tumors with lower mutation loads.  The suggested explanation is in the creation of neo-antigens that trigger immune attack on the tumor.  Interestingly, therapeutic overexpression of APOBEC3A might combine this hypermutation effect with DNA breakage – a feature of several established cancer drugs.

Focus on TCGA Pan-Cancer Analysis

Nature Genetics is pleased to present today the first installment of our Focus on TCGA Pan-Cancer Analysis.

The Cancer Genome Atlas (TCGA) has analyzed over 8,000 cancer cases across 27 tumor types to date, and aims to have over 100,000 specimens analyzed by the end of 2015. It has commendably made both data and exploration tools publicly available at https://www.cancergenome.nih.gov. It has previously published 8 papers reporting in-depth genomic characterization of individual tumor types.

The TCGA Pan-Cancer initiative, launched in October 2012 at a meeting in Santa Cruz, California, seeks to combine analysis across tumor types in order to identify both similarities and differences in genomic alterations. The work presented in this collection of Pan-Cancer publications includes analysis of the first 12 TCGA tumor types. This includes over 3,000 cancer patients profiled with 6 different platforms to assess genomic, transcriptional, epigenetic and proteomic alterations, combined with clinical data. The authors demonstrate that, while a majority of the tumor samples show unique genomic alterations, combining analyses across tumor types both increases statistical power for the detection of molecular drivers and identifies common pathways that are altered across tumor types.

The Pan-Cancer initiative provides a model for large-scale collaborative analysis as well as data sharing, bringing together over 250 collaborators from ~30 institutions working together on over 60 projects analyzing the same dataset.  These efforts required a strong collaborative framework, a commitment to rapid distribution of data, and means to facilitate shared analysis. Josh Stuart and colleagues provide an overview of this project in an accompanying Commentary.

This work also relied on the development of new bioinformatics tools and platforms, providing a foundation that should prove useful in future large-scale analysis projects. A Commentary by Larsson Omberg and colleagues highlights these approaches and the use of the Synapse software platform to share and evolve data, analysis and results among the Pan-Cancer Working Group. The Synapse platform was developed by Sage Bionetworks to facilitate open and data-driven collaborative research efforts, and is also widely used in DREAM challenges. The use of this platform supported the discovery efforts reported in this collection of Pan-Cancer papers, which also provide a public resource of highly curated and standardized data sets across a series of data freezes along with automated analysis systems.

In the first of two Analysis papers published today in Nature Genetics, Chris Sander and colleagues provide a hierarchical classification of 3,299 tumors from 12 cancer types from the Pan-Cancer dataset, using a newly developed algorithmic approach. Their analysis separates tumors into those with primarily somatic mutations and those with primarily copy number alterations. They also identify oncogenic signatures that characterize ~30 tumor subclasses, which may suggest therapeutic targets of relevance across tumor types.

In a second Analysis published in Nature Genetics, Rameen Beroukhim and colleagues characterized somatic copy number alterations (SCNAs) in 11 cancer types and 4,934 primary cancer specimens from the Pan-Cancer dataset. They observed whole-genome doubling in 37% of cancers, associated with higher rates of all SCNAs.

We are pleased to support the TCGA Pan-Cancer efforts as a model for large-scale collaborative genomics projects combined with open data sharing, and as a demonstration of the ready benefits this can bring to our understanding of the molecular drivers of cancer. The TCGA Pan-Cancer project continues to develop, and so will this Focus, so please get primed with this selection of publications and stay tuned. In the meantime, here is a selection of social media and press stories: https://storify.com/obahcall/nature-genetics-pan-cancer-focus.

Preview to the 7th Genomics of Common Diseases

The 7th annual Genomics of Common Diseases conference is taking place this weekend, September 7-10, at Keble College, Oxford University. At this conference, we seek to represent a top selection of the latest research characterizing the genetic basis of a range of common diseases.

We held the first Genomics of Common Diseases conference in 2007, with a program that highlighted rapid advancements in identifying common variants associated with a range of common diseases, made possible by new methods enabling genome-wide association studies (GWAS). Over the past seven years, our understanding of the genetic architecture of disease has been progressively redefined by GWAS characterizing common variation, the fine mapping of associated regions, the emergence and growth of new sequencing technologies and the assessment of rare variant association. We have represented the progress in the field facilitated by rapid improvements in and reduced costs of genotyping and sequencing technologies. We have also seen rapid growth in the scale of genetic datasets, with the need to analyze progressively larger sample sizes. Our sixth annual conference focused both on presenting the latest applied technologies and on how to meet challenges posed by the analysis and interpretation of these large-scale genetic datasets.