Nature Genetics is pleased to present today the first installment of our Focus on TCGA Pan-Cancer Analysis.
The Cancer Genome Atlas (TCGA) has analyzed over 8,000 cancer cases across 27 tumor types to date, and aims to have over 100,000 specimens analyzed by the end of 2015. TCGA has commendably made both data and exploration tools publicly available at http://www.cancergenome.nih.gov, and has previously published 8 papers reporting in-depth genomic characterization of individual tumor types.
The TCGA Pan-Cancer initiative, launched in October 2012 at a meeting in Santa Cruz, California, seeks to combine analysis across tumor types in order to identify both similarities and differences in genomic alterations. The work presented in this collection of Pan-Cancer publications covers the first 12 TCGA tumor types. It includes over 3,000 cancer patients profiled with 6 different platforms to assess genomic, transcriptional, epigenetic and proteomic alterations, combined with clinical data. The authors demonstrate that, while a majority of the tumor samples show unique genomic alterations, combining analysis across tumor types allows them both to increase statistical power for the detection of molecular drivers and to identify common pathways that are altered across tumor types.
The Pan-Cancer initiative provides a model for large-scale collaborative analysis as well as data sharing, bringing together over 250 collaborators from ~30 institutions on over 60 projects analyzing the same dataset. These efforts required a strong collaborative framework, a commitment to rapid distribution of data, and a means to facilitate shared analysis. Josh Stuart and colleagues provide an overview of this project in an accompanying Commentary.
This work also relied on the development of new bioinformatics tools and platforms, providing a foundation that should prove useful in future large-scale analysis projects. A Commentary by Larsson Omberg and colleagues highlights these approaches and the use of the Synapse software platform to share and evolve data, analyses and results among the Pan-Cancer Working Group. The Synapse platform was developed by Sage Bionetworks to facilitate open, data-driven collaborative research, and is also widely used in DREAM challenges. The platform supported the discovery efforts reported in this collection of Pan-Cancer papers, which also provide a public resource of highly curated and standardized data sets across a series of data freezes, along with automated analysis systems.
In the first of two Analysis papers published today in Nature Genetics, Chris Sander and colleagues provide a hierarchical classification of 3,299 tumors from 12 cancer types from the Pan-Cancer dataset, using a newly developed algorithmic approach. Their analysis separates tumors into those with primarily somatic mutations and those with primarily copy number alterations. They also identify oncogenic signatures that characterize ~30 tumor subclasses, which may suggest therapeutic targets of relevance across tumor types.
In a second Analysis published in Nature Genetics, Rameen Beroukhim and colleagues characterize somatic copy number alterations (SCNAs) in 4,934 primary cancer specimens from 11 cancer types in the Pan-Cancer dataset. They observe whole-genome doubling in 37% of cancers, which is associated with higher rates of all SCNAs.
We are pleased to support the TCGA Pan-Cancer efforts as a model for large-scale collaborative genomics projects combined with open data sharing, and as a demonstration of the ready benefits this approach can bring to our understanding of the molecular drivers of cancer. The TCGA Pan-Cancer project continues to develop, and so will this Focus, so please get primed with this selection of publications and stay tuned. In the meantime, here is a selection of social media and press stories: http://storify.com/obahcall/nature-genetics-pan-cancer-focus.