The Method of the Year for 2013 is… single-cell sequencing

Single-cell sequencing edged out other contenders as our choice of Method of the Year in 2013. These techniques really came into their own in 2013 and are fast providing new insights into the workings of single cells that ensemble methods are incapable of.

Back in 2008 we chose next-generation sequencing as our Method of the Year not only because of how the new techniques would improve performance in conventional sequencing applications, but also because they opened up whole new applications, unthinkable with traditional Sanger sequencing. Our choice of Method of the Year in 2013 bears this out, as none of these single-cell sequencing applications would be possible without next-generation sequencing. And in some applications the sequencing is used almost exclusively for identifying and counting tagged molecules.

Our choice likely comes as a surprise to all those who were certain that we would pick CRISPR/Cas9 technology for targeted genome modification. This is certainly an exciting technology, and not only for genome engineering, but also for epigenome editing as described in a Method to Watch. But genome editing with engineered nucleases was our pick for the 2011 Method of the Year and although CRISPR/Cas9 provides a huge practical improvement by largely dispensing with the need to engineer the nuclease and relying instead on a programmable guide RNA, the advance over 2011 is mostly one of ease-of-use.

Methods to investigate biology at the level of single cells have been of keen interest to Nature Methods since the journal started. Our first research article from Robert Singer described a paraffin-embedded tissue FISH (peT-FISH) method to simultaneously detect expression of several genes in situ in single cells while maintaining tissue morphology (Capodieci, P. 2005). This was followed by many other imaging-based methods for such things as measuring cell growth (Groisman, A. 2006), quantifying mRNA (Raj, A. 2008) and protein (Gordon, A. 2006) levels, profiling intracellular signaling (Krutzik, P.O. & Nolan, G.P. 2006; Loo, L.-H. 2007) and DNA insertion-site analysis (Schmidt, M. 2008) in single cells.

The number of original research articles published in Nature journals exploded in 2013. These numbers may not be complete.

The publication of M. Azim Surani’s article on mRNA-Seq whole-transcriptome analysis of a single cell (Tang, F. 2009) in 2009 helped signal the rise of sequencing-based methods for single-cell analysis. But even two years later the Reviews and Perspectives in our supplement on single-cell analysis were more focused on imaging-based than sequencing-based approaches.

It was only in 2013 that we finally saw an explosion of original research articles using or reporting single-cell sequencing methods in Nature-family journals. Numerous studies reported new biological results that relied on sequencing of whole or partial genomes or transcriptomes from single cells.

Our Method of the Year special feature has three Commentaries by researchers in the field, including some of the earliest developers and users of methods for single-cell analysis. An Editorial, News Feature and Primer describe our choice and provide helpful background information. We hope you enjoy the selection of articles in our special feature.

A star is born: the updated Human Reference Genome

The release of the 38th build of the human reference genome gets a well-deserved rock-star greeting from the scientific community.

The new GRCh38 is already a rock-star{credit}Wikimedia Commons/Flickr:Starman/K.Spencer{/credit}

Fans know it is worth the effort to camp out for tickets to a concert by a beloved rock, pop or country star. GRCh38, the newest build of the human reference genome, is that kind of star. Delayed by a few snags and also held up by the US government shutdown, the sequence has just traveled to GenBank for use by the scientific community.

Not only has Genome Reference Consortium build 38 (GRCh38) eliminated some pesky previous gaps, it will be the first human reference assembly to have sequence information for centromeres. Up until now, centromeres, which are specialized structural components of chromosomes, have been represented in the reference by gaps of 3 million base pairs. The news about centromere sequence will be of interest to cell biologists and genomics researchers alike.

“This will be a major boon to evolutionary studies of human populations and to the many groups doing mechanistic work on human centromeres and kinetochores,” says Stanford University researcher Aaron Straight, whose work focuses on cell division and chromosome segregation. “Finally, now we can stop saying ‘mind the gap’.”

The reference genome finishers are the members of the Genome Reference Consortium (GRC) at the European Bioinformatics Institute, the US National Center for Biotechnology Information, The Wellcome Trust Sanger Institute and The Genome Institute at Washington University.

Scientists may not have physically camped like concert-goers in front of the buildings where genome finishers scurry to get the sequence out the door. But the throngs have been virtually present. The GRC, which works on human, mouse and zebrafish reference genomes, is “having to field a lot of questions from folks who want to know the minute they can have the assembly,” says Deanna Church, a genomicist who was at the US National Center for Biotechnology Information at the time of this interview and has since moved to Personalis, a genetic testing and analysis company.

The din has faded from the 2001 celebration marking the end of the Human Genome Project. But the sequence was not complete then, nor is it complete now. As colleagues at Nature Methods have pointed out, the sequence originally had around 150,000 gaps.

The most recent reference genome, Genome Reference Consortium build 37 (GRCh37), has 357 gaps and is missing sequence around the centromeres. No longer.

Come here, centromere
The structure and repetitive nature of centromeric regions have made them largely inaccessible to the methods used to create the reference assembly, says Church. The concept and the methods to produce the centromere sequences for this reference build were developed by a research team at the University of California at Santa Cruz (UCSC). They constructed sequences using the Sanger technique, and the data helped the team behind GRCh38 to fill in these important gaps.

The centromere community will be happy to no longer say this.{credit}Wikimedia Commons/Clicsouris{/credit}

In a paper, the UCSC team, led by Karen Miga and Jim Kent, a member of GRC’s scientific advisory board, noted that centromeric regions are replete with near-identical tandem repeats (satellite DNA). The difficulty of assembling these regions has frequently led to their exclusion from genomic studies. In the new reference genome, the scientists used reads generated during the Venter genome assembly and created models for the centromeres, says Church.

“These models don’t exactly represent the centromere sequences in the Venter assembly, but they are a good approximation of the ‘average’ centromere in this genome,” she says. And these sequence models are not exact representations of any one centromere, either. But including these sequences in the reference assembly “will likely improve genome analysis using current methods, and allow for some further study of population variation in centromere sequences,” says Church.
