The Method of the Year for 2013 is… single-cell sequencing

Single-cell sequencing edged out other contenders as our choice of Method of the Year in 2013. These techniques truly came into their own in 2013 and are fast providing new insights into the workings of single cells that ensemble methods cannot deliver.

Back in 2008 we chose next-generation sequencing as our Method of the Year, not only because the new techniques would improve performance in conventional sequencing applications, but also because they opened up whole new applications that were unthinkable with traditional Sanger sequencing. Our choice of Method of the Year in 2013 bears this out, as none of these single-cell sequencing applications would be possible without next-generation sequencing. Indeed, in some applications the sequencing is used almost exclusively to identify and count tagged molecules.

Our choice likely comes as a surprise to those who were certain that we would pick CRISPR/Cas9 technology for targeted genome modification. This is certainly an exciting technology, not only for genome engineering but also for epigenome editing, as described in a Method to Watch. But genome editing with engineered nucleases was our pick for the 2011 Method of the Year, and although CRISPR/Cas9 provides a huge practical improvement by largely dispensing with the need to engineer the nuclease, relying instead on a programmable guide RNA, the advance over 2011 is mostly one of ease of use.

Methods to investigate biology at the level of single cells have been of keen interest to Nature Methods since the journal started. Our first research article, from Robert Singer, described a paraffin-embedded tissue FISH (peT-FISH) method to simultaneously detect expression of several genes in situ in single cells while maintaining tissue morphology (Capodieci, P. 2005). This was followed by many other imaging-based methods for such tasks as measuring cell growth (Groisman, A. 2006), quantifying mRNA (Raj, A. 2008) and protein (Gordon, A. 2006) levels, profiling intracellular signaling (Krutzik, P.O. & Nolan, G.P. 2006; Loo, L.-H. 2007) and analyzing DNA insertion sites (Schmidt, M. 2008) in single cells.

The number of original research articles published in Nature journals exploded in 2013. These numbers may not be complete.

The publication of M. Azim Surani’s article on mRNA-Seq whole-transcriptome analysis of a single cell (Tang, F. 2009) in 2009 helped signal the rise of sequencing-based methods for single-cell analysis. But even two years later, the Reviews and Perspectives in our supplement on single-cell analysis were more focused on imaging-based than on sequencing-based approaches to single-cell analysis.

It was only in 2013 that we finally saw an explosion of original research articles using or reporting single-cell sequencing methods in Nature-family journals. Numerous studies reported new biological results that relied on sequencing of whole or partial genomes or transcriptomes from single cells.

Our Method of the Year special feature has three Commentaries by researchers in the field, including some of the earliest developers and users of methods for single-cell analysis. An Editorial, News Feature and Primer describe our choice and provide helpful background information. We hope you enjoy the selection of articles in our special feature.

Stephen Quake responds to Lior Pachter

Stephen Quake responds to a blog post by Lior Pachter that reanalyzes data from Quake's recent comparison of single-cell RNA sequencing methods published in Nature Methods.

In October, we published an Analysis by Quake and colleagues that evaluated a number of single-cell RNA-seq approaches on the basis of their sensitivity, accuracy and reproducibility. In a subsequent blog post, Pachter challenged their data reporting. At issue is whether the failure rate among 96 samples sequenced using the Fluidigm C1 microfluidic instrument should have been presented differently.

We encourage animated discussion of published research and hope that this can serve as a useful forum. In this guest post, Quake responds to Pachter’s blog entry. The views expressed below are solely his and do not necessarily represent those of Nature Methods.

In a recent blog post, Lior Pachter appears to question my scientific integrity and to suggest that I unfairly manipulated data in a recent publication on single cell RNAseq.

Pachter has not contacted me directly with his questions, nor did he give any warning before publishing his blog post. While I am happy that he is carefully scrutinizing publications and independently re-analyzing primary data, his rather sensationalistic approach to reporting his results in the absence of discussion or peer review risks doing a disservice to science and adds more heat than light.

Pachter tries to have it both ways – based on our published data he accuses me of 1) wasting effort by sequencing lower-quality samples and 2) selectively publishing data from only the better samples. It is hard to see how these accusations can both be true. As described in the methods section of our paper, the C1 capture rate is not perfectly efficient, and we therefore manually inspected all the chambers. We found that 93 chambers had single cells, 1 chamber had two cells, and 2 chambers had no cells. Of the 93 chambers with single cells, 91 of the cells appeared to be alive as measured by a live/dead stain and 2 did not. Our single cell RNAseq experiments included all 91 of the “live” single cells and 1 of the “dead” single cells; the data from the latter were indistinguishable from the former and were thus included in all further analyses. There was absolutely no selection or manipulation of the data. All of the raw data, as well as our R scripts, were made available for Pachter and others to download and analyze upon publication of our paper.

The sequencing library prep and workflow that we use are geared around 96 parallel samples, and we decided it would be valuable to process control samples in exactly the same batch as the single cell samples. We therefore included four control samples with the single cells: amplification products from a chamber on the chip that did not have a cell (C09, which was unfortunately not given a distinguishing filename during the file upload), a single cell tube amplification, a no-template control (NTC, C70) tube experiment that did not have a single cell, and a bulk control sample. Pachter correctly points out that C70 is dominated by the ERCC spike-in controls and has essentially no human transcripts, as expected; similarly, the other negative control, C09, performs very poorly next to the actual single cell data. It is not clear to me why Pachter thinks I should be embarrassed for performing negative control experiments; indeed, biochemical amplifiers are known to be so sensitive that there are many stories of contamination occurring through aerosol dispersal from nearby benches, etc. In our own analyses, C09 and the other controls were excluded from the single cell data.

Pachter also noticed that ~3 of the single cell RNAseq experiments have significantly lower quality than the other 89, as measured by the fraction of spike-in reads sequenced or by the log-correlation coefficient. Taken at face value, this corresponds to a failure rate of 3/92, or about 3%. The experiments therefore had a 97% success rate by this metric, and it is hard to see where his complaint lies. We conservatively included ALL of the single cell data in our analyses; thus, if one follows Pachter’s prescription to analyze only the experiments that he deems “successful”, the results will be even better than we reported.

Finally, Pachter makes a misleading argument concerning the statistical methods used to generate Figure 4a. This figure is concerned with the question of whether an ensemble of single cell RNAseq experiments produces gene expression values similar to those of a bulk experiment. The reason for sub-sampling to equal depth is the worry of introducing artifacts by comparing two RNAseq experiments of dramatically different sequencing depth (see, e.g., Cai, Guoshuai, et al. “Accuracy of RNA-Seq and its dependence on sequencing depth.” BMC Bioinformatics 13, Suppl 13 (2012), and Tarazona, Sonia, et al. “Differential expression in RNAseq: a matter of depth.” Genome Research 21, 2213–2223 (2011)). This figure has little to do with estimating the quality of the individual RNAseq experiments.
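The sub-sampling idea above – drawing an equal number of reads from each library before comparing expression values – can be illustrated with a minimal sketch. This is not the paper's actual pipeline; the count vectors, depths, and function names below are all hypothetical, and the draw-without-replacement approach is just one common way to equalize depth:

```python
import numpy as np

rng = np.random.default_rng(0)

def subsample_counts(counts, target_depth, rng):
    """Draw `target_depth` reads without replacement from a vector of
    per-gene read counts, preserving relative abundances."""
    # Expand counts into one entry per read, labeled by gene index.
    reads = np.repeat(np.arange(len(counts)), counts)
    keep = rng.choice(reads, size=target_depth, replace=False)
    return np.bincount(keep, minlength=len(counts))

# Two hypothetical libraries sequenced to very different depths.
deep = rng.poisson(lam=50, size=1000)     # ~50,000 total reads
shallow = rng.poisson(lam=5, size=1000)   # ~5,000 total reads

# Equalize depth to that of the shallower library before comparison.
depth = min(deep.sum(), shallow.sum())
deep_sub = subsample_counts(deep, depth, rng)
# Both vectors now have equal total depth, so a comparison (e.g., a
# log-scale correlation) is not confounded by sequencing depth alone.
```

The draw-without-replacement step is what keeps the comparison fair: each retained read is an unbiased sample of the original library, so differences that remain reflect the libraries themselves rather than how deeply they were sequenced.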