Reproducible proteomics

Our June issue, which went live online yesterday, includes an Analysis paper describing the results of a large-scale study that sought the root causes of irreproducibility in mass spectrometry-based proteomics. Despite the novel and valuable biological applications made possible by proteomics and the continuing impressive technological advances in mass spectrometry, the technology has been unable to completely shed its reputation for poor reproducibility.

To pinpoint the sources of irreproducibility, John Bergeron and colleagues, as part of a Human Proteome Organization (HUPO) effort, sent a test sample consisting of 20 purified proteins at equal concentrations to 27 different proteomics labs. The study designers asked these labs to identify the 20 proteins using whatever mass spectrometry instrumentation and workflows they routinely used. Initially, only 7 labs correctly reported all 20 proteins! However, when the study designers re-analyzed the data from the labs that failed the task, they found that almost all actually did have mass spectra for all 20 proteins in hand. Most of the problems therefore stemmed from the database searching approaches used to go from the raw spectra to a protein identification. Many of the labs also reported ‘false positives’ – proteins that were not actually in the test samples. However, it turned out that many of these false positives were real; they were contaminants introduced during the sample handling process.

This study reaches several interesting conclusions. First of all, and reassuringly, the authors found that the mass spectrometry technology itself is reproducible. However, because of the number of complicated steps required to go from an unknown sample to a protein identification, the success of each of the groups varied widely, demonstrating the need for careful sample handling and proper training. The authors also state that improvements in database search engines and the proteomic databases themselves are direly needed.

This work also shows the value of examining the reproducibility of new technologies and methods on a large scale, especially between labs, using carefully prepared test samples. These studies can be expensive and time-consuming, but they are highly beneficial. Broad guidelines for Analysis papers are provided in our April 2008 editorial, and authors interested in submitting such studies are encouraged to contact the editors beforehand.

This Analysis will be freely available for one month. Be sure to also check out the News and Views article by Ruedi Aebersold that accompanies the paper.

Tools for Drosophila

The June issue of Nature Methods goes live today, with several papers, coincidentally, reporting on methods for the study of fruit flies.

As discussed in our editorial, Michael Dickinson, Pietro Perona and colleagues use machine vision to track individual flies as they interact within a group (Branson et al.); this may prove useful for studying social behaviour and how it is influenced by specific genes or neural circuits. In a News and Views article, Michael Reiser proposes that, with approaches such as these, phenotyping methods may be catching up with the well-established molecular genetics toolkit available for the fly. But of course, improvements in genetic resources are still needed. Also in the June issue, two groups (those of Hugo Bellen and Pavel Tomancak) present libraries that cover the entire genomes of Drosophila melanogaster and Drosophila pseudoobscura: Venken et al. present high-coverage BAC libraries in the P[acman] system for D. melanogaster, and Ejsmont et al. present fosmid libraries for D. melanogaster and D. pseudoobscura. These resources allow modification of genes by recombineering (for instance, to make mutants or to add tags for visualization) as well as integration into precise sites in the genome, and should facilitate a wide variety of studies – including behavioural analyses – that require transgenesis in the fly.

Highlights of methods in the recent literature

Our June issue will be published a week from now on May 28, and as always it will include the popular Research Highlights section where we write short news stories about interesting methods described in the recent literature. Of course we are not able to highlight every interesting methods paper we find, so here are some others you may want to check out. Stay tuned for the June issue to see what we picked to highlight in the journal!

The automation of science

Science 324, 85 – 89 (2009)

Ligand-directed tosyl chemistry for protein labeling in vivo

Nat. Chem. Biol. 5, 341 – 343 (2009)

Developmental programming of CpG island methylation profiles in the human genome

Nat. Struct. Mol. Biol. 16, 564 – 571 (2009)

Enzyme cascades activated on topologically programmed DNA scaffolds

Nat. Nanotechnol. 4, 249 – 254 (2009)

Evidence for antisense transcription associated with microRNA target mRNAs in Arabidopsis

PLoS Genet. 5, e1000457 (2009)

A molecular barcoded yeast ORF library enables mode-of-action analysis of bioactive compounds

Nat. Biotechnol. 27, 369 – 377 (2009)

Two-dimensional IR spectroscopy and isotope labeling defines the pathway of amyloid formation with residue-specific resolution

PNAS 106, 6614 – 6619 (2009)

Localization of inner hair cell mechanotransducer channels using high-speed calcium imaging

Nat. Neurosci. 12, 553 – 558 (2009)

High resolution mapping of expression QTLs in heterogeneous stock mice in multiple tissues

Genome Res. published online 17 April 2009

Diversity-based, model-guided construction of synthetic gene networks with predicted functions

Nat. Biotechnol. 27, 465 – 471 (2009)

Single-molecule electrocatalysis by single-walled carbon nanotubes

Nano Lett. published online 14 April 2009

Serial time-encoded amplified imaging for real-time observation of fast dynamic phenomena

Nature 458, 1145 – 1149 (2009)

Fabricating genetically engineered high-power lithium ion batteries using multiple virus genes

Science 324, 1051 – 1055 (2009)

Top downloads for April ’09

We’ve been posting information on the most-downloaded papers in Nature Methods for three months now, so I think we can dispense with the introduction. Please see the earlier posts for basic information on how these rankings were determined.

As I have commented previously, the next-generation sequencing papers continue to generate the most interest and occupy the top three spots for most-downloaded papers published in the April issue. A variation on an established, but relatively unknown, scanning probe method for imaging the surface topography of cells squeaked in at number four.

Top 4 research papers published in the April issue

1. Global mapping of protein-DNA interactions by digital genomic footprinting

2. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes

3. Quantification of rare allelic variants from pooled genomic DNA

4. Nanoscale live-cell imaging using hopping probe ion conductance microscopy

[Update on 5/13/09: A question came up regarding the popularity of Correspondences. If these are included in the above list, the Correspondence describing the UCSC Cancer Genomics Browser would be ranked #1.]

There has been very little movement in the list of most popular papers published in months prior to the issue month being analyzed. The top seven positions just moved around a bit. Only the last three positions see the appearance of new arrivals. The Lifeact paper makes an appearance due to a Correspondence published in the May issue of Nature Methods and the proteomics paper has retained its popularity after being published in the March issue. A surprise appearance is made by a paper we published in August 2006. This is the STORM super-resolution imaging paper that scooped the competing PALM paper in Science by one day. Looking at data from past months shows that this paper has been hanging out just under the number ten spot but has been seeing a slow but steady increase in the number of downloads. It is tempting to speculate about the cause of this but I’ll avoid doing so.

Top 10 research papers published prior to the April issue

1. Mapping and quantifying mammalian transcriptomes by RNA-Seq

2. Stem cell transcriptome profiling via massive-scale mRNA sequencing

3. Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data

4. Stable knockdown of microRNA in vivo by lentiviral vectors

5. Photoactivatable mCherry for high-resolution two-color fluorescence microscopy

6. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing

7. miRNA in situ hybridization in formaldehyde and EDC-fixed tissues

8. Lifeact: a versatile marker to visualize F-actin

9. Quantitative interaction proteomics using mass spectrometry

10. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM)

Author responsibilities

For those of you who may have missed it, on April 30th Nature and the Nature research journals — including Nature Methods — announced a change in policy regarding the duties of lead authors. The changes are explained in a Nature editorial and have been implemented in our Guide to Authors. A detailed explanation of the Nature journals’ new authorship policy can be found here.

These changes are meant to clarify the responsibilities of our authors to help ensure that the results and conclusions in Nature journal papers accurately reflect the original data, that this data is preserved, and that appropriate actions are taken to ensure the availability of materials — including algorithms — needed to replicate the work.

We will also be requiring that authors provide statements of authors’ contributions in Nature Methods papers submitted after April 30. Since Nature Methods began asking for this in early 2007, the proportion of authors providing this statement has already increased from 50% in May 2007 to 80% in May 2009, so we hope that making this a requirement won’t be problematic.

We realize that these changes could be seen as an added burden on authors, but we don’t believe the expectations exceed what most readers expect of work published in Nature Methods or the other Nature journals, and we hope our authors agree.