Metagenomics sprang from advances in sequencing technology, and continued improvements are providing data in quantities unimaginable a few years ago. But without concerted efforts, the amount of data will quickly outpace scientists' ability to analyse it. The September Editorial of Nature Methods (6, 623; 2009), 'Metagenomics versus Moore's law', draws attention to articles in the same issue of the journal that illustrate some of the dangers and problems, as well as the solutions being sought.
Three years ago, the Editorial continues, the first two metagenomes sequenced with second-generation technology were reported, at less than 40 megabases each. Now there are more than 4,000 sequenced metagenomes, which would take years or tens of years to analyse (depending on the processing power used). Major initiatives are needed to avoid metagenome-analysis gridlock: according to the Editorial, funding agencies need to increase support for data analysis, and the community needs to improve data sharing through standards and centralized coordination, and by aggregating computationally intensive operations. The conclusion:
“This summer, after discussions at the Intelligent Systems for Molecular Biology conference, community members formed the M5 (metagenomics, metadata, metaanalysis, multiscale-models and metainfrastructure) Consortium under the roof of the Genomics Standards Consortium to devise a solution to the coming gridlock. Their proposed ‘M5 Platform’—to be announced later this year—deserves the support of the community, funding agencies and those who hold the keys to the high-performance computing centers. Unless major efforts are taken immediately, researchers will find they have a wealth of data but no way to interpret it.”
Readers’ comments and discussion of this Editorial are welcomed at Methagora, the Nature Methods blog.