Alice McHardy, Alex Sczyrba and Thomas Rattei announce a new initiative for assessing metagenomics methods in this guest post.
In just over a decade, metagenomics has developed into a powerful and productive method in microbiology and microbial ecology. The ability to retrieve and organize bits and pieces of genomic DNA from any natural context has opened a window into the vast universe of uncultivated microbes. Tremendous progress has been made in computational approaches to interpreting this sequence data, but none can completely recover the complex information encoded in metagenomes.
A number of challenges stand in the way. Simplifying assumptions are needed and lead to strong limitations and potential inaccuracies in practice. Critically, methodological improvements are difficult to gauge due to the lack of a general standard for comparison. Developers face a substantial burden to individually evaluate existing approaches, which consumes time and computational resources, and may introduce unintended biases.
The Critical Assessment of Metagenome Interpretation (CAMI) is a new community-led initiative designed to help tackle these problems by aiming for an independent, comprehensive and bias-free evaluation of methods. We are making extensive, high-quality, unpublished metagenomic data sets available for developers to test their short-read assembly, binning and taxonomic classification methods. The results of CAMI will provide exhaustive quantitative metrics on tool performance to serve as a guide to users under different scenarios, and to help developers identify promising directions for future work.
As a community effort, we encourage feedback from both method developers and users of metagenome analysis tools. The CAMI initiative was one of the four discussion threads of the Metagenome Meeting at the Newton Institute in Cambridge this year. Another open discussion with both developers and users of computational metagenome methods will take place at a roundtable at the ISME conference in Seoul in August.
We urge developers to participate by registering for the competition on our website and joining our Google+ group to provide feedback on the current design phase. The competition is tentatively scheduled to open at the end of 2014. Key data sets are being generated, and CAMI is currently seeking additional data contributors to provide genomes of deep-branching lineages for data set generation. The results will be presented and discussed in a workshop a few months after the competition. We aim for a joint publication of the generated insights together with all CAMI contest participants and data contributors.
We encourage everyone to get involved and spread the word!