Bioinformaticians today published a mammoth evaluation of genome assemblers — computer programs that aim to piece together short DNA sequence reads into complete genomes.
Their work, described in the journal GigaScience, was conducted for the second Assemblathon, a contest designed to compare and evaluate competing genome assemblers. In the current round of the contest, which started in July 2011, 21 teams submitted 43 attempts to assemble three genomes from scratch: those of a bird (the budgerigar), a fish (the Lake Malawi cichlid) and a snake (the boa constrictor).
One notable finding from the contest was that different assemblers — and the same assemblers in the hands of different teams — did not give consistent results. That echoes the results of Assemblathon 1, which wrapped up in 2011. But the problem itself may be more significant now than it was then, owing to the democratization of genomics, with many more labs now using many more methods to assemble many more genomes from scratch.
Perhaps because of this, Assemblathon 2 has sparked a bit of soul-searching among bioinformaticians, who have debated its results and their significance since a preprint of the paper was posted on arXiv in January.
Bioinformatician C. Titus Brown of Michigan State University in East Lansing, who reviewed the paper and published his review, wrote on his blog in February: “the biggest outcome of the Assemblathon 2 paper can be stated quite simply: we’re doing it all wrong, in bioinformatics…as a field, we have pretended that genome assembly is a reliable exercise and that the results can be trusted; the Assemblathon 2 paper shows that that’s wrong.”
Keith Bradnam of the University of California, Davis, the paper’s first author, doesn’t fundamentally disagree with that take: “I agree that the science community should be better at explaining that genomes and genome assemblies are the results of individual experiments that are rarely ever replicated. Trust them at your peril,” he commented on Brown’s post.
This isn’t an ideal situation for the average scientist who just wants to know which is the best tool to use for a specific project. On the blog Haldane’s Sieve, Bradnam compares the process of selecting an assembly method to that of choosing the best pizzeria in Davis.
“[T]he notion of a ‘best’ pizza is highly subjective and the best pizza for one person is almost certainly not going to be the best pizza for someone else,” Bradnam writes.
“Just as it might be hard to find somewhere that sells an inexpensive gluten-free, vegan pizza that’s made with fresh ingredients, has lots of toppings and can be quickly delivered to you at 4:00 am, it may be equally hard to find a genome assembler that ticks all of the boxes that you are interested in.”
Follow Erika on Twitter @Erika_Check.