
A genomic database maintained by the US government may soon log its final entry – and many scientists can’t wait to see it go.
In a letter to staff members anonymously posted to the Phylogenomics blog, David Lipman, head of the National Center for Biotechnology Information (NCBI), announced plans to shutter a database devoted to storing raw DNA sequencing data, a proteomics repository and a software package used to make sense of genetic markers commonly used in forensics. The changes are necessary, he writes, because of expected cuts to NCBI’s budget. “It is regrettable that we have had to take this drastic action, but it has become unavoidable”.
However, some bioinformaticians are applauding the potential closure of the balky Sequence Read Archive (SRA), which is tasked with storing the raw data now pouring out of DNA sequencers by the terabyte. Since the 1990s, genome sequencing centres have been expected to quickly upload the sequencing data they generate for others to peruse. But next-generation DNA sequencing machines, which spit out loads of short sequences, or reads, have been harder for a central database to handle.
In comments on a recent Phylogenomics blog post about the database, scientists complained of difficulties uploading data to and downloading data from the archive. Genomic data from microbes living in the human body and in the environment were a particularly poor fit for the database. “SRA is in my opinion the single biggest obstacle to progress in microbial ecology and microbiome research today,” writes Rob Knight, of the University of Colorado in Boulder.
NCBI will continue funding SRA for four, or possibly eight, months, but after that it’s unclear how raw sequencing data will be collected. On the Pathogens: Genes and Genomes blog, Nick Loman, a bioinformatician at the University of Birmingham, UK, supports a more decentralized database, where each sequencing centre is in charge of making its own data available and useful to others. “[I]t seems likely that no single repository is likely to be able to handle the sequencing output of the entire world for much longer,” he writes.
We’ve put out a call to NCBI to confirm the letter and comment on the news, and we will update this blog when we hear back.
Image courtesy of The Broad Institute