I couldn’t leave this be. I’ll be writing more about it soon, but these numbers are just staggering. David Altshuler gave a status report on the 1000 Genomes Project, which aims to plumb the depths of human variation (I’m still waiting for the 1KG handle to take off). As it nears completion of its three pilot projects, it has generated 3.8 trillion bases of genome sequence. Although the project hasn’t yet sequenced 1,000 genomes, that is technically 1,000 human genomes’ worth of sequence data. Altshuler said that if you take the amount of data that was in GenBank at the start of the project, the project added roughly that much again in each week of September and October. And they’re just a tenth of the way there! It’s useful to remember, Altshuler said, that the Large Hadron Collider, which is similarly expected to heave terabytes of data at researchers, had copious planning go into its data handling and analysis. 1KG will need to catch up quickly, hence two recent requests from the NIH for people with a plan and a heck of a lot of computational power. See here and here.