Should every research group have a full-time data handler?
I’m persuaded after hearing about the multiple pitfalls associated with handling scientific data in modern research. “We are in the midst of a revolution in the way that data is handled and shared,” said Phil Sharp of MIT.
Sharp was speaking at a ‘late-breaking’ session put together in response to the dissemination of e-mails from the UK’s Climatic Research Unit late last year. Climate skeptics have seized upon the content to question manmade climate change and to challenge the way the scientists manipulated and presented their data.
So what is ‘responsible’ data handling? Well, the research community is only just working that out. At the moment, it’s often not clear who owns data (the PI, the university, the journal, no-one?), or who is responsible for storing, sharing, securing and maintaining it. And there’s a lot of it: an ‘exaflood’ to use one fitting term.
Sharp was called in because he co-chaired a National Academy of Sciences committee that examined some of the issues around research data, and digested them in a report last year: Ensuring the Integrity, Accessibility and Stewardship of Research Data in the Digital Age.
One point that struck me is that data handling needs to be considered at the planning stages of an experiment, rather than scrambled together once results are rolling in. But given how complex and time-consuming it is to meet all the required standards (and to work out the standards where there are none), that planning can add up to a big burden.
That’s why the suggestion to designate or even hire a full-time professional makes so much sense: a data technician, just as many labs already employ a lab technician. That costs $$, of course. But perhaps in some cases a data handler could be shared across labs, or institutions could set up central data-handling facilities, much like the shared sequencing or microscopy facilities they already have.
“There’s not a research institution in the country that isn’t challenged by how to store data,” Sharp said.
For a previous blog post on the data report see here.