Data sharing: Why it’s all ‘mine’

Data sharing makes scientific sense, but the career-conscious nature of scientists may stand in the way.

Guest contributor Rachel Yoho

As with many aspects of society, human nature shapes interactions in scientific research. When we consider “data sharing,” the likely response is a shrug. We’ve all been there: group work and competition at their finest. The increasingly competitive environment for grant funding and the ‘publish or perish’ attitude promote a “mine, mine, mine” mentality among scientists. To understand how career-protecting objections to data sharing might be overcome, however, we can look at several trends.

Data ownership
Owing to many factors, including budget cuts, sequestration and economic downturns, the current scarcity of grant funding creates financial stress in labs. “Big grants” like the NIH R01 had lower success rates for new grants in 2014 as compared to the last four of five years. In turn, the PI and lab come to feel a possessive ownership of the data, even beyond that of the funding agency or institution. Put simply: it’s our grant money, so it’s our data. By working for and finally achieving a grant, often after many attempts, a sense of accomplishment and pride in ownership develops.

Data sharing: Why it doesn’t happen

The advent of big data has caused scientists to rethink data sharing, but several problems are preventing it from happening, says Nina Divorty.

Guest contributor Nina Divorty

“If I have seen further, it is by standing on the shoulders of giants.” – Isaac Newton.

This classic quote sums up the nature of scientific collaboration: only by building on the work of our predecessors can we make scientific advancements, and only by sharing our own discoveries can they be built upon by others. Most researchers understand this, but only since the recent surge in technologies that generate very large datasets have we begun to recognise the value of sharing raw data, in addition to publishing results in their processed and polished form. The advantages are clear: raw data offers complete transparency so that other scientists can compare their own results and analyses when attempting to replicate findings, and also allows others to ask novel questions of existing datasets. The majority of researchers across a variety of scientific disciplines report that lack of access to data detracts from the progress of research in their field, yet 64% admit to not making their own data easily accessible. So what’s stopping them?

Digital lab notebook



The conventional paper lab notebook is dead – or at least it’s on life support. With the advent of open electronic notebooks, data and methods are no longer cloistered in books or tucked away on private hard drives. But this gives the user some tradeoffs to consider. Read more about it on Nature Careers.

To share or not to share

Many in the mass spectrometry community agree that MS data should be made publicly available for everybody’s benefit: all data, including the raw files generated by the mass spectrometers.

In the May editorial we support this request and introduce a new raw data repository run by the EBI that offers to replace the declining TRANCHE, until very recently the only repository for such data.

Several good arguments can be made for making raw data available. One of them is that it allows the re-analysis of published data to validate claims. Consider the controversy that arose in the wake of the analysis of fossilized Tyrannosaurus rex bones by Asara and colleagues, which led them to suggest that T. rex is more closely related to birds than to reptiles (Asara et al., Science 2007). Their findings were finally corroborated in 2009 (Bern et al., J. Proteome Res.) but could have been examined much more quickly if access to raw data had been given at the time of publication.

Re-analysis aside, raw data present a treasure trove of information that can be examined from different angles and, over time, with new tools that bring aspects to light that the original experimenters did not think of.  To create such new analysis tools, software developers rely on raw data to benchmark against established techniques.

Having access to raw files does not mean that they are easy to use. We realize that the diversity in file formats and the difficulty in converting one file type to another make their analysis less straightforward than it could be with a single community-supported format. We also realize that these files are large and that uploading them to the new EBI repository, or any other, will take time and some effort, particularly if important metadata about the experiment are included.

Still, we think the effort is worth it to ensure the field can move forward.  We’d love to hear your views, particularly if you disagree.