Sharing data: Why it should be done

As data continues to be produced at staggering rates, scientists need to become more aware of the benefits of data sharing, says Eleni Liapi.

Guest contributor Eleni Liapi



The scientific community is currently experiencing an explosion in data generation. At CERN (the European Organization for Nuclear Research), the Large Hadron Collider (LHC) produces data at a rate of 1 petabyte (10^15 bytes) per day, which is comparable to 210,000 DVDs. At the European Bioinformatics Institute, 20 petabytes of biological data were stored between 2004 and 2012. In the US alone, the volume of data produced by the healthcare industry in 2011 was estimated at 150 exabytes (1 exabyte = 10^18 bytes). This volume of information undoubtedly brings several problems with it, including how to store and share the data.

Access to data is a topic that provokes much discussion among scientists and other communities, for many reasons. These include concerns about inappropriate use and about restrictive institutional or industrial policies, for example where the gigabytes of genomic data obtained are to be used for pharmaceutical research. There have already been attempts to estimate the extent of the problem. In one survey, 67% of the participants expressed the view that inaccessible data hinder scientific progress.