Data sharing can make a significant contribution to the scientific community, but it comes with challenges, says Caroline Weight.
Guest contributor Caroline Weight
We have all heard of it. We are all worried about it. We hear whispers of it in the corridors. We are advised to be careful what we say to ‘others’. We constantly check the literature. It matters to us. After all, it is our careers on the line.
‘Scooped’.
The process of publication is vigorous, competitive and tricky. It's not uncommon for five years to pass between writing the grant application and publishing the work. Big labs with state-of-the-art facilities stand a better chance of getting their work out there first, given the extra manpower and often more-established protocols. This race for ownership of the data makes it difficult to share information and present new findings at meetings or conferences. Even at manuscript submission, there is often a chance to exclude particular referees, because of conflicts of interest or direct competition, so that novel concepts and data stay protected until they have been made public. Not until the publication has been accepted and is in print can you heave a sigh of relief and move on to the next project. Yet sharing of data is essential to the progression of science in the modern world.
This is because building on results from peer-reviewed publications allows scientists to push the boundaries and discover ever more groundbreaking solutions to complex scientific challenges. Data sharing allows new ideas to develop and different angles of approach to be explored. For example, combined lab meetings, departmental seminars, invited speaker seminars and conferences all provide excellent opportunities to communicate your work to a larger audience within and between different fields or focuses of research. Feedback from such meetings can be invaluable, as other researchers' interpretations of your data offer new insights that advance the project, benefiting not only the presenter but also the attendees. In addition, potential queries can be addressed before manuscript submission. Following manuscript acceptance and publication, data sharing can make a significant contribution to the scientific community.
A researcher can request data, protocols or reagents after the scientific paper becomes public. This can be extremely useful as maybe you were struggling with optimizing a protocol, or needed a particular antibody, or perhaps found the exact opposite to the findings just published. Here, it would be beneficial to identify a reason why such different results occurred. In addition, meta-analyses require large data sets from multiple cohort studies. Data sharing in this respect provides useful and time-efficient methods for building and contributing to a larger study.
This can come with complications. Perhaps the published study yielded excellent potential for further investigation and grant applications that the principal investigator wants to keep secure within their lab before releasing the information to the world. Alternatively, one could argue that the time spent preparing the data for sharing outweighs the reward. Again, the temptation to hold on to what you have until it is golden creeps in, in order to protect a career that was so difficult to establish to begin with.
These concepts of database sharing have been tested within the scientific community. Tenopir et al. published a study based on a survey of over 1000 international scientists. Amongst other things, they found that lack of awareness or training on how to share or receive data was an important factor in preventing the flow of information. Accessible databases for long-term data storage could be standard practice in the future. At present, there are limited requirements for published data to be stored for public viewing and the time taken to submit original data sets is too demanding. But the practice would be a good method to prevent result fabrication whilst verifying data integrity.
Understanding how groups collect and interpret results would be informative and thought-provoking and, with the rapid evolution of electronic programming, would allow the development of new methodologies to collect and process results that can be tried and tested.
Data sharing is not as simple as it seems. The system is such that too little restriction on sharing can ultimately determine the success of an individual, a project or a lab's future. Change is inevitable, but the implementation of change may take more convincing.
Caroline Weight is a winner of the 2015 Scientific Data writing competition. She is now working as a postdoc at the Division of Infection and Immunity at UCL. Caroline enjoys travelling, playing badminton and entering science competitions in her spare time!