This week, Nature and the Nature research journals made some important updates to their data availability policies: updates that strengthen the editorial links between Nature journals and Scientific Data; updates that provide better resources and support for authors who wish to make their research more reproducible; and updates that leverage the work of Scientific Data to curate datasets and identify suitable data repositories for more authors. See the related editorial published at Nature.
In particular, these journals are further strengthening their support for public data repositories. Like Scientific Data, the Nature journals believe large datasets should ideally be archived in community-supported, specialized data repositories. To help their authors, Nature journals now refer them to the list of recommended data repositories maintained by Scientific Data.
The Nature journals are also rolling out a new policy whereby they encourage authors of accepted articles to submit a Data Descriptor to Scientific Data when the Editors feel a Data Descriptor would be beneficial. This provides an incentive for authors who are willing to go beyond Nature and the Nature research journals’ data sharing policies, and maximizes the reuse value of data associated with manuscripts at the Nature-titled journals.
Specifically, their updated data availability policy now states:
“Nature journals encourage authors to consider the publication of a Data Descriptor in Scientific Data to increase transparency and enhance the re-use value of data sets used in their papers. Data Descriptors are designed to be complementary to a primary paper and can be published prior to, simultaneously, or after publication of the primary paper. Nature journals will not consider prior Data Descriptor publications to compromise the novelty of new manuscript submissions, as long as those manuscripts go substantially beyond a descriptive analysis of the data and report important new scientific findings appropriate for the journal. (This policy does not necessarily extend to journal articles whose primary purpose is to describe a new data set or resource.)”
There are a number of reasons why an additional data-focused publication at Scientific Data can be valuable to the scientific community:
- More raw data: Authors have the opportunity to present unprocessed data, which is often not fully presented with original research papers, allowing for deeper reanalysis of important findings by others.
- New data: Authors may share and describe additional datasets that they created but which did not fit or were only tangentially related to the content in their paper at a Nature-titled journal.
- Updated data: Many important datasets grow and evolve with time. A subsequent publication at Scientific Data can be used to provide an update on important datasets.
- More data into public repositories: Scientific Data helps authors get their data, which may already be in the Supplementary Information section of the original paper, into an appropriate data repository, where it will receive a persistent identifier, and can be properly curated and structured to maximize reuse value.
- Data-focused peer review: Our peer reviewers evaluate datasets based on their suitability for wider reuse, not just whether they support specific research findings. This constructive evaluation process can significantly improve how a dataset is presented and structured, ensuring that it is genuinely “safe” for wider use. Data peer review is also one of the features of Scientific Data that researchers have told us they value most.
- Curated metadata: Our in-house curator helps authors create machine-readable metadata files supporting each manuscript, helping power-users search and mine our published datasets.
- Additional methodological detail: Authors can expand on their methods descriptions, providing detailed explanations of exactly how the key data were created.
- Credit: Authors who are willing to invest in making their data maximally open and reusable gain recognition through an additional peer-reviewed publication. The paper at Scientific Data can have additional authors, or different author ordering, compared to the original research work, to appropriately credit those who put the most effort into sharing the data.
- Tutorials: Authors can include helpful tutorials for users of the data, making a Data Descriptor at Scientific Data akin to a ‘user-guide’ for important datasets.
- Facilitating reproducible research: Beyond their benefits to those who generate and reuse research data, Data Descriptors are part of a broader movement to carry out and publish better science.
For these reasons and more, we feel that data publications are good for the scientific community and help reward those scientists who are truly committed to sharing the outputs of their research in a transparent and reproducible manner.
Below is a selection of the Data Descriptors we have already published in association with publications at Nature and the Nature research journals:
Systematic global assessment of reef fish communities by the Reef Life Survey program
Graham J Edgar & Rick D Stuart-Smith
27 May 2014, doi:10.1038/sdata.2014.7
Genomes and phenomes of a population of outbred rats and its progenitors
Amelie Baud, Victor Guryev, Oliver Hummel, Martina Johannesson, The Rat Genome Sequencing and Mapping Consortium & Jonathan Flint
24 June 2014, doi:10.1038/sdata.2014.11
A Southern Indian Ocean database of hydrographic profiles obtained with instrumented elephant seals
Fabien Roquet, Guy Williams, Mark A. Hindell, Rob Harcourt, Clive McMahon, Christophe Guinet, Jean-Benoit Charrassin, Gilles Reverdin, Lars Boehme, Phil Lovell & Mike Fedak
02 Sept 2014, doi:10.1038/sdata.2014.28
Scrutinizing the datasets obtained from nanoscale features of spider silk fibres
Luciano P Silva & Elibio L Rech
14 Oct 2014, doi:10.1038/sdata.2014.40