Scientific Data’s call for submissions is fast approaching, and accepted manuscripts will be featured at our platform’s launch in Spring 2014. We hope you’ll want to take part!
If you’re interested in publishing your datasets, and getting credit for them, all the information you need to draft and submit a manuscript is now available on our website (see Box 1, “Getting ready to submit”).
But if you’re still deciding whether Scientific Data is right for you, let’s begin with an explanation of why we need quality data publications, and the role that Scientific Data’s content will play in promoting data reuse.
Box 1. Getting ready to submit
Simply follow these three steps:
- Deposit your data to an appropriate repository.
- Draft the Data Descriptor manuscript. See our manuscript templates and sample Data Descriptors.
- Draft some tables describing your samples, assays, and data outputs. See our guidelines and templates.
Have you come across interesting datasets that you couldn’t reuse, or where it was difficult even to judge the data’s technical quality?
Incomplete and fragmented descriptions are often the culprit: information about the experimental context and data-processing steps may be limited or inaccurate. Even in the best scenario, when datasets are in public databases and supported by published journal articles, it can be challenging to identify and bring together all the information needed to reuse the data. The level of annotation provided for each dataset record can also vary from database to database. Moreover, research articles associated with a given dataset focus primarily on interpreting the data in light of a specific hypothesis, and cross-referencing between related database records and journal articles is still performed in an ad hoc fashion. As a result, the true reuse potential of even publicly deposited datasets may be limited.
How does Scientific Data help you maximize discovery, citations and reusability of your data?
Our new content type, the Data Descriptor, is designed to provide detailed descriptions of life sciences, biomedical and environmental datasets. It focuses exclusively on how the datasets were produced, who produced them, and how they could be reused by independent investigators. Data Descriptors combine traditional narrative content (the manuscript) with structured information (the experimental metadata; see Box 2, “The Data Descriptor”, for more information). Published Data Descriptors will link out to both related journal articles and data files stored at data repositories. The structured information will be harmonized by our in-house curators for consistency and to aid searches. In the end, Data Descriptors will be an important bridge in the existing scientific information ecosystem between results-based research and data outputs, and will include value-added elements that serve both human and machine consumption.
Box 2. The Data Descriptor
Each published Data Descriptor will contain:
- Narrative content similar to that in a primary research article, including text sections, figures and tables, designed to enable others to interpret and reuse data, and all formatted for easy readability.
- A structured metadata component, providing specifics on samples or subjects and the methodological steps involved in generating and assaying them, along with links to the publicly archived raw or processed data, all provided in machine-readable formats that suit data users.
How can you create a Data Descriptor for submission to Scientific Data?
We’ve created two templates that will help you draft an initial submission that meets our editorial requirements.
The first template helps you create the Data Descriptor manuscript, the narrative component that maintains a more traditional article layout, with familiar text sections—Title, Abstract, Background & Summary, Methods—supported by figures and tables. The Methods section has no length limit, allowing researchers to provide enough detail for others to reproduce the experiments and data-processing steps. We are also introducing new sections—entitled “Technical Validation”, “Data Records” and “Usage Notes”—that are designed to provide important information supporting data quality and reusability, information that is often not included in traditional articles.
As part of the Data Descriptor manuscript, we will ask authors to provide detailed information accounting for all samples employed in the study, the assays applied to the samples, and the resulting data outputs. To help you compile this information, we’ve provided a second template in the form of a spreadsheet. The format of this template is designed to make it easy to incorporate this information into the Data Descriptor experimental metadata, the structured component. The structured component will contain information on the samples or subjects used in the study, the methodological steps involved in deriving the samples from their source, the assays performed, and links to the corresponding data files in public databases.
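As a rough illustration of the kind of information this spreadsheet captures, the Python sketch below links each sample to an assay and a resulting data file, writes the rows as tab-delimited text, and flags empty cells. The column names, sample values and accession strings are hypothetical placeholders chosen for this example; they are not the fields of the actual Scientific Data template.

```python
import csv
import io

# Hypothetical column names; the real template defines its own fields.
COLUMNS = ["sample_name", "organism", "assay", "data_file", "repository_accession"]

# Hypothetical rows: one row per sample / assay / data-output combination.
ROWS = [
    {
        "sample_name": "sample_1",
        "organism": "Mus musculus",
        "assay": "RNA sequencing",
        "data_file": "sample_1.fastq.gz",
        "repository_accession": "EXAMPLE-0001",
    },
    {
        "sample_name": "sample_2",
        "organism": "Mus musculus",
        "assay": "RNA sequencing",
        "data_file": "sample_2.fastq.gz",
        "repository_accession": "EXAMPLE-0002",
    },
]


def to_tsv(rows, columns=COLUMNS):
    """Serialize the rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def missing_values(rows, columns=COLUMNS):
    """Return (row_index, column) pairs where a value is absent or blank."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column in columns
        if not row.get(column, "").strip()
    ]


if __name__ == "__main__":
    print(to_tsv(ROWS))
    print("Missing values:", missing_values(ROWS))
```

Keeping one row per sample, assay and data file makes it straightforward to check that every sample is accounted for before the table is handed over for curation.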
More information on submitting a manuscript to Scientific Data can be found in our guide to authors and our submission guidelines. We also encourage prospective authors to look at the two sample Data Descriptors recently released on the Scientific Data website, and the related blog post.
What is Scientific Data’s value-added process for accepted Data Descriptors?
For initial submissions, authors are asked to create a Data Descriptor manuscript and upload it via our online submission system, along with related supplementary files. Peer review will then be managed by our growing Editorial Board, according to our editorial policies. When a Data Descriptor is accepted for publication, a curator will help authors create the experimental metadata component of the Data Descriptor. This information is curated for consistency; for example, free-text values are replaced with ontology terms, and additional experimental information is extracted from the narrative template and used to enrich the structured component. This harmonized and enriched structured component of the Data Descriptor will power searches and aid navigation across records in Scientific Data and beyond, via the Nature.com platform. Data Descriptors’ experimental metadata will initially be available for download in the ISA-Tab format, and will be available in other formats, such as Linked Data, in the future. Our curation process will scale up progressively as we continue to engage with community standards and work collaboratively to meet the minimal information requirements in various disciplines.
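To give a sense of what replacing free-text values with ontology terms can look like, here is a minimal Python sketch. It is not a description of our curators' actual tooling: the lookup table and the `harmonize` function are assumptions made for illustration, and a real workflow would draw on a full ontology rather than a hand-written dictionary. The two NCBI Taxonomy identifiers shown are the standard ones for human and mouse.

```python
# Illustrative lookup table (an assumption for this sketch); a real
# curation workflow would consult a complete ontology instead.
ORGANISM_TERMS = {
    "human": ("Homo sapiens", "NCBITaxon:9606"),
    "homo sapiens": ("Homo sapiens", "NCBITaxon:9606"),
    "mouse": ("Mus musculus", "NCBITaxon:10090"),
    "mus musculus": ("Mus musculus", "NCBITaxon:10090"),
}


def harmonize(value, terms=ORGANISM_TERMS):
    """Map a free-text value to a preferred label and ontology accession.

    Values that are not in the lookup table are returned unchanged with
    no accession, so they can be flagged for manual review.
    """
    # Normalize case and collapse runs of whitespace before the lookup.
    key = " ".join(value.lower().split())
    if key in terms:
        label, accession = terms[key]
        return {"label": label, "accession": accession}
    return {"label": value, "accession": None}


if __name__ == "__main__":
    for raw in ["  Mouse ", "HOMO  sapiens", "zebrafish"]:
        print(repr(raw), "->", harmonize(raw))
```

In this sketch, "  Mouse " and "mus musculus" both resolve to the same label and accession, which is what makes harmonized records searchable as a group, while an unrecognized value such as "zebrafish" comes back without an accession.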
Are you an advanced user or a data service provider interested in supplying metadata directly to Scientific Data?
The Data Descriptor experimental metadata component can also be submitted directly as an ISA-Tab file. Providing detailed metadata directly will shorten the time to publication and allow you to include more detailed experimental information in the metadata record. We are eager to work with service providers and data repositories that want to support researchers by providing ISA-Tab formatted metadata export.
If you are an existing ISA user, or are interested in developing an ISA-Tab export from your data service, we are in the process of releasing a detailed ISA configuration file that meets Scientific Data’s requirements, so watch this space!
We look forward to receiving your Data Descriptor submissions—please contact us at scientificdata@nature.com with any questions.