At Scientific Data we have been considering how we might develop our scope and editorial policies to better accommodate clinical research data. Publication of clinical research presents a number of challenges, which we will not be the first to have attempted to solve. In particular, we might need to support linking of our primary article type, the data descriptor, to non-public datasets – datasets that cannot be open access due to patient privacy or other legitimate constraints. While we advocate setting the default for research data to open, we are also conscious that full anonymisation of clinical data is often impossible to achieve with certainty.
We want to involve key stakeholders in clinical research data sharing and reuse – from industry, academia, publishing and software providers – to develop policies and content formats that will mutually benefit us all. With this in mind we are holding a working group meeting involving representatives from these different groups, at the Macmillan UK Headquarters in London next week.
A number of events in recent years indicate a strengthening interest in increasing clinical data sharing. Projects such as those led by Yale researchers (the YODA group), the Wellcome Trust (Public Health Research Data Forum) and contributions to the Clinical Study Data Request website show promise, along with increasing options for archiving data supporting research publications – such as figshare and Dryad. We need to connect these efforts with publication of peer-reviewed articles to provide stronger incentives to share data, along with permanence and quality assurance through peer-reviewed publications.
Some previous publisher-led initiatives (e.g. http://www.bmj.com/content/340/bmj.c181) have aimed to increase open access publication of anonymised datasets from clinical research studies in journals. However, publishing the underlying data alongside research articles remains relatively uncommon in clinical research. A pragmatic approach is likely needed, in which rich, detailed clinical datasets and clinical study reports can be described or cited in journals even though the full data and reports are not publicly available. We would need to do this in a way that satisfies the needs of editors and reviewers, who expect sufficient rigour, as well as the needs of individuals, organisations, and the law, given the confidentiality of much clinical information. The YODA project is an example of this kind of pragmatic progress. If we achieve our goal with this meeting, we should increase the availability of high-quality information about clinical research data, and also increase publications about datasets that may not yet, or may never, lead to primary research/results articles. This in turn might help reduce the major problem of publication bias towards positive results of clinical trials.
Proposed agenda for our working group
- Providing data access for peer reviewers
- Linking journal articles to permanent metadata records rather than full datasets
- Journal policy consistency with industry and regulatory requirements
- Identifying the data repositories that can support our requirements
- Identifying data/test data and contributors for proof-of-concept Data Descriptors
- Integrating with the publication planning process
- Identifying other possible barriers to progress
The aim for this group is to be practical, as we have a specific problem to solve and a desired outcome. We want to develop a policy and publication format that is attractive to, and will be used by, industry and academic researchers, that gives authors appropriate credit, and that ultimately facilitates publication of more reproducible clinical research.
We are also seeking to engage companies and researchers with one or more real datasets that could be used to develop a proof of concept, and be described in publications, as we develop our policy and content format. We therefore call on the wider clinical research community to contact us if you are willing to contribute clinical datasets that could help build a proof of concept for clinical Data Descriptors in the future.
Iain Hrynaszkiewicz, Head of Data and HSS Publishing, Open Research