Working towards harmonised peer-review of controlled-access data at human data repositories

Guest post by Viki Hurst, Locum Associate Editor for Scientific Data

Scientific Data is exploring how peer-review mechanisms for sensitive human data can be improved. Here, we outline some of the initial feedback we received from leaders of human data repositories (HDRs), and some innovative alternatives to peer-review.

Data journals, like Scientific Data, stake their reputations on high standards for data peer-review. Sharing data collected from human subjects (for example, patients or participants in clinical trials) can raise complex issues. These data have to be handled sensitively, ensuring that they are subject to appropriate levels of anonymity or pseudonymisation, and that their distribution does not put the subjects at any risk of harm. Safeguards must be put in place to guard against attempts at re-identification and other abuses. Access to such data is often limited to verified and trusted users, who may need to sign a Data Use Agreement (DUA) outlining the terms of use, access, and distribution allowed for the data.

These necessary controls raise challenges for anonymous peer-review. How can verifying an individual’s credentials be compatible with an anonymous peer-review process? At Scientific Data, we published a statement two years ago indicating that we would decline submissions where there was no mechanism for anonymous peer-review, and we have had to reject potentially valuable submissions as a result.

While other journals may not actively peer-review sensitive datasets, rising standards for transparent data availability statements are forcing editors to grapple with new questions, such as:

  • What repositories are appropriate for these kinds of data?
  • What kinds of data use agreements are acceptable?
  • Where do our duties to check these elements begin and end?

As a first step in devising solutions for anonymous peer-review of sensitive human data, we consulted with leaders of a selection of HDRs frequently used by our authors. We asked what challenges they face, what peer-review processes they have in place (if any), and for any suggestions of how journals can work with and support HDRs to ensure data quality.

“Allowing editors to access the evaluation results of professionals experienced in detecting disclosure risk and ensuring human subject information is kept confidential may be more effective than placing the obligation on the editors themselves.”
Johanna Bleckman and Maggie Levenstein, ICPSR

Repository leaders frequently raised concerns about the appropriateness of anonymous peer-review mechanisms for sensitive human data, indicating that they do not believe the benefits outweigh the risks. The most significant barrier to investing in such mechanisms is that many cannot envisage one that is compatible with their strict governance processes (it is “impossible to do it blind”), or one that fits within the resourcing they have available.

“The ability of the editor or journal playing a meaningful role in checking the sensitive data after submission to our repository is constrained by the current policy because access requires a research use statement that is consistent with the Data Use Limitation,” writes Mike Feolo of dbGaP, a repository at the US National Center for Biotechnology Information that hosts sensitive linked genotype and phenotype data.

“Reviewers could help by telling us which quality checks they would like to apply to the data, and we could try to add them to our regular QC procedures and share the reports with them.”
Jordi Rambla De Argila, European Genome-Phenome Archive

There were, however, interesting proposals for alternative mechanisms that would require different kinds of journal-repository relationships. One is for repositories to share the results of internal checks made by their staff, which referees could scrutinise in place of the datasets themselves. One HDR proposed establishing an external Data Review Board to provide an independent perspective and checks on restricted data. To extrapolate from this, we might explore a mechanism by which a pool of verified and pre-approved ‘data reviewers’ at HDRs could be called upon to assess new datasets, expediting the peer-review process at journals.

Overall, this exercise highlighted a need to initiate larger and more systematic consultation and to develop creative solutions, which we intend to do in the coming year. It might not be possible to pursue a ‘one-size-fits-all’ approach to streamlining this process. We may instead find it necessary to develop several ‘off-the-shelf’ mechanisms compatible with the diverse functionalities of these HDRs.


