Technical solutions: Certification in a digital era

Herbert Van de Sompel

What functions do we take for granted in print?

The Digital Library Research and Prototyping Team at the research library of the Los Alamos National Laboratory conducts research on various aspects of scholarly communication in the digital age, including peer review. Our research attempts simultaneously to analyse properties of the existing review system, and to formulate feasible alternatives.

A core premise is that the digital environment allows for (indeed, requires) systemic changes in scholarly communication procedures. This potential for fundamental change rests on two properties of the digital environment that were unavailable in the paper world. First, the core functions of our scholarly communication system can be separated (at least in theory) in the digital environment (ref. 1). Second, the events of this system can be recorded in machine-readable form, aggregated, and later data-mined.

One line of investigation led us to define a framework in which peer review is an autonomous service overlaid on scholarly repositories hosting unreviewed manuscripts, with the repositories and reviewing services linked together for an integrated view of the distributed information (ref. 2). Another effort automatically identifies possible reviewers based on extractable information from the digital environment, such as a manuscript’s subject area and citation pattern, and the existing body of literature in the subject area (ref. 3).
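To make the reviewer-identification idea concrete, here is a minimal, hypothetical sketch in Python. It is not the method of ref. 3 (which works on the citation graph); it simply ranks candidate reviewers by the overlap between a manuscript's reference list and each candidate's own body of work, using Jaccard similarity. All names and paper identifiers are invented for illustration.

```python
# Hypothetical sketch: rank candidate reviewers by the overlap between a
# manuscript's reference list and each candidate's publication record.
# This is an illustration only; ref. 3 describes a more sophisticated
# approach based on the structure of the citation network.

def jaccard(a, b):
    """Jaccard similarity between two collections of paper identifiers."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_reviewers(manuscript_refs, candidates):
    """Return (name, score) pairs sorted by similarity, highest first.

    candidates: dict mapping a reviewer name to an iterable of paper IDs
    (their publications plus the works those publications cite).
    """
    scores = {name: jaccard(manuscript_refs, papers)
              for name, papers in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

manuscript = ["p1", "p2", "p3", "p4"]
candidates = {
    "reviewer_a": ["p1", "p2", "p9"],        # shares two references
    "reviewer_b": ["p7", "p8"],              # no overlap
    "reviewer_c": ["p1", "p2", "p3", "p5"],  # shares three references
}
ranking = rank_reviewers(manuscript, candidates)
print(ranking[0][0])  # reviewer_c ranks highest
```

A real system would of course weight shared references by recency and venue, and would have to filter out candidates whose overlap stems from co-authorship, which raises exactly the conflict-of-interest concerns discussed below.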

A recent effort relevant to Nature’s current debate attempts to analyse reviewer behaviour systematically. We have found that, when asked to express preferences for conference papers to review, potential reviewers are only marginally driven by their level of expertise in a paper’s subject domain as measured by their own publications in that domain. This suggests that, if left simply to a reviewer’s choice, papers might not receive the expert review they deserve (ref. 4).

At this point in our research, we can only speculate which factors may be most significant in shaping reviewer preferences. From our own experience, the factors may include short-term reviewer interests; lack of time to express preferences in a focused manner; the rather common inability to assess one’s true expertise objectively; the desire to competitively control publications from specific individuals and groups; and possibly the fact that the most qualified reviewers will inevitably be most related to the paper’s authors and hence be limited by concerns about conflicts of interest.

Roosendaal and Geurts distinguish the following functions that must be fulfilled by every system of scholarly communication (ref. 5):

Registration, which allows claims of precedence for a scholarly finding.

Certification, which establishes the validity of a registered scholarly claim.

Awareness, which allows participants in the scholarly system to remain aware of new claims and findings.

Archiving, which preserves the scholarly record over time.

Rewarding, which rewards participants for their performance in the communication system based on metrics derived from that system.

The limitations of a paper-based scholarly communication system have led to a vertical integration of all these functions into the traditional journal system. A journal publisher records the registration date as the date the manuscript was received. The peer-review process, conducted under the auspices of the journal publisher, certifies the claims made in the manuscript. The published article fulfils the awareness function. Rewarding is based on achieving publication in prestigious journals and on being cited by other scholars. Finally, in the paper-based era, libraries archive the published article, bundled into a journal issue, by shelving it.

The digital, networked environment allows for the functions of scholarly communication to be individually implemented by multiple parties in different ways, and then combined as alternative or companion services in what can effectively be regarded as a network-based value chain.

This may seem far-fetched, but examples of such deconstructed value chains are already emerging. In the physics community, for example, arXiv fulfils the registration function. Many manuscripts submitted to arXiv eventually end up in well-established journals, which fulfil certification through traditional peer review. Others end up in so-called overlay journals, such as Advances in Theoretical and Mathematical Physics, which provide some form of certification by listing manuscripts in an issue.

Add to this the ability to track the use of digital manuscripts, and even to aggregate such usage information across distributed systems (ref. 6), combined with the intriguing correlations between readership counts and citation counts (ref. 7), and one can imagine the emergence of some new facet of certification provided by readership information. Collaborative environments in which peers can grade each other’s manuscripts (like viewers rating movies in the Internet-based Netflix movie-rental system) could provide yet another certification indicator.
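The claimed relationship between readership and citations can be illustrated with a short sketch. This is not the analysis of refs 6 and 7; it merely shows the kind of calculation involved, computing a Pearson correlation between per-article download counts and citation counts. The data below are invented for demonstration.

```python
# Illustrative sketch (not the method of refs 6-7): how closely do
# article download counts track citation counts? A Pearson correlation
# near 1 would suggest readership data could serve as an early,
# automatically extractable proxy for citation impact.

from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

downloads = [120, 45, 300, 80, 210]   # hypothetical per-article downloads
citations = [10, 2, 25, 5, 18]        # hypothetical per-article citations
print(round(pearson(downloads, citations), 2))
```

In practice the hard part is not the statistic but the data: usage events must be gathered consistently across distributed repositories before any such correlation can be trusted, which is exactly the aggregation problem addressed in ref. 6.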

Although many questions about the exact meaning and validity of such parallel certification avenues remain unasked or unanswered, the prospect of a multifaceted perspective merits attention. Especially intriguing is whether automatically extractable quality metrics could be used as parallel facets of certification. If so, could such metrics be less prone to the deficiencies resulting from the human-driven nature of peer review? Given that most, if not all, papers get published somewhere, could such automated metrics be more indicative of the actual quality of manuscripts?

In the same way that the impact factor is just one of many possible quality metrics for journals, peer review as we know it represents only one facet of certification. If nothing else, our research on certification has already shown that various properties of the peer-review mechanism, as it emerged in a paper-based communication environment, should not necessarily be taken for granted in the emerging digital environment.


1. Van de Sompel, H., Payette, S., Erickson, J., Lagoze, C., Warner, S. D-Lib Magazine 10, doi:10.1045/september2004-vandesompel (2004).

2. Rodriguez, M.A., Bollen, J., Van de Sompel, H. J. Inf. Sci. 32, 149-159 doi:10.1177/0165551506062327 (2006).

3. Rodriguez, M.A., Bollen, J. Preprint at (2006).

4. Rodriguez, M.A., Bollen, J., Van de Sompel, H. Los Alamos Technical Report LA-UR-06-0749. Preprint at arXiv:cs.DL/0605110 (2006).

5. Roosendaal, H., Geurts, P. in Cooperative Research Information Systems in Physics (Oldenburg, Germany, 1997).

6. Bollen, J., Van de Sompel, H. Preprint at (2006).

7. Bollen, J., Van de Sompel, H., Smith, J., Luce, R. Inf. Process. Mgmt 41, 1419-1440 (2005).

Herbert Van de Sompel is the team leader of the Digital Library Research and Prototyping Team at the research library of the Los Alamos National Laboratory. His research is concentrated on scholarly communication in the digital age.




    Sean O'Hagan said:

    Has anyone examined a more distributed manner of certification, modelled on web projects such as SETI@home, where chunks of data are sent to users in a network? This model might work in the fields of mathematics or computer science (perhaps others). A paper is broken into small parts, which are sent to "subscribers" in the mathematical community, verified, and returned. This model would require that authors write their papers in a more structured and organized fashion — perhaps more difficult for them, but much more useful for the community.
