Technical solutions: Evolving peer review for the internet

Richard Akerman

Peer review needs to adapt to the pace and volume of information published online

How does the role of peer review evolve when the body of scholarly knowledge expands from slowly circulating, static documents to the universe of rushing, dynamic interactions made possible by the Internet? Although traditional forms of scholarly communication are still used, the sheer volume and pace of information enabled by the Internet and publishing tools such as weblogs (blogs) demand novel solutions.

Often the approach to defending the integrity of information is one of barriers: humans can use computer tools to block low-quality information. The problem is one of scale: for example, one human can manually filter a few spam messages per day, but not hundreds or thousands. Alternatively, rather than fighting the system, we can find ways of making it work for us. Linus’s Law, formulated by Eric Raymond and named after Linus Torvalds, creator of the open-source operating system Linux, states: “Given enough eyeballs, all bugs are shallow.” In other words, given a large enough team of contributors checking a document, all the errors can be found.

This ‘wisdom of crowds’ concept is used by Wikipedia, the Internet encyclopedia where web pages can be (mostly) freely edited by any reader. One may think of Wikipedia as a system to generate articles that are resistant to ideological attack and represent a snapshot of the current mass consensus about a particular idea. (See Chris Anderson’s contribution to this debate.)

But this system cannot be a substitute for peer review by experts. A peer-reviewed post-print is fundamentally a static document; it must be so in order that other science can be built upon and around it. In addition, as indicated by Carl Lagoze and Sandy Payette of the Pathways project at Cornell University, Ithaca, New York, there may be multiple quality metrics beyond the ‘OK/not-OK’ of peer review, depending on the audience for the information. “There is a line between the wisdom of crowds and the wisdom of the few that is challenging to draw,” observes Lagoze, although the researchers would like one day to see application environments automatically selecting appropriate ranking schemes.

Separating the elements of publication

While it may not be useful to separate the peer review from the article, one possibility offered by the Internet is to separate articles from the particular journals that publish them. The scholarly workflow has been described as having five elements: registration, certification, awareness, archiving and rewarding [1]. These services, while presented to authors as a unified whole, are actually a combination of capabilities provided by the publisher itself and external services where the publisher acts as an intermediary with third parties. Each of them has historically been tightly connected with journal publication, but it need not be.

Institutional repositories or librarians, for example, could take on some of the intermediation roles. Part of the awareness role involves providing links and unique identifiers such as Digital Object Identifiers (DOIs) for the article and data codes that may be specific to a particular scholarly field. The future may also see data ‘DOIs’ (the German Science Foundation’s ‘Publication and Citation of Scientific Primary Data’ project, for instance), and unique identifiers for authors (for instance, Elsevier’s Scopus Author Identifier or the Dutch Digital Author Identification project). Archiving of articles may be done in-house, or through a third-party provider such as Portico.

For the certification role, the current system of peer review has enduring value, ensuring that an article passes certain standards of scientific quality and integrity. It requires considerable knowledge and expertise, as well as a wide base of contacts within academia to be able to select appropriate reviewers. But the article itself could live an independent life on web pages or in institutional repositories without ever being published in a journal. Since a blog is fundamentally a publishing technology, might a scientist’s blog be the authoritative source for his or her academic output? An article or blog entry submitted to, and passed by, a stand-alone peer review service might be recorded in a public registry, or be digitally signed as part of the certification process.
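As a rough illustration of what a public registry entry might look like, the sketch below records a fingerprint of the certified text alongside the name of the review service. Everything in it is an assumption made for illustration: the class and function names are hypothetical, and a SHA-256 hash stands in for a true digital signature, which would need public-key cryptography.

```python
import hashlib


def fingerprint(article_text):
    """Return a SHA-256 digest that identifies this exact version of the text."""
    return hashlib.sha256(article_text.encode("utf-8")).hexdigest()


class CertificationRegistry:
    """Hypothetical in-memory registry of articles that passed peer review."""

    def __init__(self):
        # Maps a text fingerprint to the name of the certifying review service.
        self._records = {}

    def certify(self, article_text, review_service):
        """Record that review_service certified this exact text."""
        digest = fingerprint(article_text)
        self._records[digest] = review_service
        return digest

    def certified_by(self, article_text):
        """Return the certifying service, or None if this text is not registered."""
        return self._records.get(fingerprint(article_text))
```

Because the fingerprint changes whenever the text changes, a blog entry edited after certification would no longer match its registry record, which fits the idea that the certified version must stay static.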

But then, without the journal fulfilling the awareness role, how can scientists track articles of interest to them? And how can we measure the impact of an individual article? The life-sciences website Postgenomic scans science blogs and ranks articles according to how much they are being discussed, not unlike traditional citation rankings. Euan Adie, a bioinformatician at the University of Edinburgh, explains: “The main thing that inspired Postgenomic was an increase in the number and quality of science blogs over the past two years. Things that are working well include the way comments on papers are collected and collated: from a link to a paper in a blog post we can automatically find the DOI or PubMed ID, which lets us collect metadata about the paper from the relevant database and present comments in an organized fashion. Postgenomic indexes a relatively small number of blogs – about 200 – and at the moment we collect roughly 30 comments on 20 papers each day.

“Citation ranking doesn’t really work with data from science blogs at the moment because bloggers tend to write more about controversial papers or those with universal appeal, so top-ranking papers aren’t necessarily the ones with good scientific content.” It is a challenging area for research to understand how online discussions reflect scientific discourse. The ranking of citations will reflect the sub-set of the scientific community that is online, and their particular interests and concerns.
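The two steps Adie describes, finding a DOI in a blog post and ranking papers by how much they are discussed, can be sketched in a few lines. This is not Postgenomic’s actual code: the function names are hypothetical, the regular expression is a loose approximation of DOI syntax, and each post counts at most once per paper.

```python
import re
from collections import Counter

# Loose DOI pattern (an approximation): "10.", a 4-9 digit prefix, "/", a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def extract_dois(post_text):
    """Return the set of DOIs mentioned in one blog post, lowercased."""
    found = set()
    for match in DOI_PATTERN.findall(post_text):
        # Drop punctuation that usually belongs to the sentence, not the DOI.
        found.add(match.rstrip(".,;)").lower())
    return found


def rank_by_discussion(posts):
    """Return (doi, number_of_posts) pairs, most discussed first."""
    counts = Counter()
    for text in posts:
        counts.update(extract_dois(text))
    return counts.most_common()
```

A ranking built this way measures attention rather than quality, which is exactly the limitation Adie points out.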

Taking a slightly different tack, uBioRSS (“a taxonomically intelligent feed reader”) helps scientists discover discussions based on taxonomic categories or specific organism names, with a lot of intelligence behind the scenes to reconcile various terminologies, from archaic names to common names.
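The reconciliation of terminologies can be pictured as a lookup from every known variant of a name to a single canonical one. The sketch below is only that picture, not uBioRSS’s method: the function names are hypothetical and the small synonym table is hand-written for illustration, whereas a real service would draw on a large curated name database.

```python
# Illustrative synonym table: variant spelling (lowercase) -> canonical name.
SYNONYMS = {
    "mus musculus": "Mus musculus",
    "house mouse": "Mus musculus",          # common name
    "escherichia coli": "Escherichia coli",
    "e. coli": "Escherichia coli",          # abbreviation
    "bacterium coli": "Escherichia coli",   # archaic name
}


def canonical_name(name):
    """Map a name variant to its canonical form, or None if it is unknown."""
    key = " ".join(name.lower().split())  # normalise case and spacing
    return SYNONYMS.get(key)


def group_by_organism(mentions):
    """Group (source, name) pairs under the canonical organism name."""
    grouped = {}
    for source, name in mentions:
        organism = canonical_name(name)
        if organism is not None:
            grouped.setdefault(organism, []).append(source)
    return grouped
```

With such a table, a post about the “house mouse” and a paper on Mus musculus end up in the same feed.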

These sorts of meta-services build on discussions already flourishing in the science blogosphere. Using such tools, scientists can discover, track and participate in debates on topics of interest to them, regardless of venue. In fact, the more discussion there is, the more data there are about the value of the article. Having these discussions in open fora may also allow for better capture and archiving of all related information during the lifecycle of article creation. Although lively scientific communications have always existed, we may find that much wider discussion about a particular article takes place on the Internet, both in the pre-print stage before certification, and once the article has passed peer review.

The peer-review stage will continue to be essential for ensuring that the body of science grows through real, supported discoveries and assertions. Peer review excludes damaging misinformation while adapting to new inputs. In the background discussions conducted for this article, I found an underlying deep respect for the value of peer review, which ultimately is an extraordinary service provided for free by scientists to the scientific community and society as a whole.


1. Van de Sompel, H., Erickson, J., Payette, S., Lagoze, C., Warner, S. Rethinking scholarly communication: building the system that scholars deserve. D-Lib Magazine, doi:10.1045/september2004-vandesompel (2004).

Richard Akerman is a technology architect and information systems security officer at the Canada Institute for Scientific and Technical Information (CISTI), Canada’s National Science Library. He blogs about topics related to his work at Science Library Pad. Any opinions expressed in this article are his alone, and do not necessarily represent the views of CISTI.

See this article in Nature’s web focus.

