Peer-to-Peer

Technical solutions: Wisdom of the crowds

Chris Anderson

Scientific publishers should let their online readers become reviewers.

Who are the peers in peer review? In journals such as Nature, they usually have a PhD and work in a field relevant to the paper under consideration. If they are academics, they may be tenured professors, usually people on a relatively short list of experts who have agreed to review papers. This is a little élitist, but credentials such as PhDs and tenure are given in part to reward those things – experience, insight, brains and the respect of other researchers – that also make for wise advice. The process is not perfect, for reasons ranging from cronyism to capriciousness, yet long experience has shown it to be better than the alternatives.


But now a new kind of peer review is emerging online, outside the scientific community, and it’s worth asking if there are lessons for science. In the Internet age, ‘peer’ is coming to mean everyman more than professional of equal rank. Consider the rise of ‘peer-to-peer’ networks and the user-created content of ‘peer production’, such as the millions of blogs that now complement mainstream media.

Perhaps the most dramatic example is Wikipedia, an online encyclopaedia written and edited by more than 100,000 volunteers. It is, says its founder Jimmy Wales, not so much anti-élitist as it is ‘anti-credentialist’. Wikipedia’s editors don’t have to have PhDs or any sort of professional affiliation; their contributions are considered on their merit, regardless of who they are or how they have become knowledgeable. If what they write stands up to inspection, it remains; otherwise it goes. Everyone – readers and contributors alike – sees the same ‘Edit this page’ button, inviting correction and amplification. Aside from very obscure entries and very controversial ones, this usually results in a continuous improvement in quality, as clear prose drives out muddy phrasing and many eyes fact-check assertions.

Ruling this world of open publishing is the egalitarian hierarchy of Google, which establishes relevance and merit based on incoming links from other sites. The authority of a citation is determined by how many others cited it. It is citation analysis ‘crowdsourced’ to the collective wisdom of anyone with a computer. It sounds like chaos, but in reality it’s the closest thing to an oracle the world has ever seen. It uses a form of peer review, but not one that a scientific journal would recognize.
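The link-counting idea described above can be sketched in a few lines. This is the textbook PageRank power iteration, not Google's actual ranking system, which is proprietary and far more elaborate. The damping value of 0.85, the function name `pagerank` and the four page names are assumptions chosen purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web: three pages cite "c", and "c" cites "a".
    web = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this toy web the most-cited page ends up with the highest score, and a page cited by a highly ranked page inherits some of that standing, which is the sense in which a link works like a weighted citation.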

Two types of ‘peer’?

Are these two different uses of ‘peer’ – in science, a professor; online, anyone – just a case of sloppy semantics, or could the wisdom of online crowds be a model for scientific review? The answer depends on what you want. As it stands, peer review in journals primarily decides if a paper is novel and interesting enough to satisfy both the space constraints of a printed publication and the journal’s mission – less ‘Is it right?’ and more ‘Is it right for the readers of journal X?’ Of course, peer review can also catch errors and suggest improvements, but its main role is to give the journal’s editors a thumbs-up or -down.

Online, plenty of websites let their readers decide what makes the front page. Slashdot and Digg, for instance, are remarkably vibrant and useful news services where users submit stories and others vote them up or down. Readers can also comment on the stories, providing a real-time counterpoint that is often as illuminating as the original article.

But although this online method feels a bit like peer review, it’s not yet a substitute for the current process of publishing original scientific research. For starters, stories submitted to these collective news services have already been published elsewhere – the aggregators serve mostly as reading clubs that filter the news for interesting nuggets, rather than determining what should be published.

Yet there are elements of this model that could work for science. Scientific peer review is typically a process of ‘pre-filtering’ – deciding which of the many papers submitted should be published. By contrast, Digg is a ‘post-filter’, deciding which of the many papers published are most interesting to a group of readers. Today, scientific publishing does the first kind of filtering pretty well, and the second hardly at all. Word of mouth aside, citation analysis, tracked in databases such as ISI and Scopus, is the only good way to determine what the collective wisdom has decided are the most important papers, and this takes years to emerge. Is there a faster and better way?

Publishing experiments

Several online publications are trying to find out. Naboj aims to be a post-filter on physics papers published on the ArXiv online repository of physics preprints. It currently consists only of readers rating papers on a five-point scale, but the rankings are better than nothing. Philica, an online journal started earlier this year, goes further: it publishes any paper submitted, but ranks them based on open peer review by any reader.

The free online Public Library of Science (PLoS) journals are planning to extend this model by adopting conventions from the blogosphere: an open comment area for each paper, ‘trackbacks’ that show which sites are linking to it, and perhaps a reader ratings scheme. Michael Eisen, a genomics researcher at the Lawrence Berkeley National Laboratory and one of PLoS’s founders, says the hope is to capture some of the post-publication wisdom already found in academia, but rarely accessible to others.

PLoS still uses expert researchers to review papers before publication, but the editors realize that these scientists often have little time to really dig into a paper. By contrast, readers of a paper after publication may also have an opinion, and many (especially graduate students) have the time to evaluate the paper in depth. The online environment means there’s no reason not to record it.

Such a record would have the effect not only of continuing peer review after publication, but also of making it easier to find important work in a blizzard of papers – they’re the ones that are being buzzed about. It is also easier to ignore poor work that slipped through peer review – these are the papers with the withering comments and poor ratings.

Best of all, such an open peer-review process taps into something that already exists: journal clubs. Every day, thousands of researchers and students are discussing the latest papers, but their insights and opinions are not recorded and shared widely. This information needs only to be collected, organized and distributed to become far more useful. It’s now possible to tap such collective intelligence online by doing to scientific publishing what the web has already done to mainstream media: democratizing it.

So the rise of the online ‘peer’ has shown that there is another way of tapping collective wisdom. But it’s not going to eliminate traditional peer review anytime soon. The reason why can be explained in the economic terms of scarcity and abundance. Closed peer review works best in scarce environments, where many papers fight for a few coveted journal slots. Open peer review works best in an abundant environment of online journals with unlimited space or the post-publication marketplace of opinion across all work.

In the scarce world of limited pages in top journals, prestige is earned through those journals’ high standard and exclusivity. That comes, in part, from the process, which involves impressing the very discriminating combination of an editor and a few respected researchers. Defining ‘peer’ relatively narrowly is part of the game. It’s not always fair or efficient, but in a world ruled by reputation, having successfully run that gauntlet is proof of at least some kind of fitness.

But in the abundance market of online journals or that of post-publication filtering, where each paper is competing with all the other papers in its field, it’s more sensible to define ‘peer’ as broadly as possible, to maximize the power of collective intelligence. In that market, prestige is just one factor in many determining relevance for a reader, and the more filtering aids that can be brought to bear, the better. From that perspective, these are exciting times. The experiments of Nature, PLoS journals and others will reveal where and how these techniques work best. But Wikipedia and Digg have already demonstrated that they do work.

Chris Anderson is the editor-in-chief of Wired magazine and author of The Long Tail: Why the Future of Business is Selling Less of More, to be published in July 2006.


Comments

  1. Eddy Deschagt said:

    RE : Technical solutions: Wisdom of the crowds, by Chris Anderson

    The problem with the wisdom of crowds is that crowds usually shout plenty of words but are short on science and math.

    Some crowds fail to be wise, and then are derisively called mobs.

    How can you figure out which crowds fail? As an engineer, I start by making some assumptions and back-of-the-envelope calculations. Nobody is always right. So, let’s call someone who is right 90% of the time an expert. A Joe Average is right only 60% of the time. If there’s more than one Joe, they can each make up their own minds, and then go for the majority answer.

    Sometimes this majority answer will be right, sometimes wrong. Start figuring out the odds for three Joes, five Joes, … This is where the math starts – binomial distribution. By the time you get to about thirty Joe Averages, their majority answer has a better chance of being right than the expert.

    Next you can play with the numbers. What happens when Joe Average is out of his depth, and each Joe is right only 30% of the time? The chance of the majority answer of thirty Joes being right dives to the 1% region. Oh my …

    So, for the wisdom of crowds to work you need a big enough supply of educated Joes. Scientific studies that incrementally advance the available knowledge will be adequately scrutinized. If the crowd of independent Joes is hundreds or thousands strong, the quality of post-filter review will easily surpass current peer review.

    The problem is for studies that push the scientific paradigms. When even experts are more often wrong than right, post-filter review will bury such studies.

    Kind regards,

    E Deschagt
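The back-of-the-envelope calculation in the comment above can be reproduced with a short script. It is a minimal sketch under that comment's own assumptions: a yes/no question, voters who decide independently, and a strict majority rule. The function names are illustrative. The exact crowd size at which the majority overtakes a 90% expert depends on how ties and even-sized crowds are treated, so it is worth computing rather than taking the round figure on trust.

```python
from math import comb


def majority_correct(n, p):
    """Probability that a strict majority of n independent voters is right,
    when each voter is right with probability p (binomial model)."""
    needed = n // 2 + 1  # smallest number of correct votes that is a majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


def smallest_crowd(p, target, limit=1001):
    """Smallest odd crowd size whose majority beats `target`, or None if no
    crowd below `limit` does. Odd sizes only, so ties cannot occur."""
    for n in range(1, limit, 2):
        if majority_correct(n, p) > target:
            return n
    return None


if __name__ == "__main__":
    # Joe Average is right 60% of the time; the expert is right 90% of the time.
    for n in (1, 3, 5, 11, 21, 31, 41):
        print(n, round(majority_correct(n, 0.6), 3))
    print("crowd needed to beat the expert:", smallest_crowd(0.6, 0.9))
    # Thirty Joes who are out of their depth (right 30% of the time).
    print("thirty weak Joes:", round(majority_correct(30, 0.3), 4))
```

Changing `p` shows how sharply the result turns: a little above 0.5, adding voters helps; below 0.5, adding voters makes the majority more reliably wrong.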

  2. Maxine Schmidt said:

    E Deschagt raises an important point— we need a big enough supply of educated Joes— and Janes. And not just conventionally educated, but able to understand the scientific method and how scientists think. As a lover of language and the larger life of the mind, I believe we need conventionally literate scientists as well. Critical thinking skills are more important now than ever.

  3. David Thomson said:

    Open peer review is the answer, especially when the theory presented is supported by mathematics and agrees with empirical data. The real issue behind the call for an open peer review is to overcome preference toward a specific paradigm.

  4. Jim Bourassa said:

    The real solution that I see to the problem of peer review is a two-step approach, a new, open peer review and the traditional, expert peer review, that filters manuscripts for publication in a specific journal. Both systems can generate useful suggestions on how the manuscript could be improved. It would be similar to how a bill is passed in Congress, and be an efficient way to process the large volume of papers that are presented each year.

  5. Bojan Tunguz said:

    RE : Technical solutions: Wisdom of the crowds, by Chris Anderson

    I am the creator of the Naboj Dynamical Peer Review website mentioned in this article. I would like to make a couple of corrections/clarifications about Naboj.

    1) Naboj is a fully-fledged peer review site, which means that users can write full-length reviews of arXiv preprints, not just the five-point evaluation. However, only registered users can avail themselves of this feature.

    2) One of the features that users have is to evaluate usefulness of posted reviews. The hope is that with many users there will be a convergence towards the higher usefulness and quality of the reviews themselves.

    I have several other features in the pipeline, but they will only become necessary and useful if there are enough users that regularly come to the site.

    Regards,

    Bojan Tunguz

  6. Wes Groleau said:

    “The problem is for studies that push the scientific paradigms. When even experts are more often wrong than right, post-filter review will bury such studies.”

    There’s another factor: “revealed truth” or “everybody knows”. When thousands of “peers,” even if educated, are simply unwilling to consider a hypothesis, how can a few believers (right or wrong) in that hypothesis hope to be heard?

    In a forum like Wikipedia, if one of these “believers” edits an entry, one of dozens, maybe hundreds, of others will replace his or her words with the “party line.”

  7. Diane Nahl said:

    The statements about Wikipedia as a good example of the new ‘everyman peer review’ are misleading. I have tested their process of submitting articles on topics where I am an expert and have experienced their complete contempt for credentialed experts of any kind.

    My articles were deleted by the so-called analysts even though the facts and views could be checked and were true and correct. I regularly publish in peer-reviewed journals and publishing in Wikipedia would degrade my vitae—this was an experiment to discover how they (mis)handle factual information from credentialed experts.

    Theirs is not a useful peer review process but represents only the bias of the non-credentialed overseers for Wikipedia. They are actually controlling what information is ‘acceptable’ through an orientation of disdain for expert knowledge.

    Wikipedia is also full of plagiarized information, cut and pasted from authoritative and non-authoritative Web sites and databases, uncited and not referenced to the originals. Much analysis by outside reviewers has uncovered this pervasive dis-information practice. Nevertheless, people are committed to loving Wikipedia though it is riddled with misinformation and unsubstantiated bias.

    Everyman peer review is an example of the disintermediation trend that has swept computer science, law, medicine, publishing, and library and information science in the past few decades. It is a beneficial trend because it stimulates creativity and allows people to expand their influence and knowledge at will. It is troubling because standards and quality suffer and more junk is produced.

    This trend leads to a new need for ‘everyman’ to develop critical analysis skills to determine what is useful and authoritative and what to ignore and toss.

  8. W. Gunn said:

    Mr. Anderson is a great optimist, and I really like the idea of the Long Tail. However, we must remember that being a techno-utopian is his job. He cites three sites as exemplary of the wisdom of collective filtering, and while each does serve as a proof-of-principle, there are serious concerns in real-world implementation.

    First, he cites Digg and Slashdot as examples of great sites for discovery of popular articles. Has he even been to the sites lately? I used to get value from Slashdot, but not lately, and rarely ever from Digg. My experience is that they’re great sites for finding popular articles, but not necessarily for good articles, and in this case there’s an important qualitative difference. To understand what I mean, go look at the comments to any given Slashdot or Digg post. People talk about the good old days of Slashdot, when you could expect a certain level of competence and intelligence from the average reader, and they all say it has gone to hell in the past couple of years. This makes Mr. Deschagt’s point about the number of novices required to be as smart as a few experts extremely clear. I like having editors and a select cabal of experts pre-filter because it sets a certain minimum level of quality. Even with these standards in place, there’s a lot of crap that gets published. The answer might be to publish it all and let the cream rise to the top, but the Digg model isn’t the way to do it unless there’s some way of maintaining a certain minimum level of competence as the userbase grows.

    Wikipedia isn’t a great example either. The Register is full of stories about the unforeseen consequences of relying too heavily on Wiki-based information. If a student relies heavily on an inaccurate page on Wikipedia, the worst he’ll get is a bad grade. Scientific publishing directs the spending of billions of dollars a year, so there’s much more incentive to get it right, and also more incentive to seed biased information which may not be recognized as such by a non-expert, particularly in non-controversial, highly-specialized niches.

    The Google approach does make a good deal of sense, because it incorporates a trust metric in the form of PageRank. However, this approach isn’t without its dangers either, as the constant struggle of Google against blog-spam indicates. It should go without saying that instituting a PageRank-like system wouldn’t be a great change from the current system, considering how closely guarded the company keeps the inner workings of PageRank.

    And do you really want to read the same 500 cranky comments about stem cell ethics every time an ESC paper is published?

    In summary, there is a role for post-filtering of articles, but the current implementations show only proof-of-principle and leave serious concerns for application to real-world high-stakes problems.
