Nascent

Commenting on scientific articles (PLoS edition)

I’ve been taking a look at the comments left on PLoS ONE from inception until August ‘08 (data courtesy of Bora: http://www.scienceblogs.com/clock/2008/08/postpublication_peerreview_in.php). Last week’s crowdsourcing paid off and all of the categorization work got done really quickly – thank you if you participated! Pedro Beltrao and Lindsay Morgan were the random reward winners and will be receiving some magnificent Nature branded marketing crapola shortly.

plos_breakdown.png

Summary

  • 18% of PLoS ONE papers have reader or author submitted comments
  • 39% if you count comments added by editors (usually reviewers’ comments)
  • Very few comments are of the ‘omg, wow’ variety (as opposed to comments on blogs – this one excepted, obviously)
  • Authors are responsible for a high percentage (~40%) of user submitted comments
  • 17% of user submitted comments contain interpretation or journal club style precis
  • 13% of user submitted comments are direct criticism
  • 11% are direct questions or requests for clarification
  • These %s are similar to what we saw in the BMC dataset
  • The trackbacks protocol is inadequate for picking up blog chatter about papers

Commenting breakdown

plos_no_editorial.png

Descriptions and examples of all of the categories can be found here but in brief:

Comment from author: any communication from the paper’s author. Usually they’re providing corrections, updates and replies.

Comment or correction from editors: reviewer reports, corrections, typo fixes etc.

Annotation on paper: small fragment of text attached to a particular part of the paper (these were hard to identify properly)

Journal club, interpretation or analysis: journal club style discussion. Readers suggesting how the results of a paper might be interpreted.

Requests for clarification: readers asking for more information from the authors.

Direct criticism: readers pointing out possible experimental flaws or errors.

Bonus links or citations: things that could plausibly have been included in the paper. Quite common in bioinformatics papers. Links to datasets, implementations and software downloads.

Spam and Others should be self-explanatory.

Blog trackbacks

As Deepak mentioned in his analysis:

When PLoS launched trackbacks I remember being quite excited, but if there was one area that disappointed me, it was the lack of trackbacks. The numbers are loud and clear here. If you take out trackbacks from Bora [ed: PLoS ONE’s community manager] and other PLoS staff, the number is less than a 100 for all PLoS One papers and a maximum of 4 for any paper. This is a combination of flaws in the trackback system in general (could write a whole blog post on that) and perhaps with the PLoS implementation.

The problem is that the trackbacks protocol sucks ass and it’s hardly ever turned on by default on modern blogging platforms. As a result only a limited amount of blog chatter is being picked up. In a database snapshot of Postgenomic taken last August there are links to 220 distinct PLoS ONE papers from 197 different posts – that’s not including anything written by Bora.

Blog posts are themselves aggregators of user generated content. I checked out the comment threads on a not particularly representative sample of a hundred posts to see if there was any correlation between the number of comments on each post and the number of comments on the PLoS papers that they link to.

Nothing significant came up (correlation coefficient of ~0.25), though (a) the sample size may have been too small and (b) the result could have been thrown off by posts covering papers relating to evolution (like this one) with huge comment threads.
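A minimal Python sketch of how a check like this could be run, for illustration only. The numbers are made up (they are not the sample described above) and the function names are hypothetical. It computes the ordinary Pearson coefficient alongside a rank-based Spearman coefficient, which is less sensitive to a handful of posts with huge comment threads.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data: comments per blog post vs. comments on the linked paper.
# The last post stands in for an evolution post with a huge thread.
blog_comments = [2, 0, 5, 1, 3, 250]
paper_comments = [1, 0, 2, 1, 0, 3]

print("Pearson: ", round(pearson(blog_comments, paper_comments), 2))
print("Spearman:", round(spearman(blog_comments, paper_comments), 2))
```

If the two coefficients disagree sharply, that is a hint that one or two outlying posts are driving the result.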

The quality of comments on blogs was generally lower than on papers, the odd gem excepted. Basically I’m not sure that journals are missing out on anything, comment thread wise.

Comparisons with BMC

comparison.png

Take the graphs above with a pinch of salt as the categories don’t match up exactly… for example the BMC analysis didn’t have a separate category for content added by an editor. I think it’s reasonable to say that the proportions of different comment types are roughly similar, though.

Only 2% of BMC papers have comments, versus 18% of PLoS ONE papers. Why the big difference?

Some possibilities:

  • PLoS ONE papers have a higher impact, generally speaking
  • There’s a lower barrier to commenting on PLoS
  • PLoS has a community manager on staff (might explain the high % of author submitted comments?)
  • BMC first allowed commenting back in 2002. In context: the phrase “web 2.0” was coined in 2004. Facebook, MySpace and Firefox didn’t exist. BMC have a backlog of older papers with no comment threads – PLoS launched to a (relatively) web 2.0 friendly audience.

If you took part in the crowdsourcing experiment you may have noticed the same names crop up regularly in the PLoS dataset (JC Bradley, Graham Steel & Bjorn Brembs, looking at you). Is a small hard core of scientists who comment regularly on different PLoS ONE papers responsible for the high coverage?

I don’t think so. 11% of comment authors on PLoS had left comments on more than one paper. A similar proportion (8%) of BioMedCentral comment authors did the same. The numbers of actual comments left by these multi-paper commenters are similar too.
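As a rough sketch of how a figure like this can be derived from a list of comments: the Python below counts the share of comment authors who commented on more than one distinct paper. The function name and the (author, paper) pairs are hypothetical and do not reflect the layout of the real dataset.

```python
from collections import defaultdict

def repeat_commenter_share(comments):
    """Fraction of comment authors who commented on more than one paper.

    `comments` is an iterable of (author, paper_id) pairs, one per comment.
    Several comments by one author on the same paper count as one paper.
    """
    papers_by_author = defaultdict(set)
    for author, paper_id in comments:
        papers_by_author[author].add(paper_id)
    if not papers_by_author:
        return 0.0
    repeaters = sum(1 for papers in papers_by_author.values() if len(papers) > 1)
    return repeaters / len(papers_by_author)

# Hypothetical example: only "a" has commented on two different papers.
example = [
    ("a", "p1"), ("a", "p2"),
    ("b", "p1"), ("b", "p1"),
    ("c", "p3"),
    ("d", "p4"),
]
print(repeat_commenter_share(example))  # 0.25
```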

PLoS have a marginally lower barrier to entry for leaving a comment. The registration form is shorter and the comment box is very bloggy, whereas the BMC commenting interface requires a more formal registration process, gives you extensive guidance and requires that you declare any competing interests. BMC requires your real name and email address; PLoS uses screen names. There’s a case to be made for both approaches; interestingly the BMC interface is a lot like the BMJ’s, where comments are very successful.

How comments were classified

number_annotations.png

The categories were arbitrary, based on the BMC study and on me checking out a subset of comments in advance.

Only one category for each comment was allowed. In retrospect that was too restrictive and it would’ve been interesting to allow multiple categories, but hey. Note that as a result there will be no ‘right’ answer for some comments as they’ll span multiple categories.

Any comments authored by ‘PLoS_ONE_Group’ were automatically categorized as ‘from editors’. That covered 968 of them.

This left 1,411 comments. Users of ploscomments.appspot.com were shown papers at random and asked to annotate each comment in the attached thread. Over the space of last week 10,516 annotations from 818 users were collected. All but one comment had three or more annotations (see graph above – x axis is number of annotations, y axis is number of comments).

To determine consensus and to weed out bogus annotators we ran the annotations through an iterative process.

First, each user was assigned a weighting of 1.

Then:


for each comment
    each possible category is given a score of 0
    for each user who categorized that comment
        increment score of that chosen category by that user's weighting
    total score = sum total of all category scores
    if any categories have a score >= 50% of the total score then
        that category is the consensus decision
        every user who chose this category has their weighting increased
        every user who didn't has their weighting decreased
repeat loop until we stop finding new consensus decisions

This left 122 comments without a consensus decision.
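A minimal Python sketch of the loop described above, for illustration only rather than the code that was actually run. Several details are assumptions because the description doesn't specify them: the size of the weighting adjustment (`step`), the floor that stops weightings going negative, skipping comments that already have a consensus on later passes, and the tie-break when two categories both reach 50% (the first one seen wins). The user names, comment ids and categories are made up.

```python
from collections import defaultdict

def find_consensus(annotations, step=0.1, threshold=0.5, min_weight=0.0):
    """Iteratively find consensus categories from (user, comment_id, category) tuples.

    Returns (consensus, weights): consensus maps comment_id -> category for
    comments that reached one; weights maps user -> final weighting.
    """
    weights = defaultdict(lambda: 1.0)  # each user starts with a weighting of 1
    votes_by_comment = defaultdict(list)
    for user, comment_id, category in annotations:
        votes_by_comment[comment_id].append((user, category))

    consensus = {}
    found_new = True
    while found_new:  # repeat until a full pass finds no new consensus
        found_new = False
        for comment_id, votes in votes_by_comment.items():
            if comment_id in consensus:
                continue  # assumption: decided comments are not re-scored
            scores = defaultdict(float)
            for user, category in votes:
                scores[category] += weights[user]
            total = sum(scores.values())
            if total <= 0:
                continue
            best = max(scores, key=scores.get)
            if scores[best] >= threshold * total:
                consensus[comment_id] = best
                found_new = True
                # Reward agreement, penalise disagreement (step is assumed).
                for user, category in votes:
                    if category == best:
                        weights[user] += step
                    else:
                        weights[user] = max(min_weight, weights[user] - step)
    return consensus, dict(weights)

# Hypothetical annotations: "mallory" tags everything as spam.
annotations = [
    ("alice", "c1", "criticism"), ("bob", "c1", "criticism"), ("mallory", "c1", "spam"),
    ("mallory", "c2", "spam"), ("alice", "c2", "question"),
    ("alice", "c3", "journal club"), ("bob", "c3", "question"), ("mallory", "c3", "spam"),
]
consensus, weights = find_consensus(annotations)
print(consensus)
print(weights)
```

In this toy run the second comment is a one-against-one split, and it only resolves because the first comment has already shifted the weightings. The third comment is a three-way split and never reaches the 50% threshold, which is the kind of case that ends up without a consensus decision.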

The crowdsourcing data is available for download in MySQL dump format here. If you get permission from PLoS I can give you the actual dump of comments, too – ask Bora.

Discussion

Basically I agree with these guys:

Why should scientists be all that different from the rest of society? Remember Jakob Nielsen’s 90-9-1 rule (http://www.useit.com/alertbox/participation_inequality.html):

User participation often more or less follows a 90-9-1 rule:

* 90% of users are lurkers (i.e., read or observe, but don’t contribute).

* 9% of users contribute from time to time, but other priorities dominate their time.

* 1% of users participate a lot and account for most contributions: it can seem as if they don’t have lives because they often post just minutes after whatever event they’re commenting on occurs.

So think about the size of the audience for a given paper, then cut that down to 1% as those likely to leave a comment. It’s not surprising really, to see low numbers.

(David Crotty)

What I’d be interested to know is what percentage of traffic to PLoS One goes

1) hit article page

2) click “Download Article PDF”

3) bye.

Because if that’s the majority of the workflow (which would be my guess), that means the reader’s workflow doesn’t include any of the social net tools, since they’re reading in Acrobat Reader. Maybe we need to socialise Acrobat?

(Richard Akerman)

I would say one important issue with ‘discussions around a paper’ (which happen in a broader universe than just Commenting ON The Paper) are that if a reader of the paper is never made aware of the prior discussions and the (hopefully) insightful thoughts, then those thoughts are, to some extent, wasted effort. Each article should of course be the jumping off point to find all comments about it, wherever they are located (and preferably should aggregate those comments to itself).

(Peter Binfield)
