The (highly abbreviated) life story of a paper appearing in Nature often goes something like this: ideas are birthed and experiments envisioned. Pilot experiments are run, yielding beautiful preliminary data. Replication and controls are then gathered over the course of months, if not years, of hard labor. The paper is written, submitted, and reviewed. A few rounds of review and revision later (two is typical), it is published (with highly variable degrees of reviewer and editorial unanimity). But this is by no means the end; rather, it is just a milestone in the evaluation process by the community. In journals, post-publication evaluation has traditionally occurred in the form of peer-reviewed follow-up papers or formal commentary. This may change someday as alternative forms of scientific publishing are explored, but for today we’ll talk about a formal addendum we’re publishing on a 2011 paper by Cathy Price and colleagues and invite you to add to the discussion.
A few months ago, we published the original paper describing structural and functional changes in the teenage brain that accompany changes in IQ over time. The authors measured verbal and nonverbal IQ twice, once in early and once in late adolescence, and also scanned the subjects at both time points. They reported that changes in gray matter density in left motor cortex and anterior cerebellum were correlated with verbal and nonverbal intelligence, respectively. Although structural correlations between brain areas and IQ have been reported before, this paper was unusual in that the authors were able to track the same subjects over time. From the outset, the reviewers were positive about the potential contribution of this study, as it not only demonstrated considerable plasticity in IQ during adolescence but also provided evidence of distinct neural substrates for different forms of IQ. Over several rounds of review, various technical issues about statistical analysis, interpretation, and relationship to previous literature were discussed. After publication, we received a formal submission to our Brief Communications Arising section (an online-only section for formal exchanges) raising a concern about statistical inference that had not come up in the review process. The concern related to the authors’ estimates of how much of the IQ change could be attributed to gray matter changes. The commentary asked whether the reported effect sizes could have been distorted by how the analyzed brain areas were selected. Several published commentaries and perspectives have pointed out that using the same dataset both to select which data to analyze and then to estimate effects in only that selection can lead to inflated effect sizes. This form of sampling bias, termed a variety of things (including circularity and double-dipping), is common in, but certainly not exclusive to, systems and cognitive neuroscience studies.
The concern with this particular paper was that if the relationship between gray matter and IQ was assessed in MRI voxels that had been selected precisely because they showed a change with IQ, the estimated effects could be inflated.
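The inflation that circular selection can produce is easy to demonstrate with a small simulation. The sketch below is purely illustrative and assumes nothing about the paper's actual data or analysis; all names and numbers are hypothetical. It generates gray matter values with no true relationship to an "IQ change" score, selects the voxels most correlated with that score in the same sample, and then reports the average correlation within the selected voxels, which comes out far above the true value of zero.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects = 30   # hypothetical sample size
n_voxels = 5000   # hypothetical number of voxels

# Null data: no voxel truly correlates with the IQ-change score.
iq_change = rng.standard_normal(n_subjects)
gray_matter = rng.standard_normal((n_subjects, n_voxels))

# Correlate every voxel with IQ change in the SAME sample...
r = np.array([np.corrcoef(gray_matter[:, v], iq_change)[0, 1]
              for v in range(n_voxels)])

# ...then report the effect size only in the top-scoring voxels.
selected = np.argsort(np.abs(r))[-20:]       # 20 voxels with largest |r|
circular_estimate = np.abs(r[selected]).mean()

print(f"circular estimate of |r|: {circular_estimate:.2f}")
```

Even though every voxel is pure noise, the reported correlation in the selected voxels is large, because the selection step has cherry-picked the extremes of the sampling distribution.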
We were aware that this was a controversial issue (some of the commentaries created quite a stir when they were published a few years ago). We had heard from many in the community that the practice could potentially invalidate results, but on delving more deeply into the issue we realized that there were also other views, both on the potential severity of the problem and on the validity of the practice.
It seemed to us that criticisms should be applied and taken in context, and we felt that for this paper the magnitude of the effects was a key part of its conclusions – in the field of biological correlates of IQ, the amount of variance accounted for by various factors has been controversial. After consulting with an independent referee (not one of the original reviewers of the paper), we asked the authors to provide a reanalysis that would address the criticism by recalculating their effect sizes using independent data. After several rounds of back and forth between the authors, the reviewer, and ourselves (during the course of which we asked the authors for clarifications of their methods, for alternative analyses to the ones they initially provided, and for an additional analysis to address a different technical concern not raised in the commentary but flagged by the reviewer), we eventually agreed upon a response that we felt appropriately addressed the concerns. At this point, the authors of the Brief Communications Arising decided to withdraw their commentary.
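One standard way to recalculate effect sizes using independent data is to split the sample: select voxels in one half of the subjects and estimate the effect only in the held-out half. The sketch below illustrates this remedy on simulated null data; it is a generic illustration of the principle, not the authors' actual reanalysis, and all names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n_subjects, n_voxels = 60, 5000  # hypothetical sizes
iq_change = rng.standard_normal(n_subjects)
gray_matter = rng.standard_normal((n_subjects, n_voxels))  # null: no true effect

half = n_subjects // 2

# Select voxels using only the first half of the subjects...
r_sel = np.array([np.corrcoef(gray_matter[:half, v], iq_change[:half])[0, 1]
                  for v in range(n_voxels)])
selected = np.argsort(np.abs(r_sel))[-20:]   # 20 voxels with largest |r|

# ...and estimate the effect size in the held-out second half.
r_test = np.array([np.corrcoef(gray_matter[half:, v], iq_change[half:])[0, 1]
                   for v in selected])
heldout_estimate = np.abs(r_test).mean()

print(f"held-out estimate of |r|: {heldout_estimate:.2f}")
```

Because the selection and estimation steps use independent subjects, the held-out estimate stays near the true value of zero rather than being inflated by the selection.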
Nonetheless, we felt that the commentary had raised concerns likely shared by other members of the community (and indeed we had heard informally from others) and that the reanalysis provided clarification of how much the conclusions of the paper were affected by those concerns. We therefore decided to publish the response with a few tweaks to the language as a stand-alone addendum, which you see here today.
A common sentiment we hear regarding the formal corrections/commentary process is that it is frustratingly lengthy and time-consuming, which seems anachronistic given the speed and ease with which comments can be communicated on the internet. Indeed, it did take several months to complete this set of communications, owing to the many rounds of correspondence and requests for clarification. When the issues are technical and potentially call for revision of published conclusions, it takes time for everyone to present their side of the story and for everything to be evaluated. But to address the need for open communication on a faster time scale, and to host exchanges that we do not feel rise to the level of formal correction, we have an online comments section attached to every paper. At Nature, as with many other journals, these sections are sparsely populated, but they do exist and we want to hear from you!