News blog

Is the scientific literature self-correcting?

A session on scientific reproducibility today quickly became a discussion about perverse incentives. Robust research takes more time and complicates otherwise compelling stories. This turns scientists who cut corners into rising stars and discourages the diligent. It also produces highly cited scientific publications that cannot be reproduced.

The problem of translating academic discovery into drug discovery was discussed in a panel called ‘Sense and Reproducibility’ at the annual meeting of the American Society for Cell Biology in San Francisco, California.

Glenn Begley, former head of research at Amgen in Thousand Oaks, California, made headlines back in March when he revealed that scientists at his company had been unable to validate the conclusions of 47 out of 53 ‘landmark’ papers — papers exciting enough to inspire drug-discovery programmes. In one study, which has been cited over 1,900 times, not even the original researchers could reproduce the results in their own laboratory.

A big problem is confirmation bias, said Begley. Many quantitative results are built from a series of subjective assessments. “If you’re a postdoc counting the cells, you know you’ll find a difference,” he said. “People will find the answer that the reviewer wants to guarantee publication.”

He listed six warning signs that a paper’s conclusions are unreliable: specimens were assessed without blinding, not all results were shown, experiments were not repeated, inappropriate statistical tests were used, reagents weren’t validated (polyclonal antibodies have limited uses) and positive and negative controls were not shown.

Flawed papers appear regularly in high-impact journals: “Once you start looking, it’s in every issue,” Begley said. Top-tier journals are publishing sloppy science, he continued. “This is a systemic problem built on current incentives.”

What’s more, young researchers are often explicitly discouraged from publishing negative results, said Elizabeth Iorns, head of Science Exchange, a company based in Palo Alto, California, that matches researchers with specialized scientific services. Iorns has set up a programme called the Reproducibility Initiative, in which authors can submit papers for validation by outside groups and gain an extra publication if the results are reproduced. It’s a hard issue for scientists to even discuss, she said. “People become so invested in their results that challenges feel like a personal attack.”

Several researchers made the point that results may be irreproducible not because researchers are sloppy but because only researchers with incredibly specialized skills can make the experiments work. “It’s easier to not reproduce than to reproduce,” said one scientist, adding that she tries to have a nine-month overlap between postdocs, to make sure that there is sufficient time for an experienced lab member to train a newcomer. Another said that groups trying to validate others’ work will be both less experienced and less motivated. “Who validates the validators?” asked Mina Bissell, a prominent cell biologist at Lawrence Berkeley National Laboratory in California.

During the question period, one scientist asked how to build a “culture of validation” within individual laboratories. One practical solution was electronic lab notebooks, which would let lab heads conveniently delve into complete data sets and double-check the provenance of reagents. Other suggestions were systemic: reducing the size of labs so that investigators could provide closer supervision, and having journals beef up their standards.

One scientist called for journals to be more ready to acknowledge when others bring up problems in a paper. “The self-correcting nature of science depends on understanding that there’s an argument [about results and conclusions].”

Begley described his experience doing clinical research in the 1970s: no one did randomization or blinding, he said. And a scientist could publish in a top-tier journal with only 12 patients. “Preclinical research is 50 years behind clinical research.” If clinical research could become more rigorous over time, he said, so could preclinical research.

Note: This post has been corrected to describe Science Exchange as a company.

Comments

  1. Bernard Carroll said:

    Perverse incentives… turn scientists who cut corners into rising stars… highly cited publications that cannot be reproduced… top-tier journals publishing sloppy science… journals stonewalling rather than acknowledging problems raised by peer readers? Oh, say it isn’t so!

    Actually, among the worst in these respects are the Nature journals. When Robert Rubin and I called attention to ethical boundary issues involving the once-rising star Charles Nemeroff in Nature Neuroscience in 2003, Nature Neuroscience and Mother Nature didn’t want to hear about it. They double-teamed to stonewall us for months – until we finally took it to the New York Times, whereupon they quickly changed their tune and their disclosure policy for review articles.

    Last year I critiqued a highly publicized article in Nature that was a mélange of basic science fragments somewhat relevant to a clinical disorder. The report contained unsustainable claims of internal replication and improbable diagnostic procedures. Following the directions of the journal, I first shared my concerns with the authors – their response was narcissistic rage and belligerent threat. After a protracted correspondence with authors and editors, a tepid correction appeared, and my own critique was deep-sixed by Nature. None of the original work has since been confirmed and indeed some of it has been disconfirmed.

    When quality scientific standards are disregarded in instances like these, we cannot be surprised that others are incentivized to game the system. The self-correcting nature of science does indeed require that editors be open to the blunt critiques they have earned when they publish deeply flawed reports. Today, however, branding, frivolous publicity, and image seem to be the editors’ main concerns.

  2. William Gunn said:

    Hey Monya, thanks for bringing attention to this important issue. I wish I could have been there for the session, as I hear it was exciting, but it seems like many of the in-session comments missed the point. Creating a culture of reproducibility is a great idea, but empowering PIs to investigate research coming out of their labs isn’t going to help, because the issue at hand isn’t fraud; it’s irreproducible results of whatever provenance. Outright fraud is rare, and the Reproducibility Initiative isn’t trying to detect it; it’s trying to enable researchers who are doing careful work to get a little extra recognition for their efforts, which currently go unrewarded, since it’s the highly unusual results, rather than the highly repeatable ones, that get all the publishing glory.

    I can understand why people would be worried, but here’s why I don’t think they should be. Most of the work enrolled in the initiative won’t be replicated. This is because the network of research service providers will be reproducing the experiments blindly, explicitly without the benefit of several months of coaching from the originating lab. The reason for this is that we believe that if a result is likely to translate to the clinic, it has to be robust enough that it doesn’t need highly specialized expertise to carry out the work. This means that in the bucket of things that don’t replicate in the Initiative, most are going to be interesting, valid, publishable scientific results that are just not robust enough yet to be replicated by an independent lab, owing to highly specialized techniques, rare source material or something like that. Maybe some tiny fraction of the results that don’t replicate will actually be due to fraud, but it’s not the aim of the Initiative to determine which results are merely hard to reproduce and which are fraudulent. The individual researcher is in control of the process via self-nomination, so researchers most likely won’t enroll fraudulent studies in the initiative to begin with.

    BTW, Science Exchange is a for-profit company. It’s the Initiative (of which Mendeley is also a part) which is nonprofit.
