Making sure others can do your experiments doesn’t just help them — it’s good for you, too.
Publishing Better Science through Better Data writing competition Jonathan Page
A core tenet of science is reproducibility: the results of one scientist must be reproducible by another, lest the findings be dismissed as a fluke or even fraudulent. In today’s data-driven realms of research, ‘reproducibility’ doesn’t simply mean publishing methods; many journals now require that datasets, and the code used to analyse them, be published too. This requirement ensures that both data and methods can be scrutinised. If other researchers can’t reach the same results, the study will need to be treated with caution. By working this way, scientists avoid damaging their reputations by publishing flawed studies, and journals avoid publishing bad science. It’s a win-win situation.
So why don’t scientists always work reproducibly?
The problem is that working reproducibly often requires an investment of time, something many scientists working in competitive fields claim they can’t afford. Florian Markowetz from the University of Cambridge counters these claims by urging researchers “not to ask what you can do for reproducibility, but to ask what can reproducibility do for you!”
In his keynote presentation at the Publishing Better Science through Better Data conference held at the Wellcome Collection, London, this October, Markowetz explained that we can all benefit from working reproducibly. The most obvious reason is that researchers must be able to explain how they reached their conclusions. As Markowetz puts it, “a project is more than a beautiful result. If you can’t explain how you got [from the data to the final conclusion], then a miracle has occurred.” A good scientist must avoid miracles in favour of credibility.
It’s not just scientists from other institutions who need to understand your methods; people within your lab will appreciate it too. If you leave the university or lab, or move on to another project, how do you ensure someone else is able to take on your work? “What do you do if you haven’t documented your analysis?” says Markowetz. If you don’t comment your code and flag your folders, it’s going to slow the research while someone delves into the depths of your old hard drive. The same could happen to you if you come back to a project and find you’ve forgotten where you put that final analysis. Keeping track of what you’ve done will help to make sure your work reaches its full potential.
If none of that sounds convincing, surely the draw of CV enhancement will. Documenting the process of your project and putting it online can allow others to see exactly what you’ve done: your thought processes, your work and your achievements. If you’ve written code that can be used to analyse other datasets, why not put it online so other people can use it? This might seem selfless, but that doesn’t mean you can’t benefit from it. Markowetz and his lab do this frequently: “It’s part of my CV, part of the lab’s research output.” It’s hard not to be impressed by a tool used by hundreds of other researchers, and you’ll reap the rewards if it has your name on it.
Everyone, regardless of their career, should aim to work reproducibly. “Even reporters’ work should be reproducible,” says Markowetz. “If you report on a story from somewhere, it’d better be true! If others go to the same place and can’t reproduce what you found, that’s not going to look good for you.”
If you can’t explain how you did something, even the most impressive claims on your CV are going to be ignored. There are few things more uncomfortable than floundering your way through an explanation whilst superiors or potential employers listen with increasing scepticism. We should all work reproducibly whatever our field, not just for the sake of others, but for the sake of our careers. Miracles don’t happen, and as long as you work reproducibly you won’t need to cite divine intervention on your CV.
Jonny Page is a PhD candidate at the University of Oxford, where he’s studying the biomechanics of insect flight. Outside of this, he has interests in data science, programming and science writing.
You can access all the slides and videos from Publishing Better Science through Better Data 2016, as well as the great visual summary of the day, on the event website.