There has always been an element of risk in science, which is why data must be reproducible, explains Ellen Phiddian.
On June 6, 2012, I skipped class to watch the transit of Venus. I was studying in Adelaide, Australia, where the transit lasted from early morning until mid-afternoon and we had a wonderfully sunny day to view it. If I had known a bit more about the history of the transit, I might have been more thankful for that.
In the 1760s, astronomers made long and convoluted journeys across the globe just to observe Venus crossing the Sun. Scientists at the time wanted the transit recorded from as many continents as possible, so they could use the data to calculate the distance between the Earth and the Sun. It took years of effort and huge sums of money to orchestrate such a viewing.
A French astronomer, Guillaume Le Gentil, has a particularly poignant story. He travelled to Puducherry (then Pondicherry) in India to watch the 1761 transit, but was unable to land — while he was sailing, the English had captured the city from the French — and so found himself stranded at sea and unable to measure the transit. Instead of going home, he spent the next eight years in the region, painstakingly preparing for the next transit. He spent time in Mauritius, Manila and Madagascar, working on maps and astronomical observations. Eventually, the French regained Puducherry and Le Gentil made his way there to watch the 1769 event. But he never got to see it. On the day of the transit, Puducherry was cloudy and overcast.
Imagine spending a decade overseas, only to have your work wasted at the last minute by a few clouds. The next transit wasn’t going to happen for more than a century — Le Gentil had missed his chance. He was shattered, and returned to France to find himself declared dead and his estate divided among his heirs.
It was a setback for him, to say the least, but it wasn’t a huge setback for science. Elsewhere, the day of the transit was a beautiful one, and other astronomers had dutifully recorded it. By the 1770s, we had an accurate estimate for the distance between the Earth and the Sun.
This is why we have to repeat our experiments. We cannot count on one dataset to tell us what we need to know.
Science is the study of the natural world, and nature can be very, very fickle. Experiments can go wrong. There are things we can’t foresee, and things we can foresee but can’t prevent. Rarely do we meet with a tragedy of Le Gentil’s proportions, but random chance could ruin, or save, anyone’s dataset.
Reproducing data is not glamorous. It’s mostly pretty dull. And it is not always an easy ask to put your methods out into the world so others can replicate them. But it’s critical that we be able to re-test our ideas. If we want any degree of accuracy or certainty, we need batches of results, not just the assertion of one research group.
So much has changed since the 1769 transit. If it had been overcast where I was in 2012, I could have driven for a few hours to find somewhere sunny. Or I could have just watched the transit live online. The culture of science, too, has changed beyond recognition — the 2012 transit was watched by millions of people. Science has become a global institution, and that only makes it more important that our results are reproducible.
We are obliged to be accurate because most of the world has a stake in our results. But we don’t know what our results will look like until we get them. Any single experiment could be a fluke or a failure — it is only with collaboration that we can start making judgements about the world.
Re-test others’ experiments and allow yours to be replicated. It will come in handy, one day.
You never know when things might get cloudy.
Ellen Phiddian has a BSc in chemistry and science communication. She is now finishing her Honours thesis at the Australian Centre for the Public Awareness of Science. In her spare time she enjoys hiking, fencing and reading a lot of science history books.
This piece was selected as one of the winning entries for the Publishing Better Science through Better Data writing competition. Publishing Better Science through Better Data is a free, full-day conference focussing on how early-career researchers can best utilise and manage research data. The conference will run on October 26th at the Wellcome Collection Building, London.