Data-intensive science requires more laboratory automation and collaboration between different stakeholders, says Daniela Quaglia.
Guest contributor Daniela Quaglia
As computers become more powerful and new technologies become better able to harness the complexity of biological life, data-intensive research is becoming more prominent. As a result, the way in which life scientists deal with data must also change. In particular, data collection and storage must be approached differently, and collaboration becomes key, both for the initial data gathering and later for data interpretation. The sooner scientists are ready to embrace this change, the faster science will progress.
I believe that three main aspects are core to a successful transition to this new world of big science.
Automation of data collection and storage
Data-intensive research can be considered a synonym for labour-intensive research (read lab-slavery – an unappealing concept for many early-career researchers!), and thus data collection becomes the bottleneck of laboratory-based work.
Laboratory workflows need to be optimized so that more time can be spent thinking about new projects and writing grants. I believe that laboratory automation can help here.
Biological processes can be automated, for example through the use of liquid-handling robots, colony pickers, spectrophotometers fitted with injectors and flow cytometers paired with cell sorters, to name just a few.
Storage and backup of these enormous datasets is the next big challenge. The bigger the dataset, the more space is required, and the higher the cost involved. Researchers are already seeking alternatives, for example by trying to store information in the form of DNA, which is much more compact and durable than a hard drive.
Collaboration with funding bodies and among peers
Automation and futuristic data-storage solutions can be very expensive, which is why academics need to collaborate more closely with industry and government to set up modern automated facilities for state-of-the-art science.
Interdisciplinary projects are, at the moment, the optimal way of securing money from funding agencies. Whereas in the past a scientist would most likely focus on a single discipline, researchers now need to seek the help of, and collaborate with, experts in different fields.
In a synthetic biology company, for example, production and collection of a large dataset can involve a molecular biologist, an analytical chemist, a process biochemist and a couple of statisticians. Even in a small academic laboratory, the production of big datasets can involve people from very different backgrounds. For example, as a biocatalysis researcher, I frequently collaborate with computational scientists to run molecular dynamics simulations, which in turn generate large datasets.
The ability to interact with people from the most diverse disciplines (life science, mathematics, informatics, engineering) becomes fundamental, making versatility one of the most important qualities of the modern professional scientist.
Collaboration with the public
Reaching out to scientific colleagues is not the only way to collaborate, however. The lay public can also be extremely helpful to, and interested in, the work being done by the scientific community.
Many citizen-science projects exist today that enlist the help of keen members of the public. Scientists working on the Genographic Project (by National Geographic) used data collected by members of the public from all around the world to shed light on the genetic roots of humans. The project involves a team of renowned scientists and cutting-edge genetic and computational technologies, but without the help of the public it wouldn’t have happened. Many other people are involved in projects such as Zooniverse and The Flying Ant Survey, among others.
If you haven’t been involved with big data yet, you should definitely consider it. The most powerful part of collecting a big set of data is that, once you have it, you can analyze it from many different perspectives and sometimes gain new insights.
But collaboration, the ability to interact with people from different backgrounds and disciplines, will also enhance a scientist’s viewpoint on the work they do.
Data science is exciting, and it will carry us into the future. We speak so much about a globalized world; I believe that laboratory automation, together with scientific collaboration and a little help from the public, will allow data-intensive research to lead the globalization of science.
Daniela Quaglia is a winner of the 2015 Scientific Data writing competition. She is also a research scientist at Université de Montréal (Québec, Canada). Originally trained as a chemist during her undergraduate studies in Italy, she later moved her research focus to biocatalysis and synthetic biology. After completing her PhD at University College Dublin, Ireland, she moved to the UK to work in both academia (University of Manchester) and industry (London). She also has a passion for art, in particular for acting and writing, and is increasingly interested in science communication for non-specialised audiences.