With big data come big problems and big responsibilities. Digital Science recently ran the #datadramas hashtag on Twitter, asking scientists about their own data dramas. The responses, together with the Naturejobs poll, show something worrying: many scientists still use laptops and USB sticks to store data long term. To talk about this, we spoke to Susanna-Assunta Sansone, associate director of the University of Oxford e-Research Centre and a data consultant and honorary academic editor for Scientific Data, a new open access data publication from Nature Publishing Group that launched this week.
The first question to tackle is: what is big data? Sansone says people often mean only size and volume, but from her point of view big data is also about “variety and complexity. So, data is multidimensional. You have video, audio files, text files, you have physical specimens which you have recorded information about.”
Sansone is a biologist by training and now works with life sciences data, of which there is an enormous amount, especially in genomics. How scientists manage all this data depends on the data type, says Sansone. “There are different tools for different data types. And there are different enablers like terminology or format, which work for different data types.” For a newcomer to the life sciences, this can be very confusing. Some general-purpose tools are available, and the most widely used is Microsoft Excel. “It’s better than nothing, but there are better tools nowadays.”









