Announcing the #scidata16 draft programme and call for lightning talks

We have a galaxy of open research stars giving talks at this year’s edition of Publishing Better Science through Better Data (#scidata16). And if you have a great example of research data sharing or reuse, you could be joining them.

Tickets for the last two events, in 2014 and 2015, went within 24 hours of their announcement, and the events gained wide attention online. This year Springer Nature have partnered with the Wellcome Trust to provide a bigger and better event exploring issues in research data, focused on the needs of early career researchers. The day will include advice on publishing and advancing careers, as well as good practice for data management and presentation. It will also feature tools and resources available to researchers to help them, and society, derive maximum benefit from research data. Prior knowledge of open science, open data and open access is not needed to attend – the event is for anyone interested in carrying out and publishing better research.

Enabling the effective sharing of clinical data

This blog was written by Mathias Astell & Iain Hrynaszkiewicz and was originally published on the DNAdigest blog.

The value to science of sharing data generated by researchers has long been understood (as exemplified by this British Medical Journal piece from 1994). And over recent years there has been a rapid increase in the ability to share and access research data – as can be seen in the rise of data journals (such as Scientific Data and GigaScience), the increase in research data repositories (both general and subject-specific), and the establishment of data sharing policies around the world.

Author’s Corner: Sharing proteomics data to build community-based resources

{credit}Ruedi Aebersold & George Rosenberger{/credit}

Guest post by Ruedi Aebersold, Professor of Systems Biology with a joint appointment at ETH Zurich and the University of Zurich, & George Rosenberger, PhD student in the Aebersold group at the Institute of Molecular Systems Biology, ETH Zurich.

Mass spectrometry-based proteomics is a data-intensive research discipline that primarily aims at identifying and quantifying the proteins that constitute the proteome [1]. This is achieved by generating large numbers (10^4 to 10^6) of fragment ion spectra that represent peptides generated by proteolysis of the respective proteome. Mass spectrometers can operate in different data acquisition modes, referred to as data-dependent acquisition (DDA), targeted acquisition exemplified by selected reaction monitoring (SRM), or data-independent acquisition (DIA) [2] exemplified by SWATH-MS [3,4]. Specific software tools then generate processed mass spectra from these raw data, from which sets of identified peptides, proteins and their abundances are inferred and annotated with metadata. Both the generation and the processing of such raw data sets are resource- and time-intensive. Further, if unique, irreplaceable samples are being analyzed, as is often the case with clinical cohorts, the data cannot be regenerated. Therefore, the proteomics community has started to embrace data sharing by means of different specialized public repositories, for example GPMDB [5], PRIDE [6], PeptideAtlas [7] or ProteomicsDB [8]. For the last few years, the ProteomeXchange [9] consortium has provided centralized deposition of raw data and their meta-annotation.

Author’s Corner: Advancing the sharing and standardization of metabolomics data

{credit}Mark Viant{/credit}

Guest post by Mark Viant, Professor of Metabolomics in the School of Biosciences at the University of Birmingham, UK, and Director of both the national NERC Biomolecular Analysis Facility – Metabolomics and the Phenome Centre Birmingham.

In 2014, my research team published the first Scientific Data Data Descriptor for metabolomics measurements, Direct infusion mass spectrometry metabolomics dataset: a benchmark for data processing and quality control. This article described in great detail the many steps that are critical for ensuring the production of high quality (direct infusion) mass spectrometry (DIMS) data. It was our intention that this publication would help to establish the benchmark for DIMS metabolomics, derived using best-practice workflows and rigorous quality assessment. The data was also made freely available in the MetaboLights public database for metabolomics data (dataset MTBLS79) [1].

Data Reuse: An Interview with Daniele Marinazzo

Daniele Marinazzo with data from Gorgolewski et al. paper {credit}Daniele Marinazzo{/credit}

Daniele Marinazzo is an associate professor at the University of Ghent, in the Department of Data Analysis of the Faculty of Psychology and Pedagogical Sciences.

He and his research group focus on methodological and computational aspects of neuroscience research. In particular, they develop, implement and validate methods rooted in statistical physics for the study of brain connectivity and activity – investigating how information is stored and transferred in complex networks and how these results are then translated to the brain. They usually validate their methods on publicly available data, and they always share the code.

Daniele became aware of a dataset that could benefit his research when its author posted about it on Twitter. He was then able to put the data to effective use in his research after the author published a Data Descriptor in Scientific Data. We caught up with Daniele to find out about his experiences of finding, sharing and using data.

Bloggers And Neuro-Tweeps Engaging Recreationally – or BANTER!

Guest post by Helena Ledmyr, INCF

In 2010, BANTER started with a “tweet-up” (which is exactly what it sounds like: a meet-up of people who know each other from Twitter) at a bar in San Diego during the annual meeting of the Society for Neuroscience (SfN). Natural-born entertainer @doc_becca decided to make this a standard feature of the unofficial SfN program, and ever since, BANTER has been an increasing success with neuroscience “tweeps” and bloggers. I ran some @symplur analytics for #sfn15 (the official Twitter hashtag for the main conference) to get this great dataviz on tweeps using the tag from Sept 16–Nov 16. There are more than 40 people among the top #sfn15 tweeps who also attended BANTER, which is a nice correlation. Are you in the picture?