News blog

BioTorrents aims to open data sharing floodgates

A new website that allows people to share scientific data using the principles that underlie hugely successful peer-to-peer systems has been unveiled by University of California, Davis scientists, who present the tool this week in the open access journal PLoS ONE.

Using BitTorrent file-sharing technology, BioTorrents transfers files rapidly by sharing bandwidth among many users across multiple institutions.

“Before BioTorrents, there wasn’t a really good way for any researcher to share and easily distribute their large dataset or results instantly,” says lead author of the new paper, biologist Morgan Langille. As datasets get larger and demand for open access increases, current methods that don’t scale well for large files cause long transfer waits.

For example, in a single month, users of the National Center for Biotechnology Information download datasets from the 1000 Genomes Project (8,981 GB) around 100,000 times.


Co-author and biologist Jonathan Eisen’s lab sequences genomes of microorganisms and gets constant requests for raw and processed data. “It would be more convenient for us and probably for other people if we made that available through a site like this,” he says.

BioTorrents works like Napster, from the turn of the millennium, and like current peer-to-peer systems, which have sometimes attracted the ire of music and film industry associations because they can be used to exchange copyright-infringing files.

“Certainly people use convenient file sharing systems to share things they’re not supposed to share, like large pirated movies or mp3 files,” says Eisen. The site will be semi-curated, though “there’s no doubt that this tool could be used to do some nefarious deed, like any other tool”.

The BitTorrent protocol splits data from large files into small pieces (as little as 514 Kb), allowing transfer of datasets between computers containing full or partial copies. Built-in error checking means users receive an exact copy of the original. Bandwidth is shared among all computers in the transaction, instead of having a single source provide all the required bandwidth, which makes exchange easier than through large repositories or personal and institutional servers.
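To make the piece-and-check idea concrete, here is a minimal Python sketch. It is not the BioTorrents code or a real BitTorrent client: the function names are hypothetical, and the 256 KiB piece size is an assumption chosen for illustration rather than the figure quoted above. It relies only on the fact that the original BitTorrent protocol publishes a SHA-1 hash for every piece, so a downloader can check each piece no matter which peer supplied it.

```python
from __future__ import annotations

import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece size in bytes; each torrent sets its own


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list[bytes]) -> list[bytes]:
    """One SHA-1 digest per piece, as listed in a torrent's metadata."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def reassemble(received: dict[int, bytes], hashes: list[bytes]) -> bytes:
    """Rebuild the file from pieces that arrived in any order, from any peer."""
    ordered = []
    for index, expected in enumerate(hashes):
        piece = received[index]
        # Reject any piece whose hash does not match the published one.
        if hashlib.sha1(piece).digest() != expected:
            raise ValueError(f"piece {index} failed its hash check")
        ordered.append(piece)
    return b"".join(ordered)
```

In this sketch a corrupted or tampered piece raises an error instead of ending up in the output, which is what lets a downloader trust data assembled from many partial copies.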

But for a torrent system to work, lots of people need to participate. Its speed and effectiveness depend on the number of peers, especially those with complete copies, who can act as seeds. The total available bandwidth grows as the number of transfers increases, scaling indefinitely.
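A rough way to see why more participants help is the standard textbook lower bound on distribution time, sketched below in Python. This is an idealized model, not a measurement of BioTorrents: the function names are hypothetical, every peer is assumed to have the same upload and download rate, and sizes and rates can be in any consistent units (for instance GB and GB per hour).

```python
def client_server_time(file_size, n_downloaders, server_up):
    """A single source must upload one full copy per downloader."""
    return n_downloaders * file_size / server_up


def swarm_time(file_size, n_downloaders, seed_up, peer_up, peer_down):
    """Idealized peer-to-peer bound: the slowest of three bottlenecks."""
    return max(
        file_size / seed_up,    # the seed must send each byte at least once
        file_size / peer_down,  # no peer can download faster than its link
        # every copy must be uploaded by someone: the seed plus all peers
        n_downloaders * file_size / (seed_up + n_downloaders * peer_up),
    )
```

Under these assumptions the single-source time grows in proportion to the number of downloaders, while the swarm time levels off, because each new peer adds upload capacity as well as demand.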

The end result: faster transfer times, lower bandwidth requirements for any single supplier, and decentralization of data. “So why not have a BitTorrent page dedicated to biological datasets?” says Eisen.

BioTorrents doesn’t require an official submission process or accompanying manuscripts, and it could expedite science for time-sensitive events (such as H1N1 or SARS outbreaks) and between large international groups of collaborators. “Anyone can post their most recent data, software, or results immediately,” Langille says.

“Someone could download all the Nature papers and post them there, but we’re not encouraging that,” Eisen jokes. All PLoS papers are already on BioTorrents.

Comments

    Trevor said:

    That’s a great idea. I’m surprised that more academic-focused BitTorrent sites haven’t popped up already. BioTorrents seems to be really focused, but the fact of the matter is that sharing big datasets is a problem perfectly suited to BitTorrent.

    Hilary said:

    This is a great idea (and, one might say, long overdue). However, I wish the article or this post had briefly discussed some of the current political implications of using BitTorrent specifically for sharing scientific data. BitTorrent was the focus of the recent debates over net neutrality, as Comcast and Verizon have both threatened to throttle BitTorrent connections. How would throttling by commercial network providers affect the sharing of scientific data? Should the possibility of using BitTorrent for sharing scientific data be accounted for in the debates on net neutrality?

    cheap computer said:

    I think this is a unique and great idea: a separate tool, based on BitTorrent, is being introduced to share scientific data. This site will be concise and focused on the subject.

    Cheap Computers said:

    BioTorrents can act as a central listing of results, datasets, and software that can be browsed and searched. All data is open access, and illegal file sharing is not allowed on BioTorrents.

    Anonymous said:

    Yup. Couldn’t agree more really. I’ve spent far too much of my time lately finding data, extracting it from pdfs and re-formatting it, rather than doing actual science.

    dell gx280 said:

    Indeed a very good read! Very informative post with pretty good insight on all aspects of the topic! Will keep visiting in future too!

Comments are closed.