TechBlog: PacBios are hackable, too

{credit}Pacific Biosciences Inc.{/credit}

Sometimes, a DNA sequencer is more than it seems. In this month’s Technology Feature, I talk to the researchers who have figured out ways to squeeze new life from an outdated DNA sequencer, the Illumina GAIIx. That’s a popular choice for sequencer-hackers, but not the only one. Stanford structural biologist Joseph Puglisi uses a PacBio RSII from Pacific Biosciences to plumb the biochemistry of protein translation.

The RSII was designed as a single-molecule DNA sequencer, in which powerful cameras capture the flashes of light that result when a DNA polymerase molecule tethered to the base of a microscopic well inserts a fluorescently labeled base into newly synthesized DNA. But according to Jonas Korlach, the company’s chief scientific officer, that’s just one of its applications. “Yes, it’s a sequencer, but at the same time it’s also the world’s most powerful single-molecule microscope.”

All that’s required to make that microscope record something other than DNA synthesis, fundamentally, is for researchers to replace the tethered DNA polymerase with another enzyme, and to add the appropriate fluorescent reagents. To alter the running conditions, researchers also need PacBio to ‘open’ its system software to afford them greater control — for instance, to adjust experimental temperature, imaging conditions, and fluid addition. According to Korlach, just four instruments worldwide have been tweaked in this way. (As with the Illumina hardware discussed in the Technology Feature, such hacks only work on PacBio’s older RSII; the newer Sequel is not hackable, Korlach says.)

The company offers these researchers what support it can, but because they are pursuing home-brew applications, Korlach says, researchers who run into technical issues must solve them in-house. “They are mostly on their own.”

Researchers have used these modified systems to address the biophysics of cell-cell interaction, transcription, splicing, and in Puglisi’s case, translation. Puglisi’s is a structural biology lab, and structural methods tend to provide static pictures. But biology is dynamic. So, his team typically pairs the methods up. “We always like to couple structural investigations with some way to animate the structure and bring it to life,” Puglisi says. Since 2014, the lab has published some 25 studies using the RSII to study the ribosome.

In one recent study, for instance, Puglisi’s team studied the impact of modifying one particular carbon atom in the backbone of RNA. That modification, they found, causes the ribosome to pause, possibly in order to allow ancillary biological processes, such as protein folding or protein processing, to occur.

“The biology of the system really still needs to be worked out, but the dynamic behavior and structural signatures that we saw were so striking that … there has to be some neat biology here,” Puglisi says.

Korlach, who worked with Puglisi on some of his earliest efforts on the RSII, says the team, together with Puglisi's postdoc Sotaro Uemura (now at the University of Tokyo), worked out these methods on nights and weekends, when the laboratory was otherwise unoccupied. And he recalls the excitement of getting the system to work that first time.

“It was pretty thrilling when we saw the first traces of real-time dynamics of ribosome translation,” he says. “That was the first time any human had ever seen a ribosome make a protein in real time on a single-molecule level, with codon resolution. Those are the types of milestones that as a method developer you live for.”

 

Jeffrey M. Perkel is Technology Editor, Nature

 


TechBlog: Git: The reproducibility tool scientists love to hate

{credit}PLOS Comput Biol, 12, e1004668 (2016){/credit}

Early in his graduate career, John Blischak found himself creating figures for his advisor’s grant application.

Blischak was using the programming language R to generate the figures, and as he iterated and optimized his code, he ran into a familiar problem: determined not to lose his work, he gave each new version a different filename — analysis_1, analysis_2, and so on — but failed to document how the versions had evolved.

“I had no idea what had changed between them,” says Blischak, who now is a postdoctoral scholar at the University of Chicago. “If the professor were to come back and say, ‘which version did you use to create this figure?’ I would have had no idea.”

Later, while attending a workshop on basic research computing skills, he discovered a better approach: Git.
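As a hedged sketch (the file name, commit messages, and scratch directory are hypothetical, not from Blischak's actual project), here is the workflow Git substitutes for filename versioning: one file edited in place, with every version recorded alongside a message saying what changed.

```shell
# Start fresh in a scratch directory so the demo is self-contained.
rm -rf /tmp/git-demo
mkdir -p /tmp/git-demo && cd /tmp/git-demo
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# One file, edited in place: no analysis_1.R, analysis_2.R copies.
echo 'plot(x, y)' > analysis.R
git add analysis.R
git commit -q -m "Initial figure code"

echo 'plot(x, y, col = "red")' > analysis.R
git commit -q -am "Colour the points red"

# Every version is retrievable, each with a message explaining the change:
git log --oneline
```

Answering "which version did you use to create this figure?" then becomes a matter of reading the log and checking out the matching commit.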

Continue reading

Lattice light-sheet microscopy gets an AO upgrade

AO-LLSM microscope photo

{credit}Betzig Lab, Janelia Research Campus{/credit}

In late 2014, just a month after learning he had won that year’s Nobel Prize in Chemistry for superresolution microscopy, Eric Betzig and colleagues described a technique that has taken the microscopy world by storm.

Continue reading

Put your email inbox on a low-spam diet

Mark Clemons has published more than 250 papers over the past two-plus decades, nearly all of them involving breast cancer. So imagine his surprise when Clemons, a medical oncologist at the University of Ottawa, Canada, received a flattering email inviting him to submit his work to, of all places, a journal focusing on yoga research.

Continue reading

TechBlog: Software quality tests yield best practices


{credit}Alexandros Stamatakis/GitHub{/credit}

Life science research increasingly runs on software. A good fraction of it, perhaps even most, is written by academics, for academics: rough around the edges, maybe, but effective — not to mention free. But is it high quality?

Alexandros Stamatakis decided to find out.

Stamatakis is a computer scientist and bioinformatician at HITS, the Heidelberg Institute for Theoretical Studies in Germany, and a professor of computer science at the Karlsruhe Institute of Technology. His team has been developing and refining software tools for evolutionary biology for more than 15 years, he says, including one called RAxML (from which the code snippet shown above was pulled). Yet for all that time, his code still wasn't perfect.

“The more I developed it the more bugs I had to fix and the more I started worrying about software quality,” he says.

Not software ‘accuracy’, mind you — when it comes to phylogenetics, it’s difficult to know whether software is providing the correct answer. “You don’t know the ground-truth,” Stamatakis says. Rather, he was curious whether popular tools meet computer-science standards for quality.

To find out, Stamatakis and his team downloaded the code for 16 popular phylogenetic tools (plus, as a control, one from the field of astronomy), which collectively have been cited more than 90,000 times. They then ran those codes — 15 of which were written in C/C++ and the last in Java — through a series of tests.

For instance, they looked at how well software can scale from a desktop computer to a large cluster, something that increasingly is necessary as life science datasets balloon in size. They measured the amount of duplicated code in the software to get a rough indication of maintainability. And they counted the number of so-called ‘assertions’ — logical statements in the code that assert, for instance, that a value falls within a certain range, and that cause the software to terminate should they fail — to obtain a measure of code ‘correctness’.

“There have been empirical studies by computer scientists working in the field of software engineering, where they showed that there is a correlation between incorrect code, or code defects, and the number of assertions used — or let’s better say, an anti-correlation,” Stamatakis says.

So, how did the toolset do? Not too well.

As documented in an article published 29 January in Molecular Biology and Evolution, none of the 16 programs in the round-up, including Stamatakis’ own RAxML, aced all the tests. (With 57,233 lines of code, RAxML exhibited both compiler warnings and memory leaks.) But, he stresses, that is neither to denigrate the programmers who wrote those tools — who, after all, were simply trying (and generally succeeding) to solve a particular problem — nor to suggest they do not work properly.

Rather, he says, potential users must exercise caution in using these tools. “They shouldn’t blithely trust software. And they shouldn’t view it as black boxes,” but instead (as he puts it in his article) as “potential Pandora’s boxes”.

Users should also strive to understand what their code is doing, Stamatakis advises. And if unexpected results arise, they should reproduce them with a separate tool that performs the same task, to ensure they aren't chasing digital phantoms.

Stamatakis concludes his article with a series of ‘best practices’ for software developers. These include running tests for memory allocation errors and leaks, using assertions, checking for code compilation warnings using multiple compilers, and minimizing code complexity and duplication — practices that are common in professional software development but less so in the life sciences.

The tools Stamatakis' team used to run its tests are freely available, so readers can try them out to see how trustworthy their chosen software is.

Journal editors, he says, should consider requiring such tests of any peer-reviewed work, either performed by the authors themselves prior to submission, or by the peer-reviewers. In fact, during our conversation, Stamatakis suggested he might make the toolbox available as a Python script or Docker container, to make it easier for others to adopt. If and when he does, we’ll let you know. In the meantime, caveat emptor!

 


 


TechBlog: ‘Manubot’ powers a crowdsourced ‘deep-learning’ review


{credit}Alfred Pasieka/SPL/Getty{/credit}

In Nature‘s February technology feature on ‘deep learning‘, a kind of artificial intelligence whose usage is spiking in life science research, author Sarah Webb points readers to a ‘comprehensive, crowd-sourced’ review of the field.

Available as a preprint on bioRxiv (ETA: and now online in the Journal of the Royal Society Interface), the review is indeed comprehensive: the PDF runs to 123 pages and 552 references, and has been downloaded nearly 27,500 times since May 2017. But it was an intriguing footnote on the article's title page that really piqued my interest: “Author order was determined with a randomized algorithm”.

Continue reading

TechBlog: eLife replaces commenting system with Hypothesis annotations


{credit}eLife/Hypothesis{/credit}

The next time you feel moved to comment on an article in the open-access online journal eLife, be prepared for a different user experience.

On 31 January, eLife announced it had adopted the open-source annotation service Hypothesis, replacing its traditional commenting system. The switch is the result of a year-long collaboration between the two organizations to make Hypothesis more amenable to the scholarly publishing community.

Continue reading

TechBlog: Interactive figures, a mea culpa


{credit}The Project Twins{/credit}

For the 1 February issue of Nature magazine, I wrote a Toolbox article on interactive figures. Unlike static PDFs or JPEGs, these figures allow users to explore the underlying data and code used to create them, for instance to zoom in on a crowded region of interest, or to probe the robustness of a computational model.

It’s an exceptionally broad and growing field of tech development, and my article name-checks more than a dozen tools. Inevitably, omissions were made, one of which was pointed out within hours of the article going live.

Continue reading

TechBlog: ‘Carbon rainbow’ enables highly multiplexed microscopy

Fluorescence microscopy has transformed the life sciences. By attaching fluorescent dyes or proteins to cellular structures, researchers can image fine cellular morphology; track molecular localization, motion, and dynamics; and more. But fluorescence microscopy also presents significant obstacles. One of those is multiplexing.

Continue reading