African astronomy and how one student broke into the field

Posted on 28 Feb 2018 by Jack Leeming

Africa is investing in a future of astronomy research, but students need access to inspirational lecturers, says Gina Maffey.

Mutie at the Ghana Radio Astronomy Observatory (GRAO) at Kuntunse, Ghana

Isaac Mumo Mutie

What do you do when the degree you want to study is not offered by your university?

You study it anyway.

“I did a lot of personal research online, looking for answers,” says Isaac Mumo Mutie, an astronomy student who studied at the Technical University of Kenya. While Mutie was studying for a Bachelor of Technology in Technical and Applied Physics, Professor Paul Baki introduced him to astronomy, and Mutie would consult with him in his spare time.

“He would ask me ‘why are you interested? This is not part of the curriculum.’ But I insisted.”

TechBlog: Software quality tests yield best practices

Posted on 23 Feb 2018 by Jeffrey Perkel

{credit}Alexandros Stamatakis/GitHub{/credit}

Life science research increasingly runs on software. A good fraction, perhaps even most of it, is made by academics, for academics: rough around the edges, perhaps, but effective, not to mention free. But is it of high quality?

Alexandros Stamatakis decided to find out.

Stamatakis is a computer scientist and bioinformatician at HITS, the Heidelberg Institute for Theoretical Studies in Germany, and a professor of computer science at the Karlsruhe Institute of Technology. His team has been developing and refining software tools for evolutionary biology for more than 15 years, he says, including one called RAxML (from which the code snippet shown above was pulled). Yet for all that time, he says, his code still wasn’t perfect.

“The more I developed it the more bugs I had to fix and the more I started worrying about software quality,” he says.

Not software ‘accuracy’, mind you — when it comes to phylogenetics, it’s difficult to know whether software is providing the correct answer. “You don’t know the ground-truth,” Stamatakis says. Rather, he was curious whether popular tools meet computer-science standards for quality.

To find out, Stamatakis and his team downloaded the code for 16 popular phylogenetic tools (plus, as a control, one from the field of astronomy), which collectively have been cited more than 90,000 times. They then ran those codes — 15 of which were written in C/C++ and the last in Java — through a series of tests.

For instance, they looked at how well software can scale from a desktop computer to a large cluster, something that increasingly is necessary as life science datasets balloon in size. They measured the amount of duplicated code in the software to get a rough indication of maintainability. And they counted the number of so-called ‘assertions’ — logical statements in the code that assert, for instance, that a value falls within a certain range, and that cause the software to terminate should they fail — to obtain a measure of code ‘correctness’.
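For readers who have not met assertions before, here is a minimal sketch in Python. The tools in the study were written in C/C++ and Java, so this is only an illustration of the idea and not code from any of them; the function name `branch_likelihood` and its inputs are made up for the example.

```python
def branch_likelihood(probabilities):
    """Multiply per-site probabilities into a single likelihood value.

    Hypothetical helper for illustration; not taken from any tool in the study.
    """
    likelihood = 1.0
    for p in probabilities:
        # Assertion: every input must be a valid probability.
        assert 0.0 <= p <= 1.0, f"probability out of range: {p}"
        likelihood *= p
    # Assertion: a product of probabilities is itself a probability.
    assert 0.0 <= likelihood <= 1.0
    return likelihood


print(branch_likelihood([0.5, 0.5]))
```

If an assertion fails, Python raises an `AssertionError`, which stops the program unless something catches it. Note that Python skips assertions entirely when run with the `-O` flag, so they are a development safety net and not a substitute for input validation.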

“There have been empirical studies by computer scientists working in the field of software engineering, where they showed that there is a correlation between incorrect code, or code defects, and the number of assertions used — or let’s better say, an anti-correlation,” Stamatakis says.
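Counting assertions is easy to approximate for your own scripts. The sketch below uses Python's standard `ast` module to count `assert` statements per 100 non-blank lines of Python source. It is an assumption-laden stand-in: the study examined C/C++ and Java code with its own tooling, and the function name `assertion_density` and the sample code are invented here.

```python
import ast


def assertion_density(source):
    """Return (assert statements, non-blank lines, asserts per 100 lines)."""
    tree = ast.parse(source)
    # Count every `assert` statement anywhere in the parsed code.
    n_asserts = sum(isinstance(node, ast.Assert) for node in ast.walk(tree))
    # Count non-blank lines as a rough measure of code size.
    n_lines = sum(1 for line in source.splitlines() if line.strip())
    density = 100.0 * n_asserts / n_lines if n_lines else 0.0
    return n_asserts, n_lines, density


# Hypothetical sample code to analyse.
example = '''
def mean(values):
    assert len(values) > 0
    total = sum(values)
    assert total == total  # rejects NaN
    return total / len(values)
'''

print(assertion_density(example))
```

A higher density does not prove code is correct; it is only a rough proxy of the kind Stamatakis describes.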

So, how did the toolset do? Not too well.

As documented in an article published 29 January in Molecular Biology and Evolution, none of the 16 programs in the round-up, including Stamatakis’ own RAxML, aced all the tests. (With 57,233 lines of code, RAxML exhibited both compiler warnings and memory leaks.) But, he stresses, that is neither to denigrate the programmers who wrote those tools — who, after all, were simply trying (and generally succeeding) to solve a particular problem — nor to suggest they do not work properly.

Rather, he says, potential users must exercise caution in using these tools. “They shouldn’t blithely trust software. And they shouldn’t view it as black boxes,” but instead (as he puts it in his article) as “potential Pandora’s boxes”.

Users should also strive to understand what their code is doing, Stamatakis advises. And if unexpected results arise, they should repeat the analysis using a separate tool that performs the same task, to ensure they aren’t chasing digital phantoms.

Stamatakis concludes his article with a series of ‘best practices’ for software developers. These include running tests for memory allocation errors and leaks, using assertions, checking for code compilation warnings using multiple compilers, and minimizing code complexity and duplication — practices that are common in professional software development but less so in the life sciences.
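Of those practices, code duplication is one you can estimate crudely yourself. The sketch below reports the fraction of non-blank lines that sit inside a block of consecutive lines appearing more than once. It is a simplification and not the method used in the article: the function name `duplicated_fraction` and the three-line window are assumptions, and dedicated clone detectors are considerably more sophisticated.

```python
from collections import defaultdict


def duplicated_fraction(source, window=3):
    """Fraction of non-blank lines that fall inside a repeated block.

    A block is `window` consecutive non-blank lines, compared after
    stripping leading and trailing whitespace.
    """
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    # Map each block of `window` lines to the positions where it starts.
    positions = defaultdict(list)
    for start in range(len(lines) - window + 1):
        positions[tuple(lines[start:start + window])].append(start)
    # Mark every line covered by a block that occurs more than once.
    duplicated = set()
    for starts in positions.values():
        if len(starts) > 1:
            for start in starts:
                duplicated.update(range(start, start + window))
    return len(duplicated) / len(lines) if lines else 0.0


# Lines a, b, c appear twice, so six of the seven lines count as duplicated.
print(duplicated_fraction("a\nb\nc\nd\na\nb\nc"))
```

A high fraction suggests that shared logic could be pulled into one function, which is the maintainability concern the study was probing.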

The tools Stamatakis’ team used to run its tests are freely available, so readers can try them themselves to see how trustworthy their chosen software is.

Journal editors, he says, should consider requiring such tests of any peer-reviewed work, either performed by the authors themselves prior to submission, or by the peer-reviewers. In fact, during our conversation, Stamatakis suggested he might make the toolbox available as a Python script or Docker container, to make it easier for others to adopt. If and when he does, we’ll let you know. In the meantime, caveat emptor!

Jeffrey Perkel is Technology Editor, Nature

The extra bits of a grant application: a cheat sheet

Posted on 21 Feb 2018 by Rebecca Wild

Generic tips for your approach to a grant application

By Kate Christian

TechBlog: ‘Manubot’ powers a crowdsourced ‘deep-learning’ review

Posted on 20 Feb 2018 by Jeffrey Perkel

{credit}Alfred Pasieka/SPL/Getty{/credit}

In Nature’s February technology feature on ‘deep learning’, a kind of artificial intelligence whose usage is spiking in life science research, author Sarah Webb points readers to a ‘comprehensive, crowd-sourced’ review of the field.

Available as a preprint on bioRxiv (ETA: and now online in the Journal of the Royal Society Interface), the review is indeed comprehensive: the PDF runs to 123 pages and 552 references, and has been downloaded nearly 27,500 times since May 2017. But it was an intriguing footnote on the article’s title page that really piqued my interest: “Author order was determined with a randomized algorithm”.

Como Se Dice “DNA”?

Posted on 14 Feb 2018 by Rebecca Wild

Good science communication is a key force in driving change, but finding common ground can be a challenge – especially across linguistic barriers, says Jacqueline Gamboa.

The long and winding road for training scientists to engage the general public

Posted on 12 Feb 2018 by Rebecca Wild

Efforts to encourage better public outreach are admirable, but better communication between scientists must come first, says David Rubenson.

{credit}iStockphoto/Thinkstock{/credit}

TechBlog: eLife replaces commenting system with Hypothesis annotations

Posted on 12 Feb 2018 by Jeffrey Perkel

{credit}eLife/Hypothesis{/credit}

The next time you feel moved to comment on an article in the open-access online journal eLife, be prepared for a different user experience.

On 31 January, eLife announced it had adopted the open-source annotation service, Hypothesis, replacing its traditional commenting system. That’s the result of a year-long effort between the two services to make Hypothesis more amenable to the scholarly publishing community.

TechBlog: Interactive figures, a mea culpa

Posted on 07 Feb 2018 by Jeffrey Perkel

{credit}The Project Twins{/credit}

For the 1 February issue of Nature magazine, I wrote a Toolbox article on interactive figures. Unlike static PDFs or JPEGs, these figures allow users to explore the underlying data and code used to create them, for instance to zoom in on a crowded region of interest, or to probe the robustness of a computational model.

It’s an exceptionally broad and growing field of tech development, and my article name-checks more than a dozen tools. Inevitably, omissions were made, one of which was pointed out within hours of the article going live.

Dummy no more: When to accept you’re no longer a beginner

Posted on 26 Jan 2018 by Rebecca Wild

You won’t always be a student, trainee, or beginner. Expertise comes from knowing your skills and constantly trying to improve, says Atma Ivancevic.

Is conference Twitter a good thing?

Posted on 24 Jan 2018 by Rebecca Wild

Sharing data is important, but Twitter can be used for much more than that, says Eileen Parkes.
