TechBlog: Software quality tests yield best practices

Image: code snippet from RAxML. Credit: Alexandros Stamatakis/GitHub

Life science research increasingly runs on software. A good fraction, perhaps even most of it, is made by academics, for academics: rough around the edges, perhaps, but effective, not to mention free. But is it of high quality?

Alexandros Stamatakis decided to find out.

Stamatakis is a computer scientist and bioinformatician at HITS, the Heidelberg Institute for Theoretical Studies in Germany, and a professor of computer science at the Karlsruhe Institute of Technology. His team has been developing and refining software tools for evolutionary biology for more than 15 years, he says, including one called RAxML (from which the code snippet shown above was pulled). Yet for all that time, he says, his code still wasn’t perfect.

“The more I developed it the more bugs I had to fix and the more I started worrying about software quality,” he says.

Not software ‘accuracy’, mind you — when it comes to phylogenetics, it’s difficult to know whether software is providing the correct answer. “You don’t know the ground-truth,” Stamatakis says. Rather, he was curious whether popular tools meet computer-science standards for quality.

To find out, Stamatakis and his team downloaded the code for 16 popular phylogenetic tools (plus, as a control, one from the field of astronomy), which collectively have been cited more than 90,000 times. They then ran those codes — 15 of which were written in C/C++ and the last in Java — through a series of tests.

For instance, they looked at how well software can scale from a desktop computer to a large cluster, something that increasingly is necessary as life science datasets balloon in size. They measured the amount of duplicated code in the software to get a rough indication of maintainability. And they counted the number of so-called ‘assertions’ — logical statements in the code that assert, for instance, that a value falls within a certain range, and that cause the software to terminate should they fail — to obtain a measure of code ‘correctness’.
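To make the idea of an assertion concrete, here is a minimal sketch in Python. It is an illustration only: the function `log_likelihood` and its inputs are hypothetical and are not taken from RAxML or any of the tools in the study, most of which are written in C/C++.

```python
import math


def log_likelihood(site_probabilities):
    """Sum the logs of per-site probabilities (hypothetical example)."""
    # The input must not be empty; an empty list signals a bug upstream.
    assert len(site_probabilities) > 0, "need at least one site"
    total = 0.0
    for p in site_probabilities:
        # Each probability must lie in (0, 1]; the program stops if not.
        assert 0.0 < p <= 1.0, f"probability out of range: {p}"
        total += math.log(p)
    # A sum of logs of values in (0, 1] can never be positive.
    assert total <= 0.0
    return total
```

If a value falls outside the stated range, Python raises an `AssertionError` and, unless the error is caught, the program terminates rather than carrying on with a bad number. Note that Python skips assertions when run with the `-O` flag, much as C does when `NDEBUG` is defined.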

“There have been empirical studies by computer scientists working in the field of software engineering, where they showed that there is a correlation between incorrect code, or code defects, and the number of assertions used — or let’s better say, an anti-correlation,” Stamatakis says.

So, how did the toolset do? Not too well.

As documented in an article published 29 January in Molecular Biology and Evolution, none of the 16 programs in the round-up, including Stamatakis’ own RAxML, aced all the tests. (With 57,233 lines of code, RAxML exhibited both compiler warnings and memory leaks.) But, he stresses, that is neither to denigrate the programmers who wrote those tools, who after all were simply trying (and generally succeeding) to solve a particular problem, nor to suggest that the tools do not work properly.

Rather, he says, potential users must exercise caution in using these tools. “They shouldn’t blithely trust software. And they shouldn’t view it as black boxes,” but instead (as he puts it in his article) as “potential Pandora’s boxes”.

Users should also strive to understand what their code is doing, Stamatakis advises. And if unexpected results arise, they should repeat the analysis using a separate tool that performs the same task, to ensure they aren’t chasing digital phantoms.

Stamatakis concludes his article with a series of ‘best practices’ for software developers. These include running tests for memory allocation errors and leaks, using assertions, checking for code compilation warnings using multiple compilers, and minimizing code complexity and duplication — practices that are common in professional software development but less so in the life sciences.
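Two of these measures, assertion use and code duplication, can be approximated in a few lines. The sketch below is a rough Python analogue and not the toolbox Stamatakis’ team used: the function names and the `min_length` threshold are assumptions made for illustration, it analyses Python rather than C/C++ source, and it compares whole lines where real duplication detectors typically compare token sequences.

```python
import ast
from collections import Counter


def assertion_density(source: str) -> float:
    """Return assert statements per non-blank line of Python source."""
    tree = ast.parse(source)
    n_asserts = sum(isinstance(node, ast.Assert) for node in ast.walk(tree))
    n_lines = sum(1 for line in source.splitlines() if line.strip())
    return n_asserts / n_lines if n_lines else 0.0


def duplicated_line_fraction(source: str, min_length: int = 10) -> float:
    """Return the fraction of substantial lines that occur more than once.

    Lines shorter than min_length characters (after stripping whitespace)
    are ignored so that trivial lines such as 'else:' do not count.
    """
    lines = [line.strip() for line in source.splitlines()]
    lines = [line for line in lines if len(line) >= min_length]
    if not lines:
        return 0.0
    counts = Counter(lines)
    duplicated = sum(count for count in counts.values() if count > 1)
    return duplicated / len(lines)
```

Both functions take source code as a string, so they can be applied to a file with something like `assertion_density(open("module.py").read())`. The numbers are only crude indicators; a low assertion density or a high duplicated fraction is a prompt to look more closely, not a verdict.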

The tools Stamatakis’ team used to run its tests are freely available, so readers can try them themselves to see how trustworthy their chosen software is.

Journal editors, he says, should consider requiring such tests of any peer-reviewed work, either performed by the authors themselves prior to submission, or by the peer-reviewers. In fact, during our conversation, Stamatakis suggested he might make the toolbox available as a Python script or Docker container, to make it easier for others to adopt. If and when he does, we’ll let you know. In the meantime, caveat emptor!

Jeffrey Perkel is Technology Editor, Nature

Biased biology: the case of the missing vaginas

The female genitalia of the water strider Gerris gracilicornis have a ‘genital shield’ that can block forced mating.
Source: Han, C. S. & Jablonski, P. G. PLoS ONE 4, e5793 (2009).

Genitalia are a hot topic. Interest in their diversity and rapid evolution has seen research in the field balloon in the past decade. Stories on studies of the penises of ostriches, chickens, sea slugs and a variety of insects have all made the science pages. But where are all the female genitalia?

A study published in PLoS Biology this week has quantified their dearth. Analysing 25 years of research in the evolution of genitals, the authors found a strong bias towards studying male animals, a disparity that has got worse over time. The skew, they say, is down to ingrained biases that lead researchers to see female genitalia as less important to evolution, something that, they argue, is hampering our understanding.

Malin Ah-King, an evolutionary biologist and gender researcher at Humboldt University in Berlin, and colleagues analysed 364 papers published between 1989 and 2013 that address the evolution of genitalia, and categorized them by research question and species studied. They found that the largest group, 49%, looked at male genitals alone. Just 8% of the papers looked only at female genitals, and 44% studied both (see graph).

Why the disparity? The study found that the bias is not restricted to male researchers, as papers by women biologists showed the same trend. Nor is the phenomenon something that can be blamed on old attitudes: it seems to grow stronger from 2000 onwards, even after a similar study in 2004 flagged the issue.

The authors also dismiss the notion that female genitalia are any less scientifically interesting than those of males, citing a range of studies in which variations are biologically significant — including the genital shields of water striders (Gerris gracilicornis, pictured above) and the elaborate, corkscrew vagina of the long-tailed duck (Clangula hyemalis). “There are a number of studies showing large variation in female genitals, both within and between species. But there’s a lack of knowledge and of studies,” says Ah-King.

Another reason could be that the bias comes from male organs being easier to study than female ones. Johan Hollander, an evolutionary biologist at Lund University in Sweden, says that at least in the species he studies, sea snails, it is “hardly surprising” that males are studied, as their genitalia are external. They also present easy-to-see characteristics that are specific to each species, making them useful for taxonomy, he adds.

The authors recognize that this plays a part, as is evident, for example, in their finding that the bias is reduced in species where the female sex organs are external, such as spiders. But it is not the whole story, they say. Plenty of new techniques, such as high-resolution X-ray scanning, make research of internal, soft-tissue organs possible, says Ah-King.

Instead, they argue that at the root of the problem are longstanding assumptions about the roles of the sexes in evolution, namely the assumption that the female’s role is passive and relatively unimportant. This dates back to Charles Darwin’s theory of sexual selection, which the authors say proposed that females are generally “coy”. Darwin’s contemporaries even cast doubt over whether females had the mental abilities to choose mates, they add.

Although many of these assumptions have been overturned, evolutionary theory still emphasizes the male side of the equation, leaving studies of the female’s role to lag behind, say the authors. “We think assumptions about the dominant role of males and lack of variation in females have influenced how people have been looking at these questions,” says Ah-King. The fact that the bias varied depending on which evolutionary mechanism researchers were tackling suggests that certain questions steer researchers towards focusing on males, she adds.

The problem with this nineteenth-century hangover is that by studying just one sex, researchers risk examining “just one side of a very complex equation” and are prone to misinterpreting complicated co-evolutionary dynamics, say the authors. From looking only at the long, hairy ‘virga’ of the male Euborellia plebeja earwig, for example, it would be easy to assume it was an efficient tool for removing a competitor’s sperm. But studies of the female show that her sperm-storage organ is even longer, meaning she can influence which sperm she keeps. Although studies on such specialized penile structures are common, “too often the female is assumed to be an invariant container within which all this presumed scooping, hooking, and plunging occurs,” say the authors.

Elizabeth Pollitzer, director of Portia, a London-based non-profit organization that seeks to address gender issues in science, says that male-oriented language such as “wounding the female”, “competitor sperm”, “forced copulation” and “coercive mating” runs through interpretations of genital function and sexual dynamics. This is not just an issue for science: it also helps to reinforce societal gender attitudes and stereotypes regarding male-female roles, she says.

The University of Wisconsin–Madison is among the institutions hoping to reverse the bias. Last month, the university appointed anthropologist Caroline VanSickle as the first Wittig Postdoctoral Fellow in Feminist Biology, a research position aimed at uncovering and reversing gender bias in biology.

Studies of animal genitalia that look at male organs only (black) outnumber those that look at both sexes (blue) and especially those that focus on female organs (green).
Source: Ah-King, M., Barron, A. B. & Herberstein, M. E. PLoS Biology 12, e1001851 (2014).