New rules for the presentation of statistics in the Nature journals are described in the June Editorial of *Nature Cell Biology* (**11**, 667; 2009). From the Editorial:

Thanks to advanced imaging technologies and better integration with molecular and systems approaches, cell biology is undergoing something of a renaissance as a quantitative science. Robust conclusions from quantitative data require a measure of their variability. Cell biology experiments are often intricate and measure complex processes. Consequently the number of independent repeats of a measurement can be limited for practical reasons, yet the variability of the measurements can be rather high. Cell biologists have developed good intuition to guide their analysis of such constrained datasets. Biological complexity and the reliance on intuition can cause culture shock to physical scientists crossing over into cell biology (a kind of extension of the celebrated ‘two cultures’ concept of C. P. Snow).

With the arrival of quantitative information and ‘-omic’ datasets, statistical analysis becomes a necessity to complement instinct. The problem is that statistical tools are built on basic assumptions such as the independence of replicate measurements and the normality of data distribution. Usually, sizeable datasets are prerequisite for statistical analysis. Alas, these can be as hard to come by as a biostatistician (*n* is typically well below 5). The result is that all too often statistics (frequently undefined ‘error bars’) are applied to data where they are simply not warranted.

There are no easy solutions to rectify the prevalence of poor statistics in cell biology studies. However, an obvious recommendation is to consult a statistician when planning quantitative experiments. Consider whether *n* represents independent experiments (you may actually be publishing a measure of the quality of your pipette!) and whether it is large enough for the test applied. Avoid showing statistics when they are not justified; instead, show ‘typical’ data or, better still, all the measurements. Importantly, displaying unwarranted statistics attributes a misleading level of significance to the data. Always describe and justify any statistical analysis applied. We have updated our guidelines to reflect these recommendations. One key rule: if the number of independent repeats is less than the fingers of one hand, show the actual measurements rather than error bars. If you wish to present error bars, include the actual measurements alongside them.

Finally, please remember that you are interrogating a complex system — be careful not to discard ‘outlier’ data points on a whim, as they may well be as relevant as clustered measurements. One is naturally inclined to ignore data that does not match the hypothesis tested, but biology is rarely as black and white as we would like. Do not make ‘hypothesis driven’ research become ‘hypothesis forced’!
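
As a rough illustration of the Editorial's "fingers of one hand" rule and its warning about what *n* really counts, here is a minimal Python sketch. The data layout (one inner list of technical replicates per independent experiment), the function name `summarise` and the numbers are all hypothetical, chosen only for illustration; they are not part of the journal's guidelines.

```python
from math import sqrt
from statistics import mean, stdev

# Threshold taken from the Editorial's "fingers of one hand" wording.
MIN_N_FOR_ERROR_BARS = 5


def summarise(experiments):
    """Summarise a list of independent experiments.

    `experiments` is a list of lists: each inner list holds the technical
    replicates from ONE independent experiment (an assumed layout).
    """
    # Collapse technical replicates first. They reflect measurement
    # precision (the "quality of your pipette"), so they do not add to n.
    per_experiment = [mean(reps) for reps in experiments]
    n = len(per_experiment)

    # The individual measurements are always reported.
    summary = {"n": n, "points": per_experiment}

    # Error bars are only offered once n reaches the threshold, and then
    # alongside the points rather than instead of them.
    if n >= MIN_N_FOR_ERROR_BARS:
        summary["mean"] = mean(per_experiment)
        summary["sem"] = stdev(per_experiment) / sqrt(n)
    return summary


# Three experiments with three technical replicates each: n is 3, not 9.
few = summarise([[1.0, 1.1, 0.9], [1.4, 1.5, 1.3], [0.8, 0.7, 0.9]])

# Six independent experiments: a standard error can be shown as well.
many = summarise([[1.0], [1.4], [0.8], [1.2], [1.1], [0.9]])
```

In this sketch `few` carries only the three per-experiment values, while `many` also carries a mean and a standard error of the mean.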

It looks like whoever wrote the statistical guidelines didn’t ask a statistician to take a look….

1. Error bars aren’t associated with tests.

2. Reporting n for “independent experiments” is going to become nonsensical very quickly: the moment you report means of groups of observations in a hierarchical design. Which level do you use? The notion of “independent experiments” breaks down (indeed the notion of degrees of freedom takes a battering once you introduce random effects). I can’t really see why this is being insisted upon: there are times when it is reasonable to report n’s like this, but surely not always. Imagine plotting 50 estimates, with standard errors, and being forced to give the sample too. Who cares? The standard error tells you the variability, and the methods section should give a reasonable idea about the n’s (e.g. their mean and range).

Aaagh! No data meets the assumption of the test.

Esa Läärä made the amusing point (pdf) that these sorts of test are pointless: either you only have a small amount of data, so the test will never be “significant”, or you have a lot of data, in which case the test will almost always be significant, but it won’t matter, because large sample theory kicks in. The grey zone in between is probably fairly small. And if the analysis isn’t robust to the assumptions, insisting on a non-parametric test is too restrictive: there are robust regression methods, for example. Or one can explore *why* the analyses aren’t robust: often you can show that it doesn’t make a difference, or that only one or two outliers affect the inference.

Sorry, this is a semi-regular rant. One of the occupational hazards of being a statistician. There is some good advice in there too, but one shouldn’t be too prescriptive, or you end up forcing scientists into doing stuff which they know is silly, if they’re allowed to think about it for a bit.
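
Läärä's point about assumption tests can be illustrated with a small simulation. This is only a sketch under stated assumptions: it takes the tests in question to be normality checks, it uses the Jarque-Bera statistic with its asymptotic chi-square (2 d.f.) p-value because that needs nothing beyond the Python standard library, and the two distributions and sample sizes are arbitrary choices for illustration.

```python
import math
import random


def jarque_bera_p(xs):
    """Asymptotic p-value of the Jarque-Bera normality statistic."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    return math.exp(-jb / 2)


def rejection_rate(draw, n, reps, rng, alpha=0.05):
    """Fraction of simulated samples of size n that the test rejects."""
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        if jarque_bera_p(sample) < alpha:
            hits += 1
    return hits / reps


rng = random.Random(0)

# Strongly skewed data (exponential), but only n = 4 observations.
rate_small = rejection_rate(lambda r: r.expovariate(1.0), 4, 200, rng)

# Mildly skewed data (gamma with shape 20), but n = 5000 observations.
rate_large = rejection_rate(lambda r: r.gammavariate(20.0, 1.0), 5000, 50, rng)
```

With four observations this statistic is bounded well below the 5% critical value, so `rate_small` is zero however non-normal the data are; with 5000 observations the mild skew is detected essentially every time, even though a mean from a sample that size is already well served by large sample theory.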

Thanks, Bob – I will pass on your helpful comments to the Nature Cell Biology editors.