Today I unexpectedly ventured far from my comfort zone to learn about ketosamines and 2-deoxyglucose in cancer treatment on one hand, and boronic acid-based sensors on the other, so I will not attempt to explain the details here as I would undoubtedly get many things wrong. Instead, I have an important question for you.
I got into an interesting conversation this afternoon with two card-carrying chemical biologists (by which I mean, they not only do chemical biology research BUT read our editorials!!) who were curious about our recent editorial calling for the more judicious use of ‘data not shown’. One scientist made the point that it is reasonable to use this term when you have the data, and the data could be produced upon the request of an interested referee or reader, but it’s obvious from the text how the data look and so there’s no great need to show every tiny detail. The other scientist said that it’s better for all relevant information/data to be available in the paper, so that it can easily be reproduced and so it’s not necessary to call on the author to produce the data at some later point (especially considering that online Supp. Info. is pretty unlimited these days).
The question is: what data should be shown? What is completely obvious and really just takes up space? What may be completely obvious but is still critical to be included? What do people think is obvious but is not? For example, does just listing NMR peaks and splittings constitute ‘data not shown’, and should we obtain copies of all the original spectra? What about showing the data points/plots used to calculate IC50 values vs. just listing the numbers? Are you annoyed if people include too much information? How could methods be presented more clearly? Why do we need all this data shown – do we not trust each other to do/interpret the work properly, or not trust each other to report our work accurately, or is it simply a matter of having a complete scientific record?
Anyway, I may not know much about ketosamines, but I did read somewhere that you need sleep to help prevent cancer, so I’d best get to it!
Catherine (associate editor, Nature Chemical Biology)
Related: Everyday Scientist discussed recently how to improve the Supporting Information in manuscripts.
I rarely if ever experience frustration over having too much data to go through if I’m trying to reproduce a result. However, if I’m reading a paper outside my expertise, I can see how one might get bogged down with trying to decipher truly routine and unremarkable supplementary data. So I think it really depends on who the target audience is. Both of those viewpoints do have their limits, though.
The other issue at hand, however, is that of academic honesty. And honestly, I think unscrupulous people will just find a way to include whatever extra data is required to get that fake publication out there. NMR FIDs can be faked.
I have to admit that I am extremely skeptical of ‘data not shown’. It is far too easy to abuse the concept and I would be quite content if I never saw it in another article again.
I think the example you give of NMR is not ‘data not shown’. It is quite conventional to assign an NMR spectrum, give shifts and couplings and splittings and not show the image of the spectrum. The idea is to give sufficient information for the result to be reproduced. I find that many papers that show the spectrum don’t interpret it and that is lazy, and infuriating.
Supp Info should be the information necessary to reproduce and verify the work, and should eliminate the need for ‘data not shown’. It should be a meaningful interpretation of results (spectral data etc.) where appropriate, and experimental/computational details sufficient for a reasonably competent grad student to reproduce. If online journal formats were fully used, it would be possible to link the paper to this vital additional information.
How about a reference that reads ‘Private communication’? Is it then legitimate for referees to ask the author for his/her private letters?
Andrew – I don’t think we see that as much as you used to (perhaps people are publishing more?), so it’s never been an issue for me as an editor. It also throws up another problem: we editors should request to see something (e.g., an email) granting permission from whoever the private communication came from – it might have been a throwaway comment, or something passed on confidentially.
Propter and Joel – I think you’re right, supplementary information does essentially remove the need to ever use the phrase. And again, I don’t think it’s something we see very often in the papers we publish, thankfully.
Yes, I only encounter this occasionally. But this is still weird. Maybe I should describe it in more detail:
Is it an old style of scientific conduct originating from the time of Faraday or even Newton? I swear I have seen more than once that this kind of thing was allowed in published form in some modern journals.
Recently I came across a paper where NMR peaks were somewhat misreported. It was only after I looked into the Supp Info at the original spectrum that I noticed how the authors had confused their peaks. Not a big deal, but without the spectrum I would have wondered forever why one of my peaks was off base.
These days, there really is no excuse for not putting your spectrum (and your FIDs. Pretty please?) into the Supp Info, available for download by anyone who needs to double check. I could even imagine including lab notebook copies in there.
And more often a reference is an in-press article.
Andrew – good point. But actually, this is one of the few ‘cheats’ we allow, since the ‘in press’ label means that at least someone (i.e., the referees of that other paper) has decided that the work is acceptable. We get copies of those ‘in press’ articles, and the referees have access so that they can verify that what the authors are saying in the new manuscript is true. I guess my thinking is that this is one way to control for the different speeds at which things get published at different journals. Is it better to just avoid the ‘in press’ papers altogether? Or, assuming we keep it, is there a better way to do it?
The problem with ‘in press’ references is that they remain ‘in press’ in the printed version almost forever. Readers can only search for the target paper using the author, journal, and year of publication provided in the ‘in press’ citation. Mostly they cannot find the paper, or they find the wrong paper. I assume that there is a large portion of readers who don’t want to rely totally on the reviewers of the journal and prefer to play the role of reviewer themselves.
This raises the question of whether the peer review process stops upon the publication of a paper. Zhiming Wang, editor-in-chief of Nanoscale Research Letters, believes in the concept of Post-Peer Review (PPR), peer review after publication. In fact, every paper is subjected to a PPR process by all of its readers. Submission guidelines should help not only the reviewers invited by the journal but also its readers.
I think the easiest improvement to ‘in press’ references is to require authors to also include the title of their in-press paper in the citation, even if the title may undergo minor changes before publication. This greatly reduces the chance that readers find the wrong paper.
Virtually all major publishers now use the DOI (digital object identifier) to manage their production workflows, so “in press” manuscripts can often have a DOI associated with them even before they are published, allowing them to be retrieved in future. I am mildly surprised that so many “in press” references still exist, as more and more journals move to advance online publication, with the print version following when space is available.
“Personal communications” are less useful to readers – in any event, they require the permission of the person who is being cited in this informal way.
“Data not shown” is harder to support in these days of online-only supplementary information.
Good thoughts, Andrew. Certainly refereeing does not stop with publication of the paper, and readers-as-referees definitely have an important role to play in gauging the importance of the work more broadly than we can capture with 3-4 ‘official’ referees’ opinions, and in determining whether the data/conclusions will stand the test of time. I’ve been meaning to blog about some related points – I will push that higher up the list.