Nature Chemistry | The Sceptical Chymist

JJ: Day 98, Service with a ‘Simplified Molecular Input Line Entry Specification’

Hi everyone,

This week the Nature Chemistry team have been thinking about how we display our wonderful papers (when we finally open the doors and eventually publish a paper, anyway).

We’d really like to see what everyone else thinks about some of the things we discussed after looking at what other journals have to offer.

So, the things we’re interested in:

(1) HTML vs PDF: does anyone read the HTML articles? Do you read the PDF on-screen or print it out?

(2) Big vs little graphics: what does everyone else think about the tiny size of the graphics in ACS HTML articles?

(3) Tagging/‘semantic web’: what do you think about the toys on the RSC’s Project Prospect? What kind of things would you like to see tagged/linked to other content in Nature Chemistry? For instance, Steve would love to do something with named reactions.

(4) 3D molecular structures: do these help your understanding of a paper?

(5) How useful to you are InChIs and SMILES?

(6) Forward linking: the RSC and Elsevier/Science Direct offer this – do you use it? Would you use an RSS feed that alerted you to new citations of a particular paper?

(7) Would you actually comment on papers if there was a comments box at the end?

(8) We really like the Biochemical Society’s HTML article style (sample one here) – do you?

If we could get a deluge of posts about this one, we’d be overjoyed! And this is your chance to voice your opinion on what a Nature Chemistry paper should look like.


Neil Withers (Associate Editor, Nature Chemistry)


  1.

    Katherine Haxton said:

    HTML articles are only useful if they contain larger versions of graphics (or colour versions in some cases). PDFs are easier to read due to format (although the Biochemical Society’s HTML format looks good). The Project Prospect stuff is nice, but what would be nicer would be a chemistry publishing industry-wide standard for this sort of thing, with all publications neatly cross-referenced and linked.

    How about using an idea similar to Tag Clouds instead of/in conjunction with Keywords (there is a good bit of debate about this in the Geology blogosphere this week, places like Highly Allochthonous and Lounge of the Lab Lemming). That is, generating a list of terms to describe each article from the words that occur most frequently within the document, rather than from author-generated lists. Obviously common words are excluded.
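A minimal sketch of the tag-cloud idea described above, assuming a plain-text article and a tiny hand-written stop-word list. The function name `tag_cloud_terms`, the stop-word list and the sample text are all hypothetical illustrations, not anything a journal is known to use.

```python
import re
from collections import Counter

# Assumption: a deliberately tiny stop-word list for illustration only;
# a real system would need a far longer list of common words.
STOP_WORDS = frozenset({
    "the", "a", "an", "of", "and", "in", "to", "is",
    "for", "with", "was", "by", "on", "gave",
})


def tag_cloud_terms(text, top_n=5, stop_words=STOP_WORDS):
    """Return the top_n most frequent terms as (term, count) pairs.

    Words are lower-cased, split on non-letters, and dropped if they
    are stop words or shorter than three letters.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        w for w in words if w not in stop_words and len(w) > 2
    )
    return counts.most_common(top_n)


# Hypothetical sample text standing in for an article body.
sample = (
    "The palladium catalyst was added to the ligand. "
    "The catalyst and the ligand gave the product; "
    "the catalyst was recovered."
)
print(tag_cloud_terms(sample, top_n=2))
```

The counts could then set the font size of each term in the cloud. Simple word counting treats "catalyst" and "catalysts" as different terms, so a real implementation would probably need some stemming as well.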

    Big graphics are nice but space constraints in print will always limit that. Perhaps larger versions of figures could be available online. Similarly, access to spectral data in its original format might be useful. Hyperlinks to abstracts of references are pretty useful (and standard).

    A moderated comments facility would be interesting.

  2.

    Josiah said:

    I enjoy both the HTML and the PDF: if I am looking for something specific, e.g. a method or reference, I can open up the HTML and find it quickly, whereas the PDF is a great format for general reading.

    Small graphics can be set up to be enlarged with a simple click. This makes loading quicker and HTML more convenient.

    Links to the structure’s PDB or other 3D reference in the HTML would be nice, or even a built-in Jmol figure.

    The Biochemical Society’s HTML format is very awkward. It is an interesting idea for sure but doesn’t seem to be implemented in a way that is friendly. What if I want to search for keywords in the document? I can’t. In normal HTML, ctrl-F and then search is oh so nice.

  3.

    Neil said:

    Hi Egon – thanks for your excellent comments!

    Just to clarify for everyone as I used a somewhat odd term, by ‘forward linking’ I meant a service that finds citing articles – in other words, you click on a link and it brings up the papers that cite the one you’re reading. For instance, the RSC offers ‘Search for citing articles’ (here) and Science Direct has a ‘Cited by’ button, as well as the RSS ‘citation feed’, in a box like this. I think they’re the only ones who offer RSS feeds for citations – to be honest, I’m surprised Google Scholar doesn’t offer alerts like other Google-brand services.


  4.

    baoilleach said:

    I think Nature could make a real splash by being the first journal to link to the chemical blogosphere. For example, using Chemical Blogspace’s API to find comments on a particular paper. See my greasemonkey script for examples of this (

    Forward linking would be fantastic. RSS feeds…as an author, I would like them. I don’t know if I would track an RSS for citations to papers I was just interested in.

    See the BMC Central/Chem Central approach for HTML layout. I think they have it all worked out. I would never print out the HTML version.

    Comments boxes are overmoderated and overhyped, or possibly just poorly advertised. Are the comments even emailed to the author? You never see on the front page of a journal, “latest comments by Joe Bloggs”, etc.

    Project Prospect is the way to go. You could even share some code and reduce the work.

  5.

    zts said:

    Regarding HTML vs PDF, I never look at the HTML. If I am skimming a paper, I look first at the schemes—on the ACS HTML pages, the schemes are so small that one cannot typically read them, so there is really no point to using that format. If I am looking for something specific in the paper, it is almost always a compound in a scheme, so again, there is no point looking at the HTML page. If I want to read a paper in detail, I print out the PDF. Even if the HTML pages had legible schemes, I would still probably read the PDF because of the nicer formatting. The ACS style thumbnail schemes just seem like a waste—they make people go out of their way to see the most important parts of the paper, and they break up the flow of the article. When you enlarge them it opens a new page, so you cannot refer to both the text and the scheme at the same time. I really like the way the ACS webpage handles everything else, but their HTML articles are worthless.

    Advantages of the HTML page could be that it is easier to search for text (although this can be done in PDF as well), and the references can be hyperlinked to the papers they are referencing (I think this can be done in PDFs but I don’t often see it).

    I like forward linking, and I find it to be quite useful sometimes, but I have only ever used it in SciFinder (find citing articles). Does it work as well from the actual journal pages? For example, does an RSC paper bring up cites from the whole of the chemical literature, or only from other RSC journals?

  6.

    Masa said:

    Hi Neil,

    HTML: easy to view (if you have access to the net), can do fancy stuff (video clips of reactions, 3D structures that you can move around), easy to tell people where you found the article (give them a link)

    PDF: Pics in the PDF should be small so that it is easy to download and also doesn’t use up too much ink when printing. Also easy to store on your PC. Plus you can read it on a train, aeroplane etc. even if you do not have access to the net.

    Therefore, I like them both! (not really a helpful comment…)

    It would be interesting to have a voting system where people vote “Good” or “Bad”. This is useful (or fun) when you are just browsing and want to find out what people thought was a good paper.

    Finally, on the Biochemical Society… well, not my cup of tea… It contains too much information in too many places and it may be difficult to see on a small screen… but may be good once my eyes get used to it…

  7.

    McDawg said:

    Confession time.

    I’ve been a PDF “addict” for about 5 years. Sure, I’ve printed off dozens and dozens, favoring the “Paper” format.

    Of late though, I’ve moved over to HTML. Typically, I only have access to Open Access manuscripts, and the PLoS/BMC HTML layouts are user-friendly.

    I think both PDF and HTML have advantages/disadvantages and as such, at the moment, both have roles to serve.

    I too like forward thinking so we should experiment.

    As to leaving comments (7), I do this regularly.

  8.

    Neil said:

    Thanks to everyone who’s commented so far – they’re all really useful. Keep ’em coming!

    Another set of comments can be found at Science in the Open:

    And as a PS to people having problems with Preview/Post and the security code: if you refresh/reload the page after previewing, a new security code is generated. I normally go for the safe option of pasting my comments into Notepad or Word, but haven’t lost anything yet.


  9.

    Anna Croft said:

    Regarding (1) PDF vs HTML, I tend to read through the HTML if I’m not sure whether this is the article I’m looking for, or if I need some detail quickly. PDF is reserved for archiving and printing off if it is an important paper.

    (2) graphics – I don’t really mind. I think Elsevier also does small graphics and click for bigness. If I am on low bandwidth of course, then I appreciate this.

    (4) 3D molecular structures, no help at all – unless I want to do some further analysis myself. But then these should be properly optimised and computationally generated structures, not some ChemDraw conversion. I can imagine this would be a lot of work for someone if it isn’t already part of the paper (but very valuable as a repository).

    (5) I’m sure SMILES is a lot more useful to me than I realise …

    (6) I use some of the citation alert services and find them useful. The forward linking I find on Elsevier often does not give very useful results – there may be a better way to do this, since I think theirs is keyword based. With respect to RSS feeds, I am finding the RSC much better in terms of drip-feeding new articles (a few per day) – I feel overwhelmed by the ‘dump’ (e.g. 150 at once) I get from Nature/Science and the ACS for their journals.

    (7) Comments would be quite cool. And quite useful as a discussion tool for postgraduate and undergraduate journal meetings.

    (8) Not sure yet, but I can see some merits, especially to improve online reading.

    Hope this helps.


  10.

    Chris Rusbridge said:

    (1) HTML versus PDF: yes, I read HTML articles. What’s more, I HATE reading large PDFs on-screen, especially two-column PDFs (which are just vile on-screen). The way Nature presents its material (as good quality “full text” plus PDF) makes a lot of sense to me. Mind you, if I need to save something to read later (especially offline), I prefer the PDF. Reading saved HTML files can be frustrating, as often images etc. are not loaded into the saved object. And the PDF is a better choice for printing.

    Are we yet at the point where articles can be published as XML, perhaps according to the NLM archiving DTD which appears to be widely accepted?

    In the case of the PDF version, this should be a tagged PDF (not yet to be confused with the semantic web question, I think); the Nature PDF I viewed ( is un-tagged. Tagging PDF improves accessibility for people using screen readers etc, which is a legal requirement in some jurisdictions, not to mention courteous good practice. And it may improve other machine processing…

    (2) The tiny graphics in the ACS articles are awful, and really make them less useful. The Nature articles I scanned (eg seemed to have got a better balance.

    (3) Yes, surely we should be encoding more of the “science” in a semantic web fashion. Not quite sure what the right route for this is, eg RDFa or a microformat-like way. So where chemical names or reactions are mentioned, there should be invisible XML giving unambiguous machine-readable alternatives to the human-readable text.
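One way the "machine-readable alternative behind the human-readable text" idea might look, as a rough sketch only. It uses a plain HTML `data-` attribute rather than RDFa or any established microformat, and the `chem` class, the `data-inchi` attribute name, the lookup table and the `annotate` function are all hypothetical choices made for illustration.

```python
from html import escape

# Assumption: a hand-written lookup table of two well-known compounds;
# a real system would draw identifiers from a curated source.
INCHI_BY_NAME = {
    "methane": "InChI=1S/CH4/h1H4",
    "ethanol": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
}


def annotate(name):
    """Wrap a known chemical name in a span carrying its InChI.

    Names not in the table are returned unchanged (HTML-escaped),
    so the visible text is the same either way.
    """
    inchi = INCHI_BY_NAME.get(name.lower())
    if inchi is None:
        return escape(name)
    return '<span class="chem" data-inchi="{}">{}</span>'.format(
        escape(inchi, quote=True), escape(name)
    )


print(annotate("methane"))
print(annotate("mystery compound"))
```

The reader still sees only the name, while software can pull the identifier out of the attribute. Whether an attribute like this, RDFa, or something else is the right carrier is exactly the open question raised above.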

    It’s probably also worth emphasising that supplementary material should be available, and that all data supporting the article should be openly accessible. I don’t think appropriate formats for supplementary data have settled down yet; the Echidna genome article I looked at has several PDFs, some of which contain notes and some data, plus an Excel spreadsheet containing data. The latter seems like a reasonable, pragmatic short-term choice but a poor long-term one. It is a pretty simple 10 column, 300-row table, so a simple XML encoding might be better. Surely this sort of area is one where we might hope for Nature to give us a lead?

    (5) Identifiers and the like are really valuable: use them!

    (7) On commenting, it seems right for blogs, but less so for articles.

    (8) The Biochemical Society’s HTML style is awful, and I would guess breaks just about every accessibility rule in the book. The simple approach from Nature itself seems much better to me.

    Anyway, the very best of luck with this, and I look forward to seeing the result.

  11.

    Chris Swain said:

    HTML vs PDF: no real preference for on-screen viewing, but PDF for download. On a Mac, Spotlight, Cover Flow and Quick Look have made searching PDFs a brilliant way of organising literature.

    Graphics size, personally I hate small graphics.

    Tagging, useful and will probably become more so.

    3D structures, can be enormously helpful in aiding understanding, especially if some effort is made to include an element of interactivity. I guess the issue might be a viewer but Jmol is a pretty good cross-platform solution.

    InChI and SMILES, personally I’d encourage the use of SMILES and insist that articles are tagged with the SMILES of all structures mentioned. I prefer SMILES because most people can use a SMILES as the starting point for building a query.

    RSS feeds for citations, not sure about this. If I run a search and find a relevant paper it is useful to have a list of citations. I’m not sure I’d want a daily listing, but it might be useful for authors.

    Comments, for chemistry papers I suspect it might be much more useful if the comments could include chemical structures rather than just text.

    The Biochem Soc style seems very old-fashioned, I’m sure you could do better.

  12.

    José Moreira said:

    1) PDF definitely. 90% of my colleagues only read PDF articles, they are easy to save, and there are a lot of very good applications to manage them (I use and recommend Papers).

    2) I only read the PDF version. But it’s obvious that the graphics in the HTML version are hard to read.

    3) This feature can add a lot of noise to the papers.

    4) 3D information can really help in numerous papers; this would be a killer feature.

    6) Forward linking is really useful; this is the greatest advantage of HTML versions, and it would be great if you could embed this feature in the PDF versions.

    7) I would prefer a blog for commenting on papers.

    I’m looking forward to seeing the results in Nature Chemistry.

  13.

    FX said:

    (1) PDF is way superior to HTML articles; even better would be having the possibility for authors to include 3D models and movies in the online PDFs. Presenting 3D/animated information sometimes gives much better insight than still images.

    (3) I’m not sure tagging has as much added value as people usually consider.

    (4) Oh, see 1: 3D molecular structures really help understanding.

    (5) As a reference (ie in a note), they’re very useful. No need to display them too prominently, however.

    (6) “Would you use an RSS feed that alerted you to new citations of a particular paper?” – Definitely, if it’s possible to combine citations of more than one paper in an RSS feed.

    (7) No.

    (8) I don’t really like it, it wastes a lot of screen space and renders poorly on mobile devices.

  14.

    CEJ said:

    PDFs are a must-have (which I save to my computer and only read on screen). I don’t like having to have a network connection in order to do something that doesn’t have to have the connection; connecting can be a problem at times, especially when traveling.

    3D figures would be REALLY nice; when working with systems of large molecules you really need to be able to rotate to see what is going on.

    Forward linking is so very useful.

    It would also be very helpful to have a link to download the citation data in common formats (BibTeX, EndNote, Pages, etc).

  15.

    Thomas Munro said:

    (1) I don’t read the HTML version. Firstly, a digital scroll is like a robotic mule. If you want to travel long distances, you don’t ride a mule, you drive a car. If you want to read a long document, you don’t read a scroll, you read a book.

    Secondly, publishers seem to think they’ve discharged their high-tech obligations by adding hyperlinks to the references section in HTML. But as all previous posters note, it’s the PDF that people print and store. So the copy they end up actually using lacks that invaluable information, and they have to waste time recovering information that the publisher has collected and then thrown away – the DOIs of references. I’d be delighted if you included DOIs in your reference format, especially in the PDF version.

    (2) If you read the HTML version, you get what you deserve.

    (3) Linking to chemical structure information, as in Prospect and Nature Chemical Biology, would be nice, and would save a lot of wasted effort for readers reentering complex structures into their software. Highlighting terms doesn’t strike me as useful, more like bloat for the sake of it.

    (4) 3D views would definitely help with complex structures.

    (5) InChIs are not yet useful because they’re not in widespread use. But they will clearly be of tremendous use once they’re established: each molecular structure will finally have a single unambiguous ‘name’, and vice versa. It will finally be possible to search the literature for compounds using search engines, rather than searching expensive proprietary databases with time-lags, errors and omissions.

    SMILES suffers from multiple competing versions (generating different strings for the same structure), and from not being an open standard. More importantly, by supporting SMILES, the journal would be perpetuating a format war, which would guarantee that neither format would achieve ubiquity.

    Your journal could play a key role in establishing InChIs and dragging chemistry into the machine-readable era. Authors could include them as an appendix, just as ACS journals used to require CAS numbers.

    (6) Forward linking would be good if it covered all publishers.

    (7) As long as the identity of the commenter was firmly established, comments would be a wonderful addition. If 10 people commented that they couldn’t replicate a procedure, that would be very useful information. Authors could respond, possibly giving extra detail.

    (8) That format is no more or less unendurable than other HTML formats.