Structurally unsound

Are chemists anally retentive when it comes to chemical structures? Making sure that structures are error-free is certainly vital for a chemistry paper (and for an editor, one of the biggest headaches of the job). Just one wedge bond displayed as a hash could completely confuse the take-home message of a paper.

So imagine how annoying it would be if you saw a structure being repeatedly published with errors in it, and in lots of different places. This is just what has happened to Ian Fleming.

Back in 1967, he published a paper in Nature that finally nailed the absolute configuration of the structure of chlorophyll (Nature subscribers can see the paper here – it’s well worth a look). Yet he reckons that since then, whenever he has seen the structure reproduced, there is a 50:50 chance that the stereochemistry will be wrong.

Over the years, he’s tried to correct this where possible, including, on one occasion, an incorrect structure on a book cover. But it still happens. Out of curiosity, I had a look at the structure on Wikipedia - and sure enough, it was wrong (see for yourself, but be quick; I’ll contact them shortly to get it corrected). The actual structure can be found here at PubChem.

Who knows how often this happens? But then again, if a structure appears somewhere that isn’t necessarily directed at chemists (such as in the Wikipedia entry), does it really matter? Is it just the chemist’s equivalent of getting upset about the incorrect use of an apostrophe? I think it does matter - especially in sources on the web, which are increasingly being mined for technical information. But if you think I should just take a cold shower and calm down, by all means let me know.

Andy


Andrew Mitchinson (Associate editor, Nature).

Comments

You're absolutely right, Andy - chemical structures are the language of chemistry, and I would argue accuracy is even more important there than for conveying thoughts in normal languages: while substituting one word for another may not get across the exact meaning the writer was trying to convey, the consequences of incorrect structures can be serious and far-reaching, as I'm sure total synthesis folks would agree! (Although that comment may set off a debate about the sometimes serious consequences of not being able to convey the exact desired meaning of a literary thought...) Like you, I routinely have to go look up chemical structures as I help get papers ready for publication, and it is quite rare that I don't find more than one structure of the same compound available somewhere on the web (looking in places like PubChem, Wikipedia, and chemical supply houses). Grr!

Fortunately, when I contact people to tell them the structures are wrong, they are usually quite happy to know.

There is also a long and carefully documented history of mirror image DNA.

There is a LOT of junk out there on the online databases. Even as the host of one of them I am upfront and honest about the challenges.

I blogged about this issue recently with regard to Taxol; see http://www.chemspider.com/blog/?p=64

The original ChemSpider database was built on PubChem (http://www.chemspider.com/blog/?p=76) but has since added over 8 million unique structures.

Chlorophyll is at http://www.chemspider.com/RecordView.aspx?id=4938375 and was sourced from PubChem so SHOULD be correct. It is chlorophyll a, based on a Google search of the InChI string using http://www.google.com/search?q=InChI=1/C55H73N4O5.Mg/c1-13-39-35(8)42-28-44-37(10)41(24-25-48(60)64-27-26-34(7)23-17-22-33(6)21-16-20-32(5)19-15-18-31(3)4)52(58-44)50-51(55(62)63-12)54(61)49-38(11)45(59-53(49)50)30-47-40(14-2)36(9)43(57-47)29-46(39)56-42;/h13,26,28-33,37,41,51H,1,14-25,27H2,2-12H3,(H-,56,57,58,59,61);/q-1;+2/p-1/b34-26+;/t32?,33?,37-,41-,51+;/m0./s1/fC55H72N4O5.Mg/q-2;m/b34-26+,42-28-,43-29-,44-28-,45-30-,46-29-,47-30-,52-50-;

The question NOW for me is which of the registry numbers listed on the page http://www.chemspider.com/RecordView.aspx?id=4938375 is correct. Look at this for a discussion of how bad THAT situation is!

http://www.chemspider.com/blog/?p=137

To the ChemSpider person: Your structure doesn't show any stereochemistry for the methyl groups on the tail of the molecule.
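For readers unfamiliar with the notation, the gap is visible in the InChI string quoted above: in its /t layer, a "?" after an atom number marks a stereocentre with no defined configuration. Here is a minimal Python sketch of how one might spot such centres, assuming only the standard library. The function names are hypothetical, and the sketch reads just the first component of the first /t layer rather than parsing InChI in full.

```python
import re

def stereocenters(inchi):
    """Map atom number -> parity mark from the first /t layer of an InChI.

    Simplification (an assumption of this sketch): only the first
    ';'-separated component of the first /t layer is read.
    """
    for layer in inchi.split("/")[1:]:
        if layer.startswith("t"):
            first_component = layer[1:].split(";")[0]
            pairs = re.findall(r"(\d+)([-+?u])", first_component)
            return {int(number): mark for number, mark in pairs}
    return {}

def undefined_centers(inchi):
    """Atom numbers whose configuration is marked '?' or 'u'."""
    return sorted(n for n, mark in stereocenters(inchi).items() if mark in "?u")

# Tail of the InChI string quoted in the comment above (stereo layers only).
TAIL = "/b34-26+;/t32?,33?,37-,41-,51+;/m0./s1"

print(undefined_centers(TAIL))
```

Run on the stereo layers copied from that string, this lists atoms 32 and 33 as undefined, which appears to match the point about the methyl-bearing carbons on the tail.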

To the Sceptical Chymist: I have uploaded new structures to the Wikipedia page on Chlorophyll. I hope I got them right!

I am not sure that everyone is aware of what ChemSpider is. It is an open-access database of over 20 million chemical structures. An overview of the system is given here: http://www.chemspider.com/docs/ChemSpider_Overview_SLides_August_2007.pdf

One of the things I am working on is to resolve MANY of the issues existing in terms of quality of data. I recommend reading the Taxol blog here: http://www.chemspider.com/blog/?p=168

This proves how complex the issue of data quality is.
