
I still haven't found what I'm looking for


In this month's issue of Nature Reviews Drug Discovery, Monya Baker wrote a 'News & Analysis' piece about open access chemistry databases: though there are a number of free chemical databases (including PubChem, which I blogged about last spring), "the chemical data [in these open access databases] still pale in comparison to what already exists in other databases and the published literature."

One problem is that it takes a great deal of time to collect data for these large databases: "PubChem's director, Stephen Bryant, says he lacks the staff and mandate to collect data from published literature and patents." So it's not surprising that the Chemical Abstracts Service (CAS) database contains more chemical information than PubChem: whereas PubChem has about eight million unique structures, CAS contains nearly 30 million organic and inorganic substances.

In addition, there are some concerns about quality control in these open access databases: "[t]he screening data [in PubChem] are less rigorous than those in peer-reviewed articles, and contain many false positives. Deposited data aren't curated, and so mistakes in structures, units and other characteristics can and do occur." I can't imagine how frustrating it would be to synthesize a molecule that was listed as a 'hit' in one of those databases, only to find out that it was inactive because someone mixed up the stereochemistry (or omitted a double bond)...

What are your experiences with these databases? Have you used them in your own work? If so, were they useful? What would you do to make them better? Do you think that the problems with these open access databases are the sort of 'growing pains' that any new technology or database goes through, or is there something unique about developing open access chemistry databases?

Joshua


Joshua Finkelstein (Associate Editor, Nature)
