

Making names and descriptions available to all

Three Correspondence letters in this week's Nature (447, 142; 2007) all concern information on the web, in rather different ways.
Mark Gerstein and colleagues raise the oft-discussed question of structured abstracts in journal articles: that is, abstracts that use bold headings to introduce each part of the text. The difference here, though, is that the structured abstract is intended for digital publication and would use a web standard such as XML or OWL to allow automated literature mining.
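To make the idea of a machine-readable abstract concrete, here is a minimal sketch in Python. The element and attribute names (abstract, section, heading) and the sample text are invented for illustration; they are assumptions, not a format proposed in the letter.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured abstract. The tag and attribute names are
# assumptions made for this sketch, not a published schema.
ABSTRACT_XML = """
<abstract>
  <section heading="Background">Many genes have several names.</section>
  <section heading="Results">Shared identifiers link synonyms.</section>
</abstract>
"""

def sections(xml_text):
    """Return a dict mapping each section heading to its text."""
    root = ET.fromstring(xml_text.strip())
    return {
        s.get("heading"): (s.text or "").strip()
        for s in root.findall("section")
    }

if __name__ == "__main__":
    for heading, text in sections(ABSTRACT_XML).items():
        print(heading + ": " + text)
```

Because each part of the abstract is labelled, a mining program can ask for the results alone rather than guessing where they begin in free text.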
In another letter, Douglas Crawford highlights the Human RefSeq database as a standard for genes that have more than one name, which is a common occurrence. Associations between genes can be made accurately only when the gene and all its synonyms can be correctly identified. If genes in a publication were identified via RefSeq, genomic analysis would be more likely to identify genes of common interest.
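The synonym problem can be sketched as a simple lookup. In the Python example below, the gene names and identifiers are placeholders invented for illustration; they are not real gene symbols or RefSeq accessions, and the helper names are hypothetical.

```python
# Hypothetical synonym table: every known name for a gene maps to one
# shared identifier. Names and identifiers are placeholders, not real
# gene symbols or RefSeq accessions.
SYNONYMS = {
    "GENE-A": "ID_0001",
    "GENEA": "ID_0001",
    "ALPHA1": "ID_0001",
    "GENE-B": "ID_0002",
}

def normalize(name):
    """Return the shared identifier for a gene name, or None if unknown."""
    return SYNONYMS.get(name.strip().upper())

def shared_genes(paper_a, paper_b):
    """Identifiers mentioned in both papers, whatever names were used."""
    ids_a = {normalize(n) for n in paper_a} - {None}
    ids_b = {normalize(n) for n in paper_b} - {None}
    return ids_a & ids_b

if __name__ == "__main__":
    # The two papers use different names for the same gene.
    print(shared_genes(["genea"], ["Alpha1", "gene-b"]))
```

Comparing the raw names would find no overlap between the two papers; comparing the shared identifiers finds the gene they have in common.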
Finally for this week, Quentin Wheeler and Frank Krell comment on a Commentary in Nature's Linnaeus special issue. They say that mandatory online registration of taxonomic names should accompany any new species description, to ensure true accessibility and knowledge.

Comments

I see how structured abstracts may additionally accompany literature articles to facilitate automated data mining, but why have "literature" at all in this context? Why not establish mandatory global public databases for deposition of data that fall into easily defined categories? Please let there remain some room for literary ways of describing novel ideas.
Mark Wall

The ongoing discussion about the database revolution (Nature 445, 229-230; 2007) and the role of human curation (Gerstein, Nature 447, 142; 2007) versus 'text mining' (Hahn, Nature 448, 130; 2007) misses one very important point. The debate so far suggests that recovering facts from texts presents an either-or dilemma. However, computational analysis of text (which goes far beyond classical text mining) can now be combined with the involvement of the expert community in correcting facts mined from existing and newly created texts. I have addressed this dilemma before in an editorial (Which gene did you mean?, BMC Bioinformatics 6, 142; 2005), but the field has since progressed significantly. The expert community, including the original authors of manuscripts, can be assisted by on-the-fly computational analysis of their text that suggests the implicated facts. This is not necessarily restricted to new articles, but can be applied to each author's legacy publications as well, with the aim of going 'from texts to facts'.
