Making names and descriptions available to all

Three Correspondence letters in this week’s Nature (447, 142; 2007) all concern information on the web, in rather different ways.

Mark Gerstein and colleagues raise the oft-discussed question of structured abstracts in journal articles: that is, abstracts that use bold headings to introduce each part of the text. The difference here, though, is that the structured abstract is intended for digital publication and would use a web standard such as XML or OWL to allow automated literature mining.
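To make the idea concrete, here is a minimal sketch of how a program might read an XML-structured abstract. The element and attribute names (`abstract`, `section`, `heading`) and the placeholder text are assumptions made for illustration; they are not taken from the letter or from any published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured abstract; the tag names and text are placeholders.
ABSTRACT_XML = """
<abstract>
  <section heading="Background">Placeholder background text.</section>
  <section heading="Methods">Placeholder methods text.</section>
  <section heading="Results">Placeholder results text.</section>
</abstract>
"""


def parse_abstract(xml_text):
    """Return a dict mapping each section heading to its text."""
    root = ET.fromstring(xml_text.strip())
    return {
        section.get("heading"): (section.text or "").strip()
        for section in root.findall("section")
    }


sections = parse_abstract(ABSTRACT_XML)
for heading, text in sections.items():
    print(f"{heading}: {text}")
```

Because each part of the abstract is labelled, a mining tool can pull out, say, only the methods of every paper without guessing where that section starts and ends.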

In another letter, Douglas Crawford highlights the Human RefSeq database as a standard for genes that have more than one name, a common occurrence. Associations between genes can be made accurately only when the gene and all its synonyms can be correctly identified. If genes in a publication were identified via RefSeq, genomic analysis would be more likely to identify genes of common interest.
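The synonym problem can be sketched in a few lines. The example below assumes a lookup table that maps every known name of a gene to a single identifier; the gene names, accession strings and paper labels are all invented placeholders, not real RefSeq records, and the code does not query RefSeq itself.

```python
from collections import defaultdict

# Hypothetical synonym table: every name for a gene maps to one placeholder ID.
SYNONYMS = {
    "GENE_A": "ACC_0001",
    "GENE_A_ALIAS": "ACC_0001",
    "GENE_B": "ACC_0002",
}


def normalize(name):
    """Return the shared identifier for a gene name, or None if unknown."""
    return SYNONYMS.get(name.strip().upper())


def papers_by_gene(papers):
    """Group paper labels by the normalized identifier of each gene mentioned."""
    index = defaultdict(set)
    for paper_id, names in papers.items():
        for name in names:
            accession = normalize(name)
            if accession is not None:
                index[accession].add(paper_id)
    return dict(index)


# Two papers that use different names for the same gene.
papers = {
    "paper1": ["gene_a"],
    "paper2": ["Gene_A_Alias", "gene_b"],
}
index = papers_by_gene(papers)
print(index)
```

Without the shared identifier, the two papers would appear to discuss unrelated genes; with it, both are filed under the same entry.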

Finally for this week, Quentin Wheeler and Frank Krell comment on a Commentary in Nature’s Linnaeus special issue. They say that mandatory online registration of taxonomic names should accompany any new species description, to ensure true accessibility and knowledge.


  1. Mark Wall said:

    I see how structured abstracts may additionally accompany literature articles to facilitate automated data mining, but why have “literature” at all in this context? Why not establish mandatory global public databases for deposition of data that fall into easily defined categories? Please let there remain some room for literary ways of describing novel ideas.

    Mark Wall

  2. barend mons said:

    The ongoing discussion about the database revolution (Nature 445, 229-230; 2007) and the role of human curation (Gerstein, Nature 447, 142; 2007) versus ‘text mining’ (Hahn, Nature 448, 130; 2007) misses one very important point. The debate so far suggests that recovering facts from texts presents an either-or dilemma. However, computational analysis of text (which now goes far beyond classical text mining) can be combined with the involvement of the expert community in correcting facts mined from existing and newly created texts. I have addressed this dilemma before in an editorial (Which gene did you mean?, BMC Bioinformatics 6, 142; 2005), but the field has since progressed significantly. The expert community, including the original authors of manuscripts, can be assisted by on-the-fly computational analysis of their text that suggests the implicated facts. This need not be restricted to new articles, but can be used for each author’s legacy publications as well, with the aim of going ‘from texts to facts’.
