Archive by category: Data availability

Preservation of content in electronic journals

Via Knowledgespeak press release:
Two years after a meeting calling for urgent action to preserve scholarly e-journals, the results of a survey of 1,371 library directors of four-year colleges and universities in the United States have been released.
Most library directors who responded believe their own institution has a responsibility to take action to prevent intolerable loss of scholarly records. But although larger libraries support one or more e-journal preservation initiatives, most respondents from smaller libraries have yet to support any preservation effort or to secure permanent access to e-journals for their institutions.
The survey, conducted by Portico and Ithaka, raises questions about how the responsibility for preservation of critical electronic resources should be supported by the community, even as electronic resources expenditures expand substantially at libraries across the spectrum. The organizers hope that the report will be a catalyst for leaders of libraries, consortia, and other organizations to provide a mechanism for digital preservation. The full report is available for download as a PDF. (A summary is available here.) Readers are also invited to share comments and reactions in the provided online discussion space.

Proposal for a centralized grant repository

Noam Y. Harel of Yale University writes in Nature's Correspondence page (Nature 452, 409; 2008):

Writing grant proposals is difficult enough; keeping track of different deadlines makes for an endless cycle of procrastination and frantic preparation. The added stack of bureaucratic forms, with arcane variations from agency to agency, can tip one over the edge as a deadline nears.
Is it almost too obvious to wish for a centralized proposal repository? Investigators could submit proposals at any time, in a common format that highlights the science rather than obliterates it with red tape. Funding agencies could search the repository for proposals matching their interests. A minimum of bureaucratic information would be required up front. Budget details could be worked out between funding agencies and investigators as necessary.
Ideally, all proposals would be publicly accessible. However, most of the scientific community has not yet accepted the inevitable dawn of truly open science. Submissions to a central repository could therefore be made accessible only to funding agencies that agree to keep proposals private (unless a submitting investigator indicates a willingness to share his or her proposal publicly).
The repository would make life easier for scientists by eliminating the hassle of searching for suitable grant mechanisms and the stress of meeting various deadlines. It would make life easier for funding agencies by expanding the pool of applications from which to choose. Of course, the best proposals could attract offers from multiple agencies. Rather than forcing investigators to choose non-overlapping sources of funding for each project, why not use the repository to mediate shared funding agreements that could benefit everyone involved? In effect, it would serve as the mediator between grant-seekers and grant-providers.
In a world where eBay, Facebook and Google powerfully demonstrate the communal nature of the Web, it is a pity that scientists and funding agencies don’t have a similarly modern forum for matching their interests and offers.

Consistent guidelines for clinical interventions

An Institute of Medicine report recommends that the United States government create a programme to provide consistent guidelines for clinical interventions. The reliability of the guidelines will depend on the availability of the clinical data to be assessed, according to this month's (March 2008) Editorial in Nature Medicine (14, 223; 2008).
The problem is that "Widespread regional variation in how health care providers treat some conditions in the United States reflects the sobering fact that, for many interventions, there is no consensus about what constitutes effective clinical care. Physicians and health care providers must try to make sense of innumerable and conflicting guidelines in order to choose the best available intervention for their patient. Scientific, systematic review of data from medical literature and clinical trials is crucial to forming a reliable evidence base of what actually works in health care. With this in mind, professional medical organizations, patient advocacy groups, government agencies and others have synthesized available data on the efficacy of particular interventions and have produced guidelines recommending certain courses of action for specific conditions. The problem is that there is no consensus among the approaches to systematic review, and, more troublesome, no clear understanding of the best methods for assessing the evidence."
The Institute of Medicine has stepped in to recommend a plan to help resolve conflicting medical advice (reported in a news story at Nature Medicine 14, 226; 2008) by three methods: first, identify interventions that are priorities for evaluation; second, develop standardized and reliable methods for performing systematic reviews of all the available data about a given intervention; and third, develop standards for producing clinical guidelines. The Editorial discusses some of the practical difficulties, concluding that the Institute of Medicine report is an important step forward but will require legislation if it is to work.

Nature Methods recommends deposition of proteomics data

Starting this month (March 2008), Nature Methods strongly recommends deposition of proteomics data to public repositories before manuscript submission. From the Editorial in the March issue of the journal (Nat. Meth. 5, 209; 2008):
"Several proteomics data repositories are now available that differ in terms of their goals, structure and the formats they accept. They include PRIDE, PeptideAtlas, Global Proteome Machine Database (gpmDB) and the file distribution system Tranche. The newest addition, Human Proteinpedia, is a community-based annotation tool that hosts experimental data (Nat. Biotechnol. 26, 164; 2008).
Importantly, the major database administrators have shown their willingness to work with users and with each other to facilitate data deposition. At this stage, the process can still be labor-intensive, but a repository like PRIDE provides extensive technical assistance. Under the umbrella of the ProteomExchange consortium, the major repositories are also devising ways to share their data in a collaborative fashion, capitalizing on their complementarities to minimize submission hassle while maximizing benefits.
We support these efforts and consider it premature to recommend a particular repository. Rather we will rely on community experience to determine which database or combination of databases emerges as the most useful. However, there are specific features that editors favor. In particular, we like the possibility currently offered by PRIDE and Human Proteinpedia to provide peer reviewers with access to datasets associated with a manuscript before public release, in an anonymous fashion, and to coordinate public release of the data with publication."

Nature Methods welcomes comments on this Editorial, and the recommendations it makes, at the journal's blog Methagora.
The Nature journals' policies on data and materials availability, including links to editorials on these policies, can be found at the author and reviewers' website.

Non-traditional publishing choices for biologists

Zeba Wunderlich and Kishore Kuchibhotla of Harvard University write in Nature's Correspondence page (451, 887; 2008):
The paramount importance of publishing in biology dissuades many young scientists from making non-traditional choices with regard to where and how we publish our work. My colleagues and I believe it is in our own interests to identify the shortcomings of traditional publishing and to explore other publishing possibilities that are free of those problems.
What can we do? First, learn about our options. There are several innovative developments poised to change the publishing landscape dramatically. Video publications, preprint archives and high-throughput online journals are but a few that have recently surfaced (for a discussion, see Nature Network's Publishing in the New Millennium forum). The onus is on all of us to investigate these resources and to consider how they might enrich our science.
To make a difference, we also need to contribute. Frustrated by technical difficulties in reproducing published experiments? Then publish a video protocol in the Journal of Visualized Experiments. Have you benefited from a colleague's comments at a conference? Then extend the experience, and comment on articles published by PLoS One and posted on Nature Precedings. These initiatives will take hold and achieve their full potential only with strong support from the scientific community.
If we collectively embrace these ideas, publishing will become more effective. Although the psychological and social barriers to submitting a contribution initially are surprisingly high, becoming involved has proved to be rewarding. Ultimately, scientific progress and the published record have a symbiotic relationship — improved communication will enhance the pace, progress and efficiency of research.
[Note added by Maxine: In addition to the resources mentioned above, Nature Protocols is an online resource which welcomes the upload of protocols, in video or written form, and provides users with an interactive network for comments and additions.]

Research Information Network on data stewardship

The UK Research Information Network (RIN) has published a framework of key principles and guidelines on the stewardship of digital research data for research institutions, libraries, publishers, societies and funders, following more than a year of wide consultation among these groups. A summary of the framework is available as a two-page PDF, and the full report as a 16-page PDF.
The framework not only addresses the basic issue of preserving research data, which is essential for evaluating and re-assessing results, but also identifies new approaches to managing and providing access to the data in an era of digitization, new technologies, aggregation and "adding value" to data by re-use.
The framework document identifies five key principles, given here in abbreviated form:
1. The roles and responsibilities of researchers, research institutions and funders should be defined, with codes of practice to ensure that creators and users of research data are aware of and fulfil their responsibilities.
2. Digital research data should be created and collected in accordance with international standards.
3. Digital research data should be easy to find, and access should be provided in an environment which maximises ease of use, and which provides credit for and protects the rights of those who have gathered or created data, and/or who have legitimate interests in how data are made accessible and used.
4. Models and mechanisms for managing and providing access to digital research data must be both efficient and cost-effective.
5. Digital research data of long-term value arising from current and future research should be preserved and remain accessible for current and future generations.
The full details are available at the RIN website.


Protein structures in the public domain

Aled Edwards of the Structural Genomics Consortium, University of Toronto, writes in a Correspondence in this month's issue of Nature Structural & Molecular Biology (15, 116; 2008):
The Structural Genomics Consortium (SGC) is a public-private partnership that places the three-dimensional structures of proteins of relevance to human health into the public domain without restriction on use. Over the past 3 years, the SGC has deposited the structures of more than 550 proteins from its Target List into the Protein DataBank (PDB); this accounts for about one-quarter of the new structures of human proteins in the PDB over this period ('new' is defined as <95% sequence identity to proteins whose structures were already available in the PDB) and the majority of the new structures from the human parasites that cause malaria, cryptosporidiosis and toxoplasmosis. Over the next 4 years, the SGC is committing to determining the structures of another 600 proteins from its Target List, including eight human integral membrane proteins.
The SGC has been releasing the coordinates for all the SGC structures into the PDB immediately after they meet the SGC quality criteria, even if the ultimate intention is to describe the work in the peer-reviewed literature. This data release policy, which has often meant that coordinates were available for several months before the manuscript was even written, has not limited the ability of our scientists to publish.
In keeping with our policy to make our data available as soon as possible, the SGC is now also providing 'pre-released' coordinates on its website when a new SGC structure is submitted to the PDB, allowing scientists to access the structural information while the deposition files are being processed. Scientists should ensure that the revised coordinate file is downloaded once it is released by the PDB.
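The letter's working definition of a 'new' structure (less than 95% sequence identity to any protein whose structure is already in the PDB) can be illustrated with a minimal sketch. This is not the SGC's or the PDB's actual procedure: the function names are hypothetical, the sequences are assumed to be already aligned to equal length with '-' marking gaps, and identity is computed only over columns where neither sequence has a gap, which is one of several conventions in use.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences (assumed convention:
    columns containing a gap '-' in either sequence are ignored)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def is_new(identities, threshold: float = 95.0) -> bool:
    """A protein counts as 'new' if its identity to every protein with a
    known structure is strictly below the threshold (95% in the letter)."""
    return all(value < threshold for value in identities)
```

For example, `is_new([percent_identity("ACDE", "ACDF")])` evaluates the 95% criterion against a single known structure; a real comparison would first require a proper sequence alignment against every relevant PDB entry.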

A 'third way' for privatizing biomedical research

Ron A. Bouchard of the University of Alberta, and Trudo Lemmens of the University of Toronto, write in a Commentary in this month's Nature Biotechnology (Nat. Biotechnol. 26, 31-36; 2008) that the allocation of risks and benefits of publicly sponsored biomedical research is becoming increasingly skewed toward for-profit entities and against the public interest. A legitimate solution to this imbalance would be to levy compulsory government royalty fees on commercial products made possible by public efforts.
The authors argue that "public–private partnerships can be particularly valuable in circumstances involving large transaction costs associated with novel biomedical inventions aimed at the global public good. That said, a combination of self-interest and anxiety in the face of globalization has led to wide swings of the pendulum of S&T policy and scholarship in recent years, with argument for expansive IPR rights on the one hand and their abolition in favor of a completely open source model on the other. Neither position is likely to be balanced or workable over the long term, as both may skew too far to private or public interests." A compulsory government royalty on technologies commercialized using public money, they argue in their Commentary, is a necessary 'third way' to protect the interests of for-profit entities and those of the public.

Where did the scientific method go?

Michela Noseda of Imperial College London and Gary R. McLean of the University of Texas Health Science Center write in this month's Nature Biotechnology (26, 28–29; 2008) in response to the Brief Communication published by Mazor et al. in the May 2007 issue (Nat. Biotechnol. 25, 563–565; 2007). What bothers Noseda and McLean is not the article itself, but that it contains, they write, "a lack of documented methodology and information that is essential to faithfully reproduce the science claimed in the manuscript. Surely, the aim of scientific publication is to disseminate scientific information to further advance our knowledge and to allow others to use such information for expansion and possible improvements to the work. Mazor et al. are clearly not the only authors being forced into abbreviated paper formats that follow this trend, which suggests the problem goes significantly deeper.
Admirably, Nature has recently implemented new guidelines for the addition of methods to their published research articles and letters. Authors are given multiple options for the appropriate presentation of methods within their manuscripts, avoiding the demotion of Methods to the supplementary section. This approach should be commended and we hope adopted universally by additional scientific periodicals. Aside from these rules, we should all make an extra effort as authors and reviewers to ensure that scientific methodology resumes its rightful position as the foundation of basic scientific research."

The Nature Biotechnology editors respond (Nat. Biotechnol. 26, 29; 2008):
"Noseda and McLean raise interesting points. With regard to the ability to reproduce a paper's methodology and findings, the fact that descriptions of methods in Supplementary Material online are not copy edited for grammar or clarity at Nature Biotechnology (or at any other Nature monthly journal) could be argued to potentially compromise the lucidness and ease with which a reader can repeat a published experiment. As the authors also point out, Nature's new guidelines for the addition of methods to its published papers provide authors with flexibility in how to present their methods within the final printed issue and online. One additional benefit to Nature's approach, not mentioned by Noseda and McLean, is that references to methods or protocols that appear in the Methods section remain in the printed paper rather than being relegated to online only (where they are less likely to be cited). We would welcome feedback from our readers as to whether they feel Nature Biotechnology should follow a similar model to Nature."

Expanded licence for reuse of genome papers

From an Editorial in today's (6 December) issue of Nature (450, 762; 2007):
Although Nature and the Nature journals are built on a business model funded by subscribers and other sources of revenue, various initiatives have been implemented to enhance the accessibility of the research papers published in these journals.
They have long been freely available to researchers in the 100 or so poorest countries through the World Health Organization's Hinari initiative and others like it. Machine access is being enhanced by the open text-mining initiative of the Nature Publishing Group (NPG). Preprints of original versions of papers can be deposited in arXiv and Nature Precedings without compromising their acceptability for publication. And final authors' versions of papers can be deposited in PubMed Central and other public servers from six months after publication. Authors retain copyright of their work, whereas NPG retains the licence to publish it.
For many years, a more generous arrangement has been made for papers reporting full genome sequences. (The paper reporting the sequence and analysis of 12 species of Drosophila is the most recent example, see Nature 450, 203; 2007). These papers are freely accessible on NPG's website from the moment of publication. This recognizes a consistent character of 'genome' papers: they represent the completion of a key and fundamental research resource, describing and reflecting on what has been revealed but not usually providing insights into mechanism. Although some papers in other disciplines might also be characterized in this way, the fundamental character of the genome has led NPG to make a systematic exception.
In the continuing drive to make papers as accessible as possible, NPG is now introducing a 'creative commons' licence for the reuse of such genome papers. The licence allows non-commercial publishers, however they might be defined, to reuse the pdf and html versions of the paper. In particular, users are free to copy, distribute, transmit and adapt the contribution, provided this is for non-commercial purposes, subject to the same or similar licence conditions and due attribution.
In 1996, as human genome sequencing was getting under way, leading players stated: "It was agreed that all human genomic sequence information, generated by centres funded for large-scale human sequencing, should be freely available and in the public domain in order to encourage research and development and to maximise its benefit to society". These principles have continued to guide the field, and NPG has consistently made genome papers freely available in keeping with them. This new licence allows us to formalize the arrangement.

Public accessibility of scientific databases

Last year, Nature Biotechnology ran an Editorial about the failure of a biological database:

Six weeks ago, the rights to one of biology's premier public databases were quietly sold to an informatics startup. The database in question, the Biomolecular Interaction Network Database (BIND), is arguably the most comprehensive freely accessible protein-protein interaction database available to the research community. Yet through a combination of bureaucratic delays, Canadian government fiscal nitpicking and a lack of community consensus, this important resource now finds itself on life support, its survival precariously linked to that of Unleashed Informatics, a private venture founded last April with little more than $1.0 million in seed funding from Sun Microsystems. BIND is a database of molecular associations that collates high-throughput data submissions and hand-curated information from the scientific literature……
(From Nature Biotechnology 24, 115; February 2006.)

One correspondent disagreed with the Editorial's assessment and wrote that in his opinion the enterprise had been a waste of taxpayers' money.

Rather than arguing for the importance of long-term database funding by granting agencies, BIND's saga in fact argues for greater caution and more demanding oversight when these agencies elect to fund a database's initial development.
(W. Busa, Nature Biotechnology 24, 1095; September 2006).

Now, some months later, the journal is able to publish a response from one of BIND's creators, and from another correspondent in support of the database:

On March 20 this year, Thomson Scientific (Philadelphia) acquired the BIND database together with a stable of software and services through the purchase of Unleashed Informatics (Toronto). These products were originally created by my laboratory using public funds. They were the intellectual property of my former host institution, Mount Sinai Hospital, in accordance with its employment contracts and policies. Confidentiality constraints from the outset of the discussion with Thomson Scientific, which predated Busa's letter, prevented me from addressing Busa's comments at the time. I would now like to address several misapprehensions and inaccuracies in his comments..........BIND has always had the broadest scope of any interaction database (all organisms) as well as the deepest annotation (down to atomic three-dimensional structures). BIND curators extracted information from figures—a feat no text mining tool can do and 85% of hand-curated BIND records have information arising from figures. It is the breadth, depth and quality of BIND that led to its commercial acquisition. And this was pursued only after having exhausted all possible means for continued public support.......
(C. Hogue, Nature Biotechnology 25, 971; September 2007.)
Researchers may not mind paying for the luxury of specialized databases, but data registries that cater to a broad set of users should be broadly and freely accessible to the research community. Although the initial development of databases, such as BIND, requires caution and close oversight of budgets, an equally important aim should be to ensure that data repositories of particular utility to the research community remain sustainable and publicly accessible. Databases, such as BIND, should not be left to the private sector. Ensuring public accessibility to data essential for research progress is the responsibility of the central planner, not Adam Smith's invisible hand in the marketplace.
(K. Wang, Nature Biotechnology 25, 971-972; September 2007.)

Nature journal policies on proteomics data

The August editorial in Nature Biotechnology, 'Time for leadership' (Nat. Biotechnol. 25, 821; 2007) describes how the example set by leading proteomics laboratories will be a major factor in determining the successful implementation of new reporting guidelines in the wider community.
The August issue of the journal includes two perspectives that propose reporting guidelines for proteomics and molecular-interaction data sets (p. 887 and p. 894). The "minimum information about a proteomics experiment" (MIAPE) and an associated module on molecular interaction experiments (MIMIx) were developed by the Proteomics Standards Initiative of the Human Proteome Organization with the aim of standardizing the reporting of proteomics research.
The editorial goes on to state: "Whether Nature Biotechnology ultimately elects to require compliance with the MIAPE guidelines will depend on their reception by the scientific community. This March, we began recommending (not requiring) that proteomics and molecular-interaction data sets be deposited in a public repository before the associated manuscript is submitted to this journal (Nat. Biotechnol. 25, 262; 2007). But we would not consider enforcing the MIAPE guidelines until such time as the proteomics community has reached a consensus that the benefits of compliance outweigh the burden.
Before this can happen, at least two critical pieces of infrastructure must be in place. First and foremost, appropriate software tools must be developed and made freely available to all. Second, databases must improve their capabilities for transferring and storing MIAPE-compliant data sets."
We welcome your comments as the Nature journals further develop their policies in this area.

August editorials on sharing, naming and credit

The Nature journals this month (August) feature several editorials on the publishing process. A short round up (with links) follows:

Nature Genetics (39, 931; 2007), in 'Compete, collaborate, compel', calls for procedures for microattribution to be established by journals and databases so that data producers have an overwhelming incentive to deposit their results in public databases and thereby to receive quantitative credit for the use of every published data accession.

In 'Got data?', Nature Neuroscience (10, 931; 2007) points out that data sharing is not only good citizenship for researchers, but is also required by funding agencies and many journals. The scientific community needs to develop better incentives to encourage compliance and reward those who share.

And in 'Name that gene!', Nature Structural & Molecular Biology (14, 681; 2007) warns that scientists coin new terms, or neologisms, at a tremendous pace, but name choice can have unforeseen results.

Automated structured abstracts

Udo Hahn and colleagues add to the discussion "making data available to all" by describing the benefits of automated, as opposed to manual, structured abstracts (see Nature 448, 130; 2007). They write:

Mark Gerstein and colleagues in Correspondence (Nature 447, 142; 2007) propose that journals should require authors to manually provide structured abstracts to facilitate text mining of biological information. There are three main difficulties in implementing such a proposal.
First, life-science terminologies are huge, diversified and complex. This means that identifying the correct content descriptors is almost impossible for inexperienced users of online term repositories. For example, Medical Subject Headings, the International Classification of Diseases and Gene Ontology are high-volume (tens of thousands of terms) and structurally complicated terminological systems, each with different design rationales, naming conventions and principles of structural organization. Even human indexers, search specialists and database curators with routine exposure to these resources have to invest much effort in understanding and keeping track of their content as well as terminological updates and revisions. Will scientists find the time to dive so deeply into this alien terminological territory, and be capable of finding exactly what they are looking for?
Second, the coverage of existing terminologies for the many subdomains in the life sciences is incomplete. The two main terminological umbrella systems for the life sciences, the Unified Medical Language System and the Open Biomedical Ontologies, contain impressive numbers of individual terminologies, but their coverage of the life sciences is still fragmentary and suffers from varying depths of description. The size of the terminology gap is likely to be even more pronounced if authors were required to encode relational descriptions, for example indicating a binding relation between two specific proteins, P1 and P2, by Bind(P1, P2), because such a vocabulary has not yet been determined.
Third, the quality and reliability of author-supplied content descriptions is quite a hurdle. Even if the first and second problems were to be solved, human indexers, even professional ones, are liable to error as well as to the possibility of intrinsic subjective bias (M. E. Funk and C. A. Reid Bull. Med. Libr. Assoc. 71, 176–183; 1983). This is not to say that authors of a structured abstract would consciously cheat, but rather there is a grey area of overstatement and overestimation of one's own results in a highly competitive scientific environment. If authors' structured entries were subject to peer review together with the submitted article, this would be more work for the reviewers as well as the authors — neither of them likely to have been trained as terminologists.
As an alternative, we suggest automated procedures for knowledge capture in which neither the authors nor the reviewers are in the loop. There has been significant progress in automatic text mining and information extraction as well as in the methodological foundations of life-science terminologies in terms of ontologies, knowledge representation languages and semantic encoding standards. These efforts in automating the generation of content descriptions and linking them directly to biological databases are strongly experimentally founded and would help to avoid additional workload and subjectivity — see, for example, the BioCreAtIvE competition results. Once automated mechanisms for content analysis are applied, this also increases the coverage and the recency of the literature entered into biological databases, as human input is complemented by computationally generated content.
Udo Hahn, Joachim Wermter
Friedrich Schiller University Jena, Germany
Rainer Blasczyk & Peter A. Horn
Hannover Medical School, Germany
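The relational description mentioned in the letter's second point, Bind(P1, P2), can be made concrete with a small sketch of how such an entry might be represented and parsed. Everything here is illustrative: the `Relation` class and `parse_relation` function are hypothetical names, and, as the letter notes, no agreed vocabulary for such relations had been determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One relational descriptor, e.g. Bind(P1, P2). Hypothetical structure."""
    predicate: str
    args: tuple

    def __str__(self) -> str:
        return f"{self.predicate}({', '.join(self.args)})"


def parse_relation(text: str) -> Relation:
    """Parse a string such as 'Bind(P1, P2)' into a Relation."""
    name, sep, rest = text.strip().partition("(")
    name = name.strip()
    if not name or not sep or not rest.endswith(")"):
        raise ValueError(f"not a relation: {text!r}")
    # Split the argument list on commas and drop surrounding whitespace.
    args = tuple(a.strip() for a in rest[:-1].split(",") if a.strip())
    return Relation(name, args)


bind = parse_relation("Bind(P1, P2)")
```

Even this toy version shows where the authors' concern arises: the parser accepts any predicate and any argument names, so the hard part is not the syntax but agreeing on, and correctly choosing, the controlled terms that fill it.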

What is "open science"?

Frank Gibson, a Research Associate at Newcastle University, UK, currently working on the e-neuroscience project CARMEN, has written an essay, Do scientists really believe in open science?, in which he collects current opinions of “Open Science”. He was stimulated to write the essay because of his role in the CARMEN project which, he writes, has exposed him to a domain of the life-sciences to which "data sharing and publicly exposing methodologies has not been readily adopted, largely it is claimed due to the size of the data in question and sensitive privacy issues."

The essay is available here. It addresses definitions of "open science" and summarizes the standards used in disciplines other than neuroscience. You can see the Nature journals' policies on data availability here, which apply to all the original research articles our journals publish. Via this web page, you can provide us with your comments and views on recent journal editorials about emerging policies on data availability in a range of disciplines and circumstances.

Among other aspects of "open science", Dr Gibson discusses the "open notebook" approach pioneered by J-C Bradley. He also notes that Postgenomic produces an "up-to-the minute list of the open science discourse". (Postgenomic is a website that tracks hundreds of science blogs and "does interesting things with that data".) "Although early days", continues Dr Gibson, "maybe even the 'open science group' on Scintilla (still undecided on Scintilla) will be the place in future for fostering the open science community".
Scintilla is one of Nature Publishing Group's very latest products. It collects data from hundreds of news outlets, scientific blogs, journals and databases and then makes it easy for you to organize, share and discover exactly the type of information that you're interested in. For example, you can keep track of life science podcasts, or the latest papers on schizophrenia, DNA methylation, physics or immunology. It is free to join, so take a look at what it has to offer and, if you wish, contribute to the open science group, or join one of the many other interest groups there.

Corrigendum for Nature paper on stem cells

The authors of a controversial paper on stem cells publish a correction of their work in this week's issue of Nature (447, 880-881; 2007) but state in it that the errors do not affect the conclusions of the article. A News story also in this week's issue (Nature 447, 763; 2007) describes how the paper in question, published in 2002, claimed to find evidence for so-called 'multipotent adult progenitor cells', or MAPCs, in mouse bone marrow (Y. Jiang et al. Nature 418, 41–49; 2002). The work was led by Catherine Verfaillie, now director of the Stem Cell Institute at the Catholic University of Leuven.
From the News story: The paper challenged the prevailing idea that only stem cells derived from embryos were highly flexible. Some of its results have been reproduced by other labs, but no one has been able to replicate the work independently in its entirety. "I believe that despite the hype over the mistake, we and Nature made the conclusion that the final findings of the paper still stand," says Verfaillie.
This February, an investigation convened by the University of Minnesota — Verfaillie's former institution — found that her group had used incorrect procedures in the Nature paper, and that some of the data contained in it might be flawed. The investigation was a response to questions from a reporter from the magazine New Scientist, who pointed out that the figure from the Nature paper that has now been corrected was partly reproduced with different labels in another paper in another journal, Experimental Hematology (Y. Jiang et al. Exp. Hematol. 30, 896–904; 2002).
In response to the investigation, Nature convened a peer-review panel to analyse the data from the 2002 paper. According to Nature, the experts concluded that although the figure data were flawed, the paper's conclusions are still valid. No allegations of fraud or misconduct have been levelled at Verfaillie or anyone from her group. Verfaillie says her group cannot explain how the errors in the Nature paper occurred: "Why this happened, we have not been able to determine," she says.

Patent information could aid replication

Harry Thangaraj of the Oxford Centre for Innovation writes in Nature's Correspondence this week (Nature 447, 638; 2007):
Your News Feature 'The hard copy' (Nature 446, 485–486; 2007) accurately highlights the limited availability of information on stem-cell research methodologies — owing to competition among labs, the commercial value of such information and space restrictions in high-quality journals — which contributes to other labs' inability to replicate and verify the results.
It might sometimes repay scientists to look beyond conventional journals for information, in this or other disciplines, particularly to patents or patent applications. Thanks to the strict enablement requirements of patent law and patent offices in relation to inventions, one can often find more detailed methodology in patent documents than in journals with severe page limits.
A very good example of comprehensive detail in certain non-embryonic stem-cell methodologies is a PCT application WO/2006/028723 (Non-Embryonic Totipotent Blastomere-Like Stem Cells and Methods Therefor), which includes surgical procedures in organ removal, isolation of cells, and composition and preparation of culture media. In this instance, the level of detail and volume of text relating to methodology far exceeds that which many peer-reviewed journals can accommodate.
Some journals publish methodology and protocols online as Supplementary Information to the main paper or in separate publications (an example is Nature Protocols, which encourages user comments). Often, though, journals are only starting points in complex paper trails related to methods. In these circumstances, patent documents could contain the most methodology related to an invention in a single document.

A new look for chemical information

In its June Editorial, which is freely available, Nature Chemical Biology (3, 297; 2007) reports on new online features to enhance interdisciplinary communication and to increase the accessibility of chemical information for readers.

Most published chemical content is traditionally contained in the schemes, figures and tables of scientific papers. Authors also use abbreviations, acronyms or numbering schemes to identify specific molecules. Though these shorthand notations simplify the presentation of chemical information, they tend to make chemical papers less accessible to the general reader. This is a concern for chemical biology articles, which are intended to attract an interdisciplinary audience. Moreover, since the advent of the Internet, the way by which scientists acquire scientific information has changed. Though some scientists continue to read journal articles in print, most turn to the online HTML and PDF versions of published manuscripts. This expanded use of electronic resources offers an excellent opportunity to make chemical information more accessible and user-friendly to readers of scientific papers.

The Editorial provides details of the resources now available to authors and readers, and asks for your evaluation of what has been done so far, and your 'wish list' for new chemical or biological functionality that will foster communication and collaboration between researchers at the interface of chemistry and biology.

Integrating scientific cultures

In a meeting report in the current issue of Molecular Systems Biology (3, 105; 2007), Trey Ideker, Vineet Bafna and Thomas Lemberger write that "a key challenge of systems biology is that it must integrate several disciplines, each with a very different culture for disseminating results. Within biology, manuscripts describing new work are almost always published in peer-reviewed periodicals. In contrast, within computer science and the engineering fields, new methods and results are typically presented as full-length papers at meetings and workshops. Just as journals have editorial boards that handle review of manuscripts, such conferences assemble large and reputable programme committees, which fulfill the same purpose. Publication in the best conferences, as for the best journals, is highly competitive.

This past December, several hundred scientists convened in La Jolla, California for the Second Annual RECOMB Workshop on Systems Biology (December 1–3, 2006). The meeting, which was held jointly with the RECOMB Workshop on Computational Proteomics, took place at the California Institute for Information Technology and Telecommunications in the University of California San Diego campus. RECOMB, which stands for Research in Computational Biology, has for a decade sponsored conferences that attract high-quality papers in bioinformatics, primarily from computer science.

In an effort to integrate the computational and experimental biology communities, RECOMB and Molecular Systems Biology entered into a partnership by which original, peer-reviewed papers are presented orally at the Workshop on Systems Biology and then appear as full-length manuscripts in the pages of the journal. The precise publication model was formulated after much discussion between the editors of the journal and the organizers of RECOMB. It is original and, we hope, will serve as a case study for future conferences."

See the Molecular Systems Biology website for more news of this project.

Share your lab notes

Here is the full text of an Editorial in today's Nature (447, 1-2; 3 May 2007), which is freely available. For further details of the Nature journals' policy on fraud and fabrication, see the Author and Reviewers' website. Comments on this editorial are welcome.

The use of electronic laboratory notebooks should be supported by all concerned.

Too often when errors or cases of fraud occur in science, the lab data required to reconstruct what happened have gone astray. And too often, the co-authors failed to exert due scrutiny on their colleagues' activities in order to prevent such misfortunes. The damage to personal and institutional reputations can be severe and, in rare high-profile cases, public trust can be eroded.

It is therefore in everyone's interest to pre-empt such cases as far as possible. Electronic laboratory notebooks offer a partial solution — and have other advantages too. This is despite the fact that maximizing their benefits will require a change in culture that many researchers will no doubt initially resist.

Nature publishes full methods sections

For most journals, adequate space for methods is taken for granted. Nature now presents a new format for its papers that removes a longstanding shortcoming in this respect. From now on, all Nature papers requiring methods sections will be able to include all the necessary detail.
The full methods are published online only. The printed version contains a summary of up to 300 words, with a reference to the full online version. A key point is that the new online methods sections are not only sufficient for researchers wishing to replicate the work (a longstanding complaint about past Nature papers) but are also integral to the HTML (full text) and online PDF versions of the paper. (For completeness, both online versions also contain the methods summary that appears in the print version.)
One of this week’s (5 April issue) Articles, an exciting paper on targeted fast optical interrogation of neural circuitry, represents the inauguration of this format. If you are thinking of submitting your own work to Nature, you might like to take a look at how these "methods" are displayed in the three versions of the Article: full-text online, PDF online and PDF print. Here is the full-text (HTML) version, in which the full methods run on after the end of the main paper (the paper's references are all together in one list and indexed). Here is the online PDF version, in which the full methods appear at the end of the main paper with their associated references. And if you look at the printed issue: 5 April vol 446, pages 633-669 (2007), you can see that the “full methods” are not there (but readers are directed to the online version).
We are delighted to be able to offer this service to authors. We hope you will be pleased, too.

Nature Methods on sharing of software

"An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols available to readers promptly on request." This excerpt from our guide to authors may seem obvious, but judging from the number of discussions we have had with authors and referees, we would like to clarify one specific point: at Nature Methods, the definition of "materials, data and associated protocols" includes custom-designed software necessary for the method's implementation. Yet there are several ways of making software available, with various degrees of disclosure and in a choice of formats.
The details are provided in this month's Editorial in Nature Methods.


The Nature Methods editors welcome comments on this policy at Methagora, the Nature Methods blog. We also welcome your views on the application of this policy to other Nature journals.