EUtils, Pipes, Ubiquity, preprints and semi-automatic categorisation

I’ve made a few web service-related tools recently that might be of interest to Nascent readers, so here they are collected in one place:

  • PubMed’s EUtils API is powerful and fast, but a bit awkward to use in JavaScript (you have to make two requests, and then parse the XML). This Yahoo! Pipe provides a JSON proxy for the major EUtils commands, making it much easier to search PubMed from a web application (like the one below…). More information.
  • Mozilla Labs has produced an excellent Firefox add-on called Ubiquity, which essentially provides bookmarklets with super-powers. People have been writing all kinds of scripts for searching and manipulating information; this one is for searching PubMed.
  • Stemming from a conversation with Hilary about publishers’ preprint policies, this tool takes some text (in theory the abstract of a paper you’re thinking of posting to Nature Precedings as a preprint) and sends it to JANE, which recommends journals that publish similar content. It then looks up each of those publishers in ROMEO to find out their preprint policies. Publishers that are happy to accept papers already available as preprints (including Nature Publishing Group) show up in green.
  • The last of these tools compares two ways of categorising papers. One is the semi-automated, semi-manual matching of MeSH terms to papers in PubMed (I believe they’re suggested by UMLSKS then manually curated). The other is the similarly auto-manual detection of Wikipedia terms in abstracts by GoPubMed, along with the corresponding categories applied to those papers in Wikipedia. This was inspired by Rattle Research and Chris Sizemore’s work with BBC news articles.
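
The two-request pattern mentioned in the first item can be sketched roughly as follows. This is a minimal illustration in Python rather than JavaScript, and it is not the code behind the Yahoo! Pipe. The base URL and the `esearch.fcgi`/`esummary.fcgi` endpoints are the publicly documented EUtils ones as I understand them; the function names and the sample XML (with made-up IDs) are assumptions for illustration only, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed base URL for the public EUtils endpoints.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=5):
    """Request 1: build an ESearch URL that returns matching PubMed IDs."""
    query = urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    return EUTILS_BASE + "/esearch.fcgi?" + query


def parse_esearch_ids(xml_text):
    """Pull the PubMed IDs out of an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


def esummary_url(pmids):
    """Request 2: build an ESummary URL for the IDs found in request 1."""
    query = urlencode({"db": "pubmed", "id": ",".join(pmids)})
    return EUTILS_BASE + "/esummary.fcgi?" + query


# Hypothetical, trimmed-down ESearch response with made-up IDs.
SAMPLE_ESEARCH_XML = """<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
  </IdList>
</eSearchResult>"""

ids = parse_esearch_ids(SAMPLE_ESEARCH_XML)
summary_url = esummary_url(ids)
```

A JSON proxy hides exactly this round trip: the caller sends one query and gets the summaries back without handling the intermediate ID list or the XML.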

Yahoo! SearchMonkey

Yahoo! SearchMonkey lets anyone create “applications” that enhance the Yahoo! search results of users who have chosen to add them.

Following last.fm’s lead today, I created an example application for Nature Reviews Immunology – if you have a Yahoo account, you can add it to your customisations and then see an enhanced search result. [note: it seems to be intermittently not showing up; hopefully it’ll settle down soon]

It makes use of the article metadata that was recently added in meta tags to Nature.com journal articles (and could equally easily make use of any metadata embedded as RDFa – the information is extracted from the pages using PHP and XPath).
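
As a rough sketch of the extraction step: the application itself uses PHP and XPath, so the Python below is a swapped-in illustration using the standard library’s `html.parser`, not the actual code. The meta tag names (`dc.title`, `dc.creator`) and the sample page are assumptions made up for the example.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # A name can repeat (e.g. several authors), so keep a list.
            self.metadata.setdefault(name, []).append(content)


def extract_metadata(html_text):
    """Return a dict mapping each meta tag name to its list of values."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.metadata


# Hypothetical page; the tag names are assumptions, not taken from nature.com.
SAMPLE_PAGE = """<html><head>
<title>Example article</title>
<meta name="dc.title" content="An example article title">
<meta name="dc.creator" content="A. Author">
<meta name="dc.creator" content="B. Author">
</head><body></body></html>"""

metadata = extract_metadata(SAMPLE_PAGE)
```

Here `metadata["dc.creator"]` holds both author values in document order, which is the kind of structured data an enhanced search result could display.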

30-second screencast on PDFs and metadata

Version 1.5 of Papers, the OS X literature management tool, has lots of new and improved features, but one of the most immediately useful is the ability to parse the DOI string out of a PDF and use it to look up the metadata for that file from PubMed. As the metadata that publishers put in PDF files is generally pitiful and basically unreliable for any practical use, this is a very welcome feature.
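
To give a feel for the DOI-parsing idea, here is a minimal Python sketch. It is not how Papers does it (I don’t know its internals): it assumes the PDF’s text has already been extracted by some other tool, and the regular expression is a common heuristic rather than an official DOI grammar. The sample text and its DOI are made up.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant code, a slash, then a suffix
# running up to the next whitespace, quote or angle bracket.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(text):
    """Return the first DOI-like string in the text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trim punctuation that usually belongs to the sentence, not the DOI.
    return match.group(0).rstrip(".,;)")


# Hypothetical text as it might come out of a PDF's first page.
SAMPLE_TEXT = "Example Journal 8, 1-10 (2008); doi:10.1234/example.5678. Published online"

doi = find_doi(SAMPLE_TEXT)
```

Once a DOI has been found, it can be used as the search term for a metadata lookup, which is far more reliable than trusting whatever is embedded in the PDF.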

Here’s a quick screencast demonstrating this, as well as a few other things:

Download the full resolution QuickTime movie (9.5MB)

This video shows:

  1. The new landing pages for Nature articles, where abstracts and metadata are visible to readers who aren’t signed in, so you can link directly to articles and not worry that visitors will just get the default “you need to buy access to this article” page.
  2. Signing in to nature.com with the Secure Login Firefox extension.
  3. Zotero’s recognition of the article, import of metadata and automatic collection of an HTML snapshot and the associated PDF.
  4. Reading the PDF in Skim.
  5. Importing the PDF by drag-and-drop to Papers (you can also just drag the PDF onto Papers’ dock icon).
  6. Using Papers’ “Match” button to extract the DOI from the PDF, look up the metadata from PubMed and attach it to the PDF in Papers’ library.

The Zotero step isn’t crucial – you could just use Papers and Skim – but I wanted to demonstrate both applications at the same time. Using both leads to a bit of duplication, as you’ll have metadata and PDFs in both Zotero and Papers separately, but there are possibilities for syncing between multiple bibliographic/literature management tools in the future.

This is a big step closer to getting a decent workflow for discovering, storing, reading and annotating scientific literature.

Comparing Google Reader’s plans with Scintilla

According to reports of a video accidentally leaked from inside Google, the Google Reader developers have interesting plans for the future. Scintilla works on a different scale from Google Reader (which is said to store “10 terabytes of raw data from 8 million feeds”) and doesn’t aim for the same niche of general-purpose feed reader. Even so, some of the reported proposals would help aggregation sites like Scintilla, and several are features that we’ve already implemented. Here’s a selection of the most interesting:

Continue reading
