August 31, 2007

My Daughter's DNA, Google Sky, and Reinventing Academic Publishing

Here's a summary of some way-cool things that have passed my way lately, especially following SciFoo.

MyDaughtersDNA.org

Last week's issue of The Economist had a great piece about some of the practical implications of personal genomics, not least for health insurance. It begins with this anecdotal account:

“If you can make a good soufflé, you can sequence DNA.” That assertion sounds preposterous, but Hugh Rienhoff should know. When his daughter was born about three years ago, she suffered from a mysterious disability that stunted her muscle development. After many frustrated visits to specialists, Dr Rienhoff, a clinical geneticist and former venture capitalist, decided to sequence a specific part of her genome himself. He discovered that her condition, which most resembled a rare genetic disorder known as Beals's syndrome, was probably due to a new genetic mutation. “Without a lab and for just a few hundred dollars, you can contract or outsource almost all the steps,” he explains.

What a well-connected and highly motivated scientist in California can do today the rest of the world will be able to do tomorrow. Indeed, a number of firms are already offering tests for specific ailments (or predispositions to ailments) directly to the public, cutting out the medical middle-man. Dr Rienhoff, for his part, will soon launch MyDaughtersDNA.org, a not-for-profit venture intended to help others to unravel the mysteries of their family's genes in the way that he unravelled those of his own.

Hugh was at SciFoo and held a session about this (which I managed to miss, but he told me about it later). I think it's an amazing story, mixing DIY biology with personal genetics and a great human story — a kind of Lorenzo's Oil for the 21st Century. But while you wait for the movie to come out, check out Hugh's site, where he's trying to help parents and patients in similar situations. I think he's doing something truly amazing here and I wish him every success.

Google Sky

Speaking of SciFoo alumni, Carol Christian from the Space Telescope Science Institute, who attended with colleague Alberto Conti, was kind enough to email about their work (which they weren't quite ready to reveal at SciFoo) on Google Earth's new sky features. But Ian, our resident physics geek, has long since beaten me to posting about that.

Reinventing Academic Publishing

Meanwhile elsewhere, Jim Hendler (of Semantic Web fame) has posted some post-SciFoo thoughts on reinventing academic publishing. I strongly agree with one of Jim's main points, which is that (as I would put it) getting "social software" right is 90% "social" and only 10% "software". For example, MediaWiki is great, but Wikipedia took off primarily because of the idea and the community around it, not the quality of the code. Conversely, wikitorials failed because there was no community and because it's a terrible use for a wiki. Jim also makes a great point about scientists' inherent conservatism (which I deal with on a daily basis in our various web projects):

While scientists have gloried in the disruptive effect that the Web is having on publishers and libraries, with many fields strongly pushing open publication models, we are much more resistant to letting it be a disruptive force in the practice of our disciplines.

...

If we don’t think through the social issues of usage, the technologies alone will not have any significant impact, and will go largely unused.

...

One option, and I’d like to see more effort expanded in this area, is that this is a case where innovation needs to come “from the top.” Eventually, as these young scientists become the leaders of our fields they would bring these new technologies with them, but with the world in its current shape, we can’t afford to wait that long. Rather, we need to find ways to bring more senior scientists into contact with the positive side of these technologies.

It's great to read a Semantic Web visionary opining so insightfully about Web 2.0, not least because I (unlike some) see these two approaches as mainly complementary, not competitive.

On that note, Ian (who, when he's not blogging about Google Sky, runs Connotea, our social bookmarking service) has another great post, this time on his Nature Network blog, about some intriguing work that Connotea user Benjamin Good has done to unify folksonomy- and ontology-based approaches. From the info page:

[EntityDescriber] is a mechanism for intersecting the Semantic Web with the normal Web. It lets Connotea users (though we may extend it to other systems such as Del.icio.us) annotate (tag) resources on the Web with terms from existing controlled vocabularies such as MeSH, the Gene Ontology, the Atom ontology, and the Person ontology.

I have no idea how well this will work, but it's a great thing to try.

Less interestingly, but still relevant, I have a piece about Web 2.0 in science in the latest issue of CT Watch. More important, it's part of a special issue on "The Coming Revolution in Scholarly Communications & Cyberinfrastructure" and the other authors are far more knowledgeable and august than me, so do check out their contributions.

New content in Signaling Update and Neuroscience Gateway

UCSD-Nature Signaling Update

Signaling Update is a one-stop online resource designed to keep you in touch with the latest and most exciting research in cell signaling. New content is uploaded every Friday.

New in Signaling Update this week:

Neuroscience: Fear is sent PAKing. The Rac1–Cdk5 signaling pathway regulates the learned response to fear by modulating the activity of the PAK1 serine/threonine kinase.

The text of the accompanying original research article (Nature Neuroscience 10, 1012-1019 (2007)) will be freely available for three months.

Previous Featured Articles:

Lymphoma cells: Hedging against apoptosis. Hedgehog signaling protects lymphoma cells against apoptosis by upregulating Bcl-2.

Tumor suppression: p15Ink4b joins the club. A potent tumor suppressor role for the cyclin dependent kinase inhibitor p15INK4b is uncovered in INK4a-null cells.

Recently-published Molecule Pages: Matriptase, Nkd1 and Rgs9

And much more...

Neuroscience Gateway

The Neuroscience Gateway is a comprehensive source for the latest research, news and events in neuroscience and genomics research, developed collaboratively by the Allen Institute for Brain Science and Nature Publishing Group.

New in Neuroscience Gateway this week:

Bitter memories: An inhibitor of protein kinase M zeta erases taste-association memories.

The text of the accompanying original research article (Science 317, 951-953 (2007)) will be freely available for three months.

Previous Featured Articles:

The Itchy and Scratchy show: Researchers identify a receptor that specifically signals itch sensation.

and much more...

August 29, 2007

BarCamb Cambridge

Barcamp is an ad-hoc gathering born from the desire for people to share and learn in an open environment. The name of the event is an homage to Foo Camp, but unlike its bigger brother, Barcamps are open to anyone who wants to go along. If you go to the Barcamp homepage you'll see a list of upcoming meetings being held across the globe. It's clear that the desire for people to get together to discuss cool ideas is pretty strong.

With all that in mind, last Friday saw a Barcamp take place in my own backyard. Matt Wood from the Sanger Institute organised BarCamb: BarCamp Cambridge. About 30 other people and I gathered on Friday morning at the Sanger Institute. With coffee and cookies on hand, we gathered around the whiteboard for the Foo format of scrabbling to fill in slots on the whiteboard with talk ideas. Two rooms were on hand, but in the end there was just enough time in the day to fit all of the sessions into one of the rooms, so that everyone could hear all of the discussions. I took notes on most of the discussions, but I might have missed one or two, so omission here is no indicator of lack of quality. If you want to get the full skinny on the meeting, I'd advise going along to the next one you can make it to. Anyway, a few days on, here is what I can reconstruct from the very rewarding day:

Matt talks about microformats and science

The rough theme of the meeting was tools for science, though there was a nice diversity in the topics presented. Matt opened with a discussion of the semantic web for science. The gist of his argument is that there are two types of semantic web. There is the Semantic Web with capitals, which comes with all of the specifications in place, full RDF and support for all of the machinery that goes with it. For the short to medium term he identified two significant problems with this. The first is that most researchers don't have the inclination to learn all of the machinery needed to output data in this format; it is miraculous enough to get them to work with well-formatted HTML, but more on that later. The second is that getting funding in science to do Semantic Web projects is hard. Funding bodies at the moment, outside of computer science, just don't want to go there. His solution is to use the lowercase semantic web. This means adding minimal amounts of micro-formatting to HTML documents, and creating a marketplace of markup. If your system is good, it could gain de facto acceptance in a me-too way. Put it out there because it is easy to put it out there, and if it is good it will be used. (Bioformats.org from Matt is an attempt to do just that with microformats for biology.) Standardisation can come later, or not. In the Q&A a danger in this approach was pointed out: the domain experts may lose control of the translation of the de facto standard into a standard ontology if, when that process happens, they leave it to the computer science people. This issue was returned to later in the day.
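To make the "lowercase" idea concrete, here is a minimal sketch of what consuming such markup might look like. The class names (`bio-gene`, `bio-species`) and the sample sentence are invented for illustration and are not taken from Bioformats.org or any published microformat; the sketch also assumes annotated elements contain only text, with no nested tags.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names are made up for this example and are
# NOT part of Bioformats.org or any real microformat specification.
SNIPPET = """
<p>We knocked out <span class="bio-gene">Cdk5</span> in
<span class="bio-species">Mus musculus</span> and measured
<span class="bio-gene">PAK1</span> activity.</p>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text of elements that carry one of the wanted class names.

    Simplifying assumption: an annotated element holds plain text only.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = []        # list of (class_name, text) tuples
        self._current = None   # class of the annotated element we are inside
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            return  # nested tags inside an annotation are ignored
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self._current = name
                self._buffer = []
                break

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current is not None:
            self.found.append((self._current, "".join(self._buffer).strip()))
            self._current = None


def extract_annotations(html, wanted):
    """Return (class_name, text) pairs for every annotated element in html."""
    parser = ClassTextExtractor(wanted)
    parser.feed(html)
    return parser.found


if __name__ == "__main__":
    for class_name, text in extract_annotations(SNIPPET, ["bio-gene", "bio-species"]):
        print(class_name, "->", text)
```

The point of the sketch is how little machinery is involved: the author adds a class attribute, and anyone with a standard HTML parser can pull the annotated terms back out.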

Laura James, AlertMe and the network of things

Laura James from AlertMe.com spoke next. The company she is working for is creating a consumer product which attempts to open up the internet of things. By providing a set of relatively cheap sensor arrays connected to a hub that talks to a server, it lets the owner set up whatever behaviour they can think of for the network. Sensors include motion sensors, accelerometers, light detectors and sensors that can switch. The technology is based on a very low-powered wireless network protocol called Zigbee. The hub runs on Linux with Python on top of it. The basic chip is small. The first app they are trying to sell is a home security application. There was some interesting discussion about how one could match a component in the system to its abstract representation in the web interface. The question was "how can I tell which one is the hallway monitor?" to which someone replied, "it's the one in the hallway". "Ahh", said the questioner, "you have obviously not lived in a house with children who like to move things around". Luckily for such bedevilled patrons, one will be able to write on the sensors. She said that they are interested in hearing novel ideas for how this kit could be used. Someone suggested that it could possibly be introduced to the bench to help with automating tasks, or with automatic data capture, enabling the open-notebook approach to science.

Simon Ford and bootstrapping hardware hacking

Simon Ford then talked about Rapid Prototyping with Microcontrollers. I've never done any embedded programming, but Simon has created a platform on an embedded device that, when plugged into a computer, appears on the computer as a flash drive. Pulling executable programs onto this device allows them to run. He is also working on a web interface to the compiler for the processor. Right there in front of us he made the LEDs on the device blink. This is the "hello world" of embedded programming and it usually takes three days to get working. He made a very salient point: reducing the chain of complexity in a process increases the confidence in the result. When you go through a lot of steps and finally do get a result (such as a flashing light on a chip), you have an intrinsic skepticism. Every extra step is a potential barrier. By reducing the "hello world" of hardware from a three-day slog to two minutes you create a system that people can have a lot more confidence in. He wants to lower the barriers so that software people can extend their ideas to hardware, and hardware people can bridge the gap back to software. With Simon's lightweight system, developing programs for embedded systems could become something that can be taken into schools or other environments, allowing people with little experience of embedded programming to begin to explore this space.

Alex Griekspoor and document workflow

Alex Griekspoor gave a presentation on Papers.app, a program that he created with Tom Groothuis. It is designed to be the iTunes for PDF paper management and runs on a Mac. What drove Alex and Tom to do this is their belief in the place for the dedicated desktop program in the scientific workflow. I'd seen Alex talk about this before at Nature, so my notes from the BarCamp meeting are a bit scarce; however, they have won three Apple Design Awards for their work.

Michael Dales and Quentin Stafford-Fraser, why two, no four, no eight screens are better than one

Michael Dales and Quentin Stafford-Fraser presented the work they are doing with Ndiyo. The Guardian has a nice article about what this non-profit company is doing. In a nutshell: multiple monitors, one computer. With Linux as the OS, the idea is to be able to provide multi-seat internet cafes in a box to the developing world. To achieve this they use an on-chip video compressor that can then send the signal across either USB or ethernet. They demoed a working version of their system. It was very impressive. The benefits are legion, not least of which is that this solution can provide up to a factor of 20 power saving over the traditional model of having a PC for every screen.

Jeff Viet and open source CMS

Jeff Viet demoed Drupal as a content management system. During his presentation he created a new content type within Drupal. After experimenting with lots of CMS systems his conclusion was that "OS CMS systems beat the crap out of the free ones for what you get for your money, including support. If you have a budget then you can get in touch with the authors of the OS systems easily"

Peter Corbett and conversations with computers

Peter Corbett talked about teaching computers to understand text. He described himself as someone who had a desk at the computer lab and at the chemistry lab. Now he works on computational linguistic chemistry with the aim of auto-detecting language in chemistry papers to try to recognize chemicals and then auto-markup these papers. The idea is to supplement the mark-up from publishers. His system can also draw the chemical and annotations and overlay them on the paper. Some of the problems they encounter are that papers can contain new names, compact names, or names with extra hyphens; his program can deal with these kinds of things to a certain extent. You can go from plain text to something like a connection layout using an information-rich markup. The RSC is using this software along with human clean-up to create markup of chemistry papers. The hope is that you can then do semantic search over papers. One of the gems from his talk was a small natural language processing trick. Imagine we were interested in opiates. We could just ask Google for "opiates", but if you take into account the structure of language and search for phrases like "opiates such as", you will get a much better result. There are many patterns like this, and I think he said that they are known as Hearst patterns, though I may have misheard this. He did a pass over abstracts on PubMed for these kinds of patterns to make a network of relationships. It turns out that you can do reasoning on structure as well as processes using this analysis. A few bits of wisdom from his work: most of the information has come from biochemists rather than chemists; more biologists are into open science and open databases. Chemistry has been mostly captured by commercial interests, and it is hard to get free chemistry data. It is important to define what you are looking for so that you can evaluate how well the software has done, and it's important to remember that in a lot of text there is a difference between what you think the world looks like and how it is described in the literature.
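The "such as" trick is easy to try for yourself. What follows is only a rough sketch of the general idea, not Peter's system: it handles a single pattern ("X such as A, B and C"), assumes every category and item is one word, and uses sentences I made up rather than real PubMed abstracts. The function name `such_as_pairs` is my own.

```python
import re

# One word made of letters, digits and hyphens (e.g. "PAK1", "beta-blockers").
ITEM = r"[A-Za-z][A-Za-z0-9-]*"

# "<category> such as <item>, <item> and <item>"; single-word items only.
SUCH_AS = re.compile(
    rf"\b(?P<category>{ITEM}) such as "
    rf"(?P<items>{ITEM}(?:, (?!(?:and|or)\b){ITEM})*(?:,? (?:and|or) {ITEM})?)"
)

# Separators between items: ", ", " and ", " or ", ", and ", ", or ".
SEPARATOR = re.compile(r",? (?:and|or) |, ")


def such_as_pairs(text):
    """Return (category, member) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        category = match.group("category").lower()
        for item in SEPARATOR.split(match.group("items")):
            pairs.append((category, item))
    return pairs


if __name__ == "__main__":
    # Invented example sentences, not taken from any real abstract.
    sample = (
        "Opiates such as morphine, codeine and heroin bind to the same receptors. "
        "Kinases such as PAK1 and Cdk5 were also examined."
    )
    for category, member in such_as_pairs(sample):
        print(f"{member} is a kind of {category}")
```

Run over a large collection of abstracts, pairs like these become the edges of the network of relationships described above. A real system would need many more patterns and proper handling of multi-word names.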

lunch mmmmm

Over lunch I chatted with James Smith, who is head of the internet team for Ensembl. One of the issues that they are facing is the cost of storing and retrieving very large data sets. This is a growing issue in science in general, so I was pretty interested in hearing how they currently deal with it. At the moment the machines that sequence the DNA are so fast that it is almost getting to the point that retrieving stored data from a sequenced gene is slower than re-sequencing the gene. Almost but not quite.

James Smith and a large ensemble of data

He gave a talk about Ensembl, which grew out of the human genome project about 8 years ago to prevent the commercialization of genomic data. The idea was to have an open source human genome. The Ensembl project takes the raw data from the genes and adds other data to this, such as reference data from other experiments. Now the project has two main products: the Ensembl code and the data produced by the project. They have data on 41 genomes, and the code is also used in contexts apart from this project. There are probably about 100 installed copies worldwide. It's 1.5 million lines of Perl code and there is a public MySQL interface to the data. There is also an archive system to see old data, and everything is in CVS. They have 35 species in Ensembl; human, mouse, rat and zebrafish were amongst the first to be sequenced, and then there are random mammals, for example hedgehogs, which to my mind is a good example of a random mammal. We learned that the claw of the platypus is poisonous. He described the data load they are running (very high, one of the biggest query sets in the world) and the hardware that is running it (very big, and lots of it). He said that they are moving over to AJAX because people don't realize that items in the interface are buttons or forms. In spite of data-mining activities, most of the interactions with the site are human interactions.

James Graham and more HTML

James Graham gave a tour of HTML5. He is just a member of the mailing list, but stressed that this list is still open, and anyone who has an opinion on the way it should work is still free to sign up. For me the most interesting nugget to come out of this talk is that HTML4 is underspecified. No one knows how to parse it, because there is no specification for parsing it. All of the voodoo of parsing it is tied up inside current browser technology, and this is why there are so many problems with cross-browser compatibility. The main advantage that HTML5 should give us is the ability to know how to parse HTML5 and dirty HTML4. Another reason why this effort is important is that there is a lot of information tied up in HTML that is never going to make it to XML, and if we don't want to lose access to this information in the future we need the ability to abstract the understanding of these documents away from specific browsers. Of interest to publishers is that a "caption" tag is being defined that can associate a description of an image with that image. There are a number of new elements, and the decision on which elements to include was based on a search for the most popular class names in HTML files in the Google codex.

Ian Mulvany and building on a social tool for science

I think it was at this point that I gave my talk. I've uploaded the slides to Slideshare, and you can see the raw notes that I took during some of the talks on my blog. I've also tagged a few pictures on Flickr with barcamb.

Tom Morris and why thinking doesn't need to be difficult

The last talk of the day was by Tom Morris, a shotgun ride through the semantic web for hackers. Although this talk was at the very end of the day it was excellent and capped a great day. It went by pretty quickly so I only caught a fraction of the wisdom that Tom was delivering. He started by asking what's cool about microformats? I'm not sure we got to an answer to that question, but it was clear that Tom had a small gripe with the current process that the microformats community has taken for the adoption of new microformats. They ask what problem does the microformat solve? Tom suggested that perhaps there is no problem at all. He asked us what problem did blogging solve? Or Twitter, for Christ's sake? (That got an appreciable laugh in the audience, which I think I twittered.) His point is that often no one knows what use something will have until it is popular. For example, Yahoo Pipes is not practical yet, it is a user experience nightmare and it doesn't have a clearly defined purpose, but it is useful because there is a lot of data out there via RSS and it gives us room to play. The microformats route is not compatible with this. We should just be putting up data and letting people play. If it doesn't get used then Darwin will clear up the mess. The amount of interesting data is greater than the possible number of microformats. At this point he went through an example of creating your own microformat and providing a method via GRDDL of allowing anyone to use your data. If your data maps to an RDF schema then you can use it today, and if not, but it is based on URIs, you can make your own schema. For a free market everyone needs to take part, and the semantic web is only scary if you make it scary. There is an active community of people willing to help at GetSemantic.com.

And that was it, and the day was over. It was a lot of fun, and a lot of interesting stuff was presented. My apologies for anything I've missed or misrepresented.

August 27, 2007

Microsoft and STM Publishers Meet to Discuss DOCX / Word 2007

Since my post to Nascent in June on the best way for publishers to deal with Word 2007 within their publishing ecosystems, I have been involved in the ongoing conversation on two fronts:

  • Tracking the conversation.
    See my tagged Connotea links
  • Participating by hosting a meeting on 25 July 2007 at the NPG office in New York between staff from Microsoft (Jennifer Michelstein, Lee Dirks, Murray Sargent), AIP (Tim Ingoldsby), AGU (Carter Glass), AAAS/Science (Brooks Hanson), Inera (Bruce Rosenblum), Aries (Ben Peterson and Lyndon Holmes), and Nature (Howard Ratner and Chris Flammang).

The meeting agenda was quite simple: discuss a typical journal publisher’s workflow from authoring through to publication and archiving. This was not specific to any of the publishers present, but rather a high-level overview of the various stages involved, the types of software systems that have been built to aid in these workflows, and the standards that are important to this community. The four publishers in the room did an excellent job of relaying this to the Microsoft staff. This was then followed by presentations from Inera and Aries detailing the problems Word 2007 was causing for editing automation tools (eXtyles) and manuscript tracking systems (Editorial Manager).

The open and fruitful conversation quickly turned toward how Microsoft, third-party vendors and publishers can work together moving forward to make Word 2007 work within the STM publishing ecosystem. Here were some of the outcomes:

  • Microsoft will establish a page on one of its websites with more advanced details on how to best use Word 2007 in a publishing environment. (For example, an image of an equation created when saving a Word 2007 file to Word 2003 actually carries important semantic information that can be reused when the file is reopened in Word 2007. Microsoft refers to this as dehydrating and rehydrating.)
  • Microsoft will consider adding text to the Word 2007 help file, especially about its Math Markup Language support.
  • Microsoft will make real efforts to educate publishers by presenting more often at publisher events.
  • Microsoft will consider supporting STIX fonts when they are available. I just heard from Tim Ingoldsby of AIP, “Microsoft seems to be trying to cooperate with the STIX Fonts project. They have supplied us with a Font Edit program that will let us get the STIX Fonts ready for Word 2007 (so they can be used in place of Cambria Math).”

If you can’t wait for Microsoft’s information site to go live, have a look at Bruce Rosenblum's excellent summary of the current state of play.

It is also worth reviewing the press release from Design Science, producers of MathType, about Word 2007, equations, and scientific journal submissions. Also be sure to read their toolbar tip which gives instructions on how to add a button within the Word 2007 quick-access toolbar for the legacy equation editor.

More information is sure to be published on this topic in the next few months as vendors ramp up to better accept the new DOCX format. Be sure to check out my Connotea bookmarks on this topic.

August 22, 2007

Sky Gazing

Google have just released a new version of Google Earth with Sky gazing abilities. I use the term Sky gazing rather than star gazing as some of the most impressive aspects are the ability to zoom into high resolution images of nebulae, galaxies and other extended astrophysical objects. I've just been playing around with this interface for the past half an hour or so and it is, as we have come to expect, gorgeous.

I think this is going to have a big impact on the participation of the general public with science. Google Earth has already made a significant impact as described in an earlier post.

Of all the areas of science, astronomy has some of the best levels of amateur involvement. After all, the data set is just lying around above our heads. Some notable examples include the group of observers who liked to keep track of spy satellites and the amazing Rev. Robert Evans whose work on detecting supernovas provided some of the best data in that field until cosmological studies industrialized the acquisition of supernova data.

Data from an astronomical experiment is usually for a specific purpose or for a specific object in the field of view, but there is often a large amount of extra data that you get for free and that might be of use to other researchers. After the initial experiment is tested the remaining data is usually opened up to the community. Virtual observatories represent a way to share this information. You have a big data server that handles requests based on wavelength, survey and position and you get the raw data. You can try for yourself here, here, or read about ongoing work here, but I think it is fair to say that these efforts are tailored to the professional astronomer and not specifically to the huge number of amateurs. Now who do we know that is good at managing the delivery of large data sets in a seamless way to large numbers of people?

I could see Google Earth providing the ability for amateurs to share their own CCD observations of the sky via KML files, and perhaps providing an easy way for the big observatories to get groundbreaking results out to the public in a way that gives context (you can see where the thing is) and an immediate impact via the images that are created. How long before we get multi-wavelength overlays?
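As a rough idea of what sharing an observation might involve, here is a minimal sketch that writes a single sky placemark as KML. It rests on two assumptions drawn from my reading of Google's KML documentation for Sky: that the root element carries a `hint="target=sky"` attribute, and that longitude is right ascension in degrees minus 180 while latitude is declination. The function names and the sample coordinates are my own and purely illustrative.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def ra_hours_to_sky_longitude(ra_hours):
    """Convert right ascension in hours to a longitude for Sky mode.

    Assumed convention: longitude = RA in degrees - 180, where one hour
    of right ascension is 15 degrees.
    """
    return ra_hours * 15.0 - 180.0


def sky_placemark_kml(name, ra_hours, dec_degrees, description=""):
    """Build a KML document string holding one placemark on the sky."""
    ET.register_namespace("", KML_NS)  # serialize KML tags without a prefix
    kml = ET.Element(f"{{{KML_NS}}}kml", {"hint": "target=sky"})
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    longitude = ra_hours_to_sky_longitude(ra_hours)
    coordinates = f"{longitude:.4f},{dec_degrees:.4f},0"
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coordinates
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Made-up observation at RA 12.5 h, Dec +62 degrees.
    print(sky_placemark_kml("My CCD frame", 12.5, 62.0, "Illustrative example only"))
```

Attaching an actual image would take more than a placemark, but even this much is enough to say "I looked here" in a form that others can open.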

I'm tingling with excitement.

Here is my top tip: do a search in Google Sky for "Hubble Deep Field", now zoom. Zoom again. No, keep zooming. Eventually you will see a lot of blurry little dots. Each of these is a galaxy, far, far away.

Sci-Foo lives on, Second Life Sessions

I didn't make it to Sci-Foo, but I eagerly lurked, and pored over the veritable explosion of blog postings that erupted onto the web, essentially from the moment that the meeting began. When Jean-Claude Bradley announced on Nature Network the idea of follow-on sessions in Second Life I was pretty excited. So on Monday afternoon I brushed off my Avatar and put on the de rigueur Second Nature freebie T-shirt.

This was my first time taking part in a meeting in Second Life, and I struggled a little with the slide presentation. I believe that there was a plan to have the meeting use the voice aspect of Second Life, but some people didn't have headphones and others who had were unable to shout out past the firewall that keeps us all safe. The result was that if you could type fast you were able to get more information across (a good motivation to pick up the typing tutor app that has sat unused for about two years on my computer). My colleague Hilary Spencer was a little late for her presentation. She told me later that the reason was that when she logged into Second Life she discovered that her Avatar was wearing a corset and lace-up jeans, and needed to make a quick change before a professional meeting. Mind you, JC was a furry cat, so it is clear that this is one area where content is far more important than appearance.

It was very nice to get the feeling that people were coming together from many different places, and Bertalan Meskó has a very nice report on the meeting. One of the advantages of being type only is that Jean-Claude has been able to post a full transcript of the session on his blog.

Mostly the meeting revolved around a Q&A session about a number of tools for scientists. There was a short moment at the beginning of the meeting where it looked like there might be a discussion on the definitions of Open Science, and this has seeded into a number of other discussions.
Richard Ackerman makes some interesting points in his post open science and the impact factory where he proposes that people interested in coming to a definition should use the nodal point wiki.
These thoughts are mirrored by Bill Hooker here.

At the end there was a quick vote on what the next session should be about and we settled on
Medicine and Web2.0 Aug 27 at 16:00 GMT with a continuation of Definitions for Open Science Sept 4 16:00 GMT.

August 21, 2007

Amazon: A New Kind of Publisher

While most of the attention and ire of the publishing industry seems to be trained on Google these days, the most clueful colleagues I speak with appear unanimous in the view that the biggest threat to their livelihoods is actually Amazon. I think they're right, as this recent announcement shows. It may just prove to be the publishing news of the decade.

To see why, consider first that the publishing industry — from agents and publishers to wholesalers and retailers — essentially exists to connect people who have interesting content to those who are willing to pay for it. In general the system works reasonably well, but in the age of the web all those layers look mighty inefficient and ripe for disintermediation. Furthermore, as Clay Shirky and Chris Anderson have pointed out, infinite online shelf space and smart computer algorithms are turning traditional publishing approaches on their heads: "filter then publish" has become "publish then filter".

This new "post-filter" paradigm (as Chris puts it) is potentially a much more efficient and effective way of carrying out the matchmaking role currently fulfilled by the publishing industry (among others). The snag is that to make it work you need a large set of data about individual and collective tastes (so that, based on analysis of collective behaviour, you can accurately predict which individuals will be interested in a particular item). Google achieved this in search by studying the link patterns of the web; Apple achieved it in music by getting everyone to load their CDs into iTunes, and buy new tracks from the iTunes Store.
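To see why the data set matters so much, here is a toy sketch of the simplest kind of post-filter: recommending titles that were bought alongside the ones a reader already owns. This is a generic co-purchase count, not a description of how Amazon, Google or Apple actually do it, and the readers, titles and function names are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented purchase histories, purely for illustration.
PURCHASES = {
    "ann": {"Cosmos", "The Selfish Gene", "Chaos"},
    "bob": {"Cosmos", "The Selfish Gene"},
    "cat": {"Cosmos", "Chaos"},
    "dan": {"The Selfish Gene"},
}


def co_purchase_counts(purchases):
    """Count how often each ordered pair of titles shares a basket."""
    counts = Counter()
    for basket in purchases.values():
        for first, second in combinations(sorted(basket), 2):
            counts[(first, second)] += 1
            counts[(second, first)] += 1
    return counts


def recommend(reader, purchases, top_n=3):
    """Suggest titles most often bought alongside what the reader owns."""
    counts = co_purchase_counts(purchases)
    owned = purchases[reader]
    scores = Counter()
    for (have, other), times in counts.items():
        if have in owned and other not in owned:
            scores[other] += times
    return [title for title, _ in scores.most_common(top_n)]


if __name__ == "__main__":
    print(recommend("dan", PURCHASES))
```

With four readers the suggestions are barely better than guesswork. The predictions only become accurate when the purchase histories number in the millions, which is exactly the asset in question.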

For books, of course, Amazon is the owner of that precious data set. They know more than any other organisation about my reading habits — heck, they probably know more than I do about my reading habits. In contrast, Waterstones knows nothing at all about my preferences even though I must have bought at least as many books there over the years as I have at Amazon. The other players in the current publishing chain know even less.

Which puts them all in a frighteningly fragile position. So far Amazon have employed their user data mainly in improving search results and delivering personalized recommendations, all with a view simply to selling more stuff delivered through traditional channels. But in principle there's no reason why they shouldn't take a leaf out of Lulu's book by accepting content direct from authors, promoting it direct to users who are likely to have an interest, fulfilling any orders using print-on-demand, and sending a cut of the revenue back to the original authors. Fast, efficient — and not a traditional intermediary in sight.

Which brings us back to that announcement:

CreateSpace, part of the Amazon.com, Inc. group of companies (NASDAQ: AMZN), today announced the launch of a new online Books on Demand service. Also announced today, the company is no longer charging setup fees for books, audio CDs and DVDs. Authors, filmmakers and musicians can now offer their works to millions of customers on Amazon.com, CreateSpace.com and via their own free customizable eStore without any inventory, setup fees or minimum orders.

<snip!>

CreateSpace books sold on Amazon.com are printed on demand, display "in stock" availability on Amazon.com and can be shipped within 24 hours from when they are ordered. The books are automatically eligible for Search Inside!™, Amazon Prime™, Super Saver Shipping™ and other Amazon.com programs as well.

So there you have it: Amazon becomes the ultimate clearing house for books of all kinds (and much else besides), with none of the traditional middlemen getting a look in. Genius. If you're an agent, publisher, wholesaler or retailer of books and you haven't just soiled your undies then you don't understand what's going on.

August 18, 2007

Groups on Scintilla

Perhaps you already use Scintilla to keep track of papers, news items and blog posts that discuss your particular area of interest. Perhaps you've never heard of Scintilla before (in which case you should give it a go).

Maybe, though, you tried it once but couldn't work out why it'd be worth visiting regularly when you've already got PubMed email alerts, an RSS reader and Google Blogsearch working just the way you want them to?

If that's the case then (a) congratulations, you're more web savvy than the vast majority of other scientists (or maybe just better organized) and (b) you're still missing out.

Social features have been an important part of Scintilla since the project began. We can and do work on algorithms to present you with information that we think you need or might like but right now the best sources of recommendations are people who know you - your contacts on the site.

Once you join groups and build up a network on Scintilla you start to get much more out of it. It's still a little bit harder than it should be - we're aware of that, but it's one of the drawbacks of the release early, release often approach we took. Over time you'll see better integration with places like Nature Network and Connotea.

At the moment most of the groups on Scintilla are communities of interest - Open Science and Best of the Blogs are two popular groups to which members recommend relevant blog posts, articles and papers.

Alternatively check out some pretty pictures collected from blog posts - the one above is of Mira's recently discovered 'tail' - or the many ways that we're all going to die.

Do have a browse around. It's more work orientated than Facebook and less boring than an eTOC (ah, excepting Nature's, of course).

August 17, 2007

BASE - Bielefeld Academic Search Engine

Here's a new scientific search engine that lets you choose to search only freely available content: http://base.ub.uni-bielefeld.de/index_english.html. It mostly indexes material published using OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), but it includes some standard websites too. There's a list of the content providers -- which includes a number of publishers.
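
For anyone who hasn't met OAI-PMH before, a harvesting request is just an HTTP GET with a handful of query parameters. Here's a minimal sketch of building one in Python. The repository address is a made-up placeholder (not BASE's endpoint or any real provider's), and the function name is my own invention for illustration.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the repository's OAI-PMH endpoint (hypothetical here).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # only records changed since this date
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, for illustration only.
url = build_list_records_url("http://repository.example.org/oai",
                             from_date="2007-08-01")
print(url)
```

The response to such a request is an XML document of records, which a harvester would then parse and index.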

Science in science blogs


Should science bloggers distinguish between posts that cover peer reviewed research and those that focus on more light hearted matters like quizzes, photos of other bloggers down the pub and science in the mainstream media?

Dave Munger at Cognitive Daily thinks so. Last week he suggested creating an icon that bloggers could add to 'serious' posts that would help identify them as being academic in nature. He posed some interesting follow-up questions: how would you control usage? What is the definition of 'serious' in this context?

The idea was well received and now there's a web site at BPR3.org where Dave and others are soliciting comments, ironing out the details and deciding exactly how the scheme is going to work.

The idea so far in bulletpoints (from BPR3.org):

  • The BPR3 icon will represent, most importantly, a blog post that thoughtfully discusses peer-reviewed research.
  • All research should be formally cited according to the requirements of the discipline within which it falls, and linked when possible.
  • The post should make it clear when it is discussing research or ideas that are not peer reviewed.
  • The poster should have carefully read all research cited.
  • The icon should link back to the BPR3.org site in the manner we specify (this will depend on the method we choose for aggregating posts).

I think this is an excellent idea (in principle, anyway) and something that publishers should support. Blog trackbacks are a good complement to user comments on papers, but it's hard to tell (in an automated way) if a post is discussing a paper in detail or just mentioning it in passing - by checking to see if trackbacked pages contain the icon we could filter out irrelevant posts.

Distinguishing research from everyday blogging was one of the ideas behind Postgenomic. By adding some code to your html when linking to a paper - "rev='review'", to be precise - you can tell Postgenomic that your post is reviewing that paper. This never really took off, partly because... well, it's a bit boring to add markup to your posts by hand and see no immediate return. An icon could, at least, grab people's imagination, though you'd want machine readable metadata embedded somewhere appropriate too...
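
To make the markup idea concrete, here's a rough sketch of how an aggregator might pick out links flagged with rev="review". This is not Postgenomic's actual code, just an illustration using Python's standard html.parser module; the class name, function name and sample URLs are all made up.

```python
from html.parser import HTMLParser


class ReviewLinkParser(HTMLParser):
    """Collect the href of every <a> whose rev attribute includes 'review'."""

    def __init__(self):
        super().__init__()
        self.review_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rev can hold several space-separated values; a bare attribute is None
        rev_values = (attr_map.get("rev") or "").lower().split()
        if "review" in rev_values and attr_map.get("href"):
            self.review_links.append(attr_map["href"])


def find_review_links(html):
    """Return the URLs a blog post claims to be reviewing."""
    parser = ReviewLinkParser()
    parser.feed(html)
    parser.close()
    return parser.review_links


# Invented sample post: one flagged link and one passing mention.
sample = (
    '<p>A close look at <a rev="review" '
    'href="http://dx.doi.org/10.1000/example">this paper</a>, plus a passing '
    '<a href="http://example.org/other">mention</a>.</p>'
)
print(find_review_links(sample))
```

Only the flagged link comes back, which is the distinction between "reviewing" and "mentioning" that the markup is meant to carry.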

Events on Second Nature (2)

Correction to my previous post: Phil will now be coming into Second Nature to talk about his work restoring ancient DNA on Thursday 13th September, a day later than I previously said. Thanks to Andy for pointing out that Wednesday isn't generally a good day for talks because of Second Life maintenance.

Further details, including confirmation of time and SLurl, in a couple of weeks.

August 15, 2007

SciFoo: The Podcast

Just in case you haven't already ODed on SciFoo blog coverage — or perhaps you have but a touch of audio will come as welcome relief — the 16 August episode of the Nature Podcast has a great segment on the event narrated by my colleague, Adam Rutherford. Listen here. (If you're impatient then fast-forward to 21 minutes in, but don't miss the celebrity endorsement at 1:29.)

As usual, the rest of the show is well worth a listen too. And if you like what you hear, subscribe to the Nature Podcast RSS feed.

AND as an exclusive bonus feature for Nascent readers, here's the whole of Adam's interview with Tim O'Reilly. Lots here about what Foo Camp is, how it came about, and why we do it.

Kudos to Tim for being one of the few Americans to be able to pronounce my name. I'll correct him on only one point: the original concept for SciFoo came out of a discussion not between him and me, but between him and Linda Stone. I merely agreed with them that it was a great idea... then went without sleep for a couple of months while I emailed people to urge them to attend.

August 14, 2007

ETech Call For Participation

My favourite conference of the year, ETech (OK, perhaps I'm excluding certain invitation-only 'unconferences') has opened its CFP. Wey hey! And there's more: Due to what I can only assume was a rare administrative mix-up at O'Reilly, I've also found myself on this year's program committee. w00t!

ETech is one of only a couple of events that I make sure to attend every year. There's no better way to find out what's over the technological horizon, and to hang out with the people who are inventing the future. I'm particularly delighted that it looks as if ETech 2008 will have more science-related stuff than ever before, from energy and materials science to neuroscience and synthetic biology. So if you're doing mindbendingly cool stuff in these or other areas (or you know someone who is) then please tell us about it by 17 September. If you want to make an informal enquiry then you can email me: t.dot.hannay.at.nature.dot.com.

ETech program chair, Brady Forrest, has more details.

August 13, 2007

Events on Second Nature

On a different note, I'm also delighted to announce that the first event in the Second Nature Lecture series has been scheduled!

This is going to be a series of events held on Second Nature, where scientists of all types will come along and talk about their work to anyone who wants to listen. We haven't hosted an event in Second Life before, but I have attended several which have been really interesting - short informal talks followed by a discussion seem to be perfect for Second Life, where the audience is perhaps less inhibited than in real world meetings, so discussion flows more freely.

Talks will start in September, and will run roughly fortnightly - we have a number of speakers already lined up, but if you're a scientist and fancy giving it a go, by all means let me know!

More details nearer the time, but keep September 12th, 6pm GMT free for a discussion with Phil Holliger from the Medical Research Council Molecular Biology Lab in Cambridge. Phil works with ancient DNA: DNA samples retrieved from specimens of forensic, paleontological or archaeological interest. DNA naturally degrades over time, making it very difficult to amplify and analyse, and Phil will be talking about a new way to more accurately repair the damage, which he recently tried out on a 60,000 year-old cave bear.

To whet your appetite, Phil will be followed in the autumn by a host of other speakers, covering topics including a new affordable drugs initiative, scientific patents, the foraging strategies of cormorants and how flooding cut Britain off from the continent and turned it into an island. All the events will be free, open to anyone, and no specialist knowledge is required, so if you don't have a Second Life account, now's the perfect time to get one. Go to Second Life to get started, and by all means drop me a line if you need any help - it can be a bit daunting at first, but once you get over the initial hurdle, all suddenly becomes much clearer.

Welcome to new Second Nature residents

Shamefully, nearly a whole summer has passed since I last reported on Second Nature, and far too much has happened in that time to cover everything, but I do have some really exciting new developments to report. Firstly, I have started a Second Life blog on Nature Network, to talk about all things SL and Second Nature without distracting from the "real" world elsewhere.

The biggest event this summer has been the acquisition of our third island, which has been entirely given over to setting up an SLEcosystem, created by the Ecosystem Working Group. In their own words, the EWG is "a small group of dedicated scripters and builders committed to the notion of developing a simple, but functioning, ecosystem within the confines of the Second Life platform. We are attempting to develop common protocols for animal-animal interaction and believe our experiments here can enrich Second Life and the scientific understanding of complex systems themselves".

The SLEcosystem has been living at its original home, Terminus, since its inception, and has already attracted a lot of interest in Second Life and beyond. Terminus lives on: the version living on Second Nature is independent, and features high prim organisms which (in some cases!) look not dissimilar to real-world creatures. The whole project is ongoing and fully open sourced, so anyone is welcome to contribute to it: our ecosystem is continually being updated as members of the group create new organisms or update the existing ones. Species on Second Nature currently include the basic plantform, the Cannon Plant, the herbivorous gridlice and gridbirds and the predator, a fluorescent blue Flying Fox. The chief guardian of our ecosystem is Luciftias Neurocam in SL, a neuroscientist from Drexel, and he will be coming to Second Nature to give a talk and demo of the sim in the next few weeks: details to follow. In the meantime, to see the ecosystem and get more information, come and see it. Tip: watch the large green blobs - if you're lucky, you might see a gridlouse hatch!

The other major new inhabitants of our land this summer are CASA, the Centre for Advanced Spatial Analysis from UCL. They have colonised a large region of the sky above Second Nature and are using it for experimenting with city modelling - a vast array of 3D graphics, maps and other things have magically appeared: see more about what they're doing on their Digital Urban blog.

Last but not least, for SciFoo attendees, Jean-Claude Bradley has set up an area on Second Nature called "SciFoo Lives On", where attendees can put any slides and other materials and where SciFoo can be continued with virtual sessions: see his blog for more details.

August 10, 2007

Gateway updates: Killing the messenger and other stories

Amongst the fresh additions to the Signaling Gateway and Neuroscience Gateway this week:

Killing the messenger: a neuroprotection mechanism induced during multiple sclerosis is attacked by the body's own immune system

Put a cork in it: a new technique for tightly controlling the expression of toxic proteins using a Lac-Tet-RNAi system.

Lung cancer: mutations that provoke aggressive, metastatic non-small cell lung cancer.

Molecule of the Week: Ntcp, the bile acid co-transporting protein.

Plus much more ...

August 08, 2007

Science Foo Camp 2007


Everyone else is writing about it, so it's probably about time that I did too.

SciFoo '07 was wonderfully intense, mind-expanding and surreal. Organisationally, it was a bit less stressful than last year's inaugural event (at least for me), mainly because we knew it was going to work to some degree. Indeed, the success of SciFoo '06 led to a fair amount of anticipation this year, best described in words by Jonathan Eisen and in pictures by Pierre Lindenbaum. (See also Pierre's cartoons from the event itself.)

Such is the variety and (relative) anarchy of the event that there's no such thing as one SciFoo experience, only 200+ personal experiences. To give a feel of the occasion, read Henry Gee's opening post and have a look at Bora's photos, photos, and more photos. What follows is a very brief description of a few of my experiences.

We kicked off the event on Friday evening. After everyone had introduced themselves in (more or less) three words, we unleashed them on the schedule boards, which quickly filled up with a wide range of proposed sessions. Then we asked a few of the attendees to give talks in front of the whole group of around 200 people. One of them was Charles Simonyi, who described his recent trip into space with a short video and a great Q&A session. His partner, Martha Stewart, also explained about the food that she helped to arrange for the voyage.

On Saturday morning, after thinking about it for quite a while, I skipped the publishing-related topics. I had come to SciFoo for something new, so I went to a lot of physics sessions. The prospect of learning about quantum computing, the nature of time, and multiverses from the likes of Frank Wilczek, Martin Rees, Lee Smolin and Neal Stephenson was simply irresistible. But perhaps my favourite session of all was George Dyson's talk about the amazing human story of Kurt Gödel's journey from Europe to America in 1940, his mental health problems, and the attempt by the US government, once he got to the country, to draft him into the military. I always enjoy George's talks, but what made this one extra-special was the presence of his father, Freeman, who worked alongside Gödel and other great names at the Institute for Advanced Study in Princeton. He was full of anecdotes which served to remind us that science, for all its dry rationalism, is at heart an intensely messy, human activity. I also really enjoyed Lincoln Stein's talk about Jim Watson's genome, Pete Worden's session about avoiding collisions with near-earth asteroids, and Paul Sereno's display of amazing dinosaur bones.

Many people seemed to have a blast. I did too. What other event brings together people like Eric Lander, Lincoln Stein, Frank Wilczek, Andy Fire, Martin Rees, Freeman Dyson, Lee Smolin, Paul Sereno, PZ Myers, Bjorn Lomborg, Eric Drexler, Charles Simonyi, Danny Hillis, Larry Page, Jim Hendler, Esther Dyson, Jeff Hawkins, Sergey Brin, Vinod Khosla, Carl Djerassi, Greg Bear, Neal Stephenson, Kim Stanley Robinson, James Randi and Martha Stewart? And that's not to mention the many great young scientists who are going to be the superstars of their generation. To sum up, I'll turn to Henry Gee again: "Someday all conferences will be like this."

(There are more photos on Flickr (here and here and here), more blogs on Technorati, and yet more links of various kinds on Connotea.)

August 02, 2007

Planet Nature

Planet Nature is an aggregator of all the NPG blogs that post original content. It's powered by Venus, an actively developed branch of the Planet software.

Venus uses a feed parser to standardise all the feeds into Atom, then XSL templates to transform those feeds into XHTML, Atom, OPML, FOAF and whatever else is needed.

Thanks to Sam Ruby, Venus now supports CSV files as input for the list of subscriptions to aggregate. This means that we can use Google Spreadsheets to collaboratively maintain the list of subscriptions, giving editing permission to whoever needs to be able to edit the list.
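
To give a flavour of what consuming such a list involves, here's a small sketch that reads a CSV export of a spreadsheet into a list of subscriptions. It is not Venus's own code, and the column names ("name" and "feed") are my assumptions for the sake of the example rather than the format Venus actually expects; the feed URLs are placeholders too.

```python
import csv
import io


def read_subscriptions(csv_text):
    """Parse CSV text with a header row into a list of subscription dicts.

    Assumes (hypothetically) a 'name' column and a 'feed' column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    subscriptions = []
    for row in reader:
        feed = (row.get("feed") or "").strip()
        if not feed:
            continue  # skip rows someone left half-filled in the spreadsheet
        # fall back to the feed URL when no display name was entered
        name = (row.get("name") or "").strip() or feed
        subscriptions.append({"name": name, "feed": feed})
    return subscriptions


# Invented sample data standing in for a spreadsheet export.
sample_csv = (
    "name,feed\n"
    "Nascent,http://example.org/nascent/atom.xml\n"
    ",http://example.org/other/atom.xml\n"
    "Empty row,\n"
)
for sub in read_subscriptions(sample_csv):
    print(sub["name"], "->", sub["feed"])
```

Skipping incomplete rows matters more than it might seem when several people are editing the same spreadsheet.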

With the addition of an extra XSL template, Venus can also produce a simple HTML file listing just the subscriptions. Google can then use that file to produce a Custom Search Engine on the fly, with searches limited to sites linked from the original page. You can see this in the sidebar on Planet Nature.

Also in the sidebar is a Filter based on Yahoo! Pipes. This pipe takes all of the feeds aggregated by Planet Nature and searches them for the given keywords, creating an RSS or JSON feed that you can use to be notified when new entries are posted to NPG blogs matching your chosen keywords.
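
The filtering step itself is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not a description of how Yahoo! Pipes works internally; the shape of the entry dictionaries (title, summary, link) and the sample entries are assumptions of mine.

```python
import json


def filter_entries(entries, keywords):
    """Return entries whose title or summary mentions any keyword.

    Matching is a case-insensitive substring test; entries are assumed
    to be dicts with optional 'title' and 'summary' strings.
    """
    wanted = [k.strip().lower() for k in keywords if k.strip()]
    matches = []
    for entry in entries:
        text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
        if any(k in text for k in wanted):
            matches.append(entry)
    return matches


# Invented entries standing in for the aggregated feeds.
entries = [
    {"title": "Ancient DNA repair", "summary": "A talk in Second Life",
     "link": "http://example.org/1"},
    {"title": "ETech call for participation", "summary": "Proposals due soon",
     "link": "http://example.org/2"},
]

# Serialise the matches as JSON, much as a keyword feed might.
print(json.dumps(filter_entries(entries, ["dna"]), indent=2))
```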

"Nascent Web publishing efforts have their genesis in a burning need to say something, but their ultimate success comes from people wanting to listen, needing to hear each other’s voices, and answering in kind."
Rick Levine
The Cluetrain Manifesto
