To its huge credit, the British Library (just up the road from Nature’s London office) has begun to host a series of discussion-oriented meetings about science called, appropriately enough, TalkScience.
At last week’s meeting, provocatively entitled ‘Scientific Researchers and Web 2.0: Social Not Working?’, there was a great turnout, not only at the BL but also on Second Nature, our island in Second Life, which was hooked up to the real world with audio and video connections. (Thanks to my colleagues, Alf Eaton and Jo Scott, and to BL staff, for making this happen.)
I gave a brief opening talk, my notes for which are reproduced below. There’s further discussion, from both before and after the event, on Nature Network.
Scientific Researchers and Web 2.0: Social Not Working?
My name is Timo Hannay. I work at Nature Publishing Group, where I’m publishing director for Nature.com. I’m a neurophysiologist by training and a web geek by inclination, so I’m very interested in the ways in which scientists can use the web – and other advances in information technology – to accelerate the pace of discovery.
“Web 2.0 tools are beginning to change the shape of scientific debate.”
This is not my opinion, it’s an incontrovertible fact. I know this because I read it in the current issue of The Economist — you’ll find it on page 96.
But are they really? What has been their impact so far – and what potential do they hold for the future?
Before we start, I think it’s worth reminding ourselves what “Web 2.0” means.
Web 2.0 is a term coined by O’Reilly Media, a technology publisher and event organiser, and is in fact the name of one of their conferences. Most famously, the concept has been articulated by their founder, Tim O’Reilly — notably in his 2005 article, What Is Web 2.0?, which you can find online and is still well worth reading.
Although the term ‘Web 2.0’ is often synonymous with the kind of user participation seen on social networking sites and blogs, Tim’s original vision was much more multifaceted and sophisticated than that. He examined those companies and approaches that had survived the dot-com crash at the turn of the millennium, and attempted to draw some general lessons for what does and doesn’t work well online.
For example, he contrasted DoubleClick with Google AdSense (somewhat ironic now that Google owns DoubleClick, though I suppose in some ways that’s the ultimate demonstration of the triumph of Google’s ad model), taxonomies with folksonomies, content management systems with wikis, Akamai with BitTorrent, Britannica with Wikipedia, and publishing with participation.
Software development concepts like ‘perpetual beta’ and technologies like AJAX were part of the mix. But first among equals in this smorgasbord of ideas was what Tim called ‘harnessing collective intelligence’. So perhaps it is not so unreasonable that Web 2.0 should come to mean user participation.
The Wisdom of Crowds
‘The wisdom of crowds’ is the term most often used to convey the great collaborative potential of the web – and notable successes like Wikipedia. It was made famous as a phrase by James Surowiecki’s book of the same name, published in 2004, the same year as the first Web 2.0 conference.
The book starts with an anecdote about a Victorian scientist, polymath and eugenicist called Francis Galton (who happened to be a half-cousin of Charles Darwin). Visiting a country fair, he came across a competition to guess the weight of an ox – after it had been slaughtered and dressed. To his amazement, the mean estimate of all the entries was within 0.1% of the correct answer.
There’s a similar effect going on in markets of all kinds, including the stock market: the judgements of large numbers of people are aggregated and provide the best means we’ve discovered so far of setting a price for a particular commodity.
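The aggregation effect behind the ox anecdote can be sketched with a small simulation. This is an illustration only, not Galton’s data: the true weight, the number of guessers and the spread of their errors are all hypothetical, and the sketch assumes that every guess is independent and unbiased (scattered symmetrically around the truth). Under those assumptions the mean of the crowd lands far closer to the truth than a typical individual does; if the guessers share a bias, averaging does not remove it.

```python
import random
import statistics


def simulate_guesses(true_weight, n_guessers, spread, seed=0):
    """Return n_guessers independent, unbiased guesses (hypothetical data).

    Each guess is the true weight plus Gaussian noise with standard
    deviation `spread`. Independence and lack of bias are assumptions
    of this sketch, not properties of real crowds.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_weight, spread) for _ in range(n_guessers)]


def crowd_vs_individual(true_weight, guesses):
    """Compare the error of the crowd's mean with a typical individual's error."""
    crowd_error = abs(statistics.mean(guesses) - true_weight)
    individual_errors = [abs(g - true_weight) for g in guesses]
    typical_individual_error = statistics.median(individual_errors)
    return crowd_error, typical_individual_error


# Hypothetical numbers chosen purely for illustration.
TRUE_WEIGHT = 1200.0   # pounds
guesses = simulate_guesses(TRUE_WEIGHT, n_guessers=800, spread=75.0)
crowd_error, typical_error = crowd_vs_individual(TRUE_WEIGHT, guesses)

print(f"Error of the crowd's mean:  {crowd_error:.1f} lb")
print(f"Typical individual's error: {typical_error:.1f} lb")
```

Adding a constant offset to every guess (a shared bias) leaves the crowd’s error roughly equal to that offset however many people take part, which is one way a crowd can be confidently wrong.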
September 2008 might seem like an odd time to be pointing to the stock market as an example of the ‘wisdom of crowds’. And there’s the rub. An older – and in the scheme of things even more famous and influential – book than The Wisdom of Crowds is Charles Mackay’s 1841 classic, Extraordinary Popular Delusions and the Madness of Crowds. This describes lots of examples – like the infamous South Sea Bubble – where popular sentiment has not just been wrong but laughably, hideously, outrageously wrong. Crowds, like people, are sometimes wise and sometimes mad.
I also think it’s important to point out that the concept of crowd wisdom comes in at least two types. One, as we’ve seen, is the aggregation of lots of people’s opinions or actions. The other is where a task is offered to a large number of people but ultimately carried out by just a few. This is often known as ‘crowdsourcing’ and the central thesis is that those self-selected few who voluntarily carry out the task are likely to be well-suited in terms of their qualifications and motivations.
Though it’s a subject of debate, I think this is basically the way that Wikipedia works – most articles are authored by a relatively small number of people with a particular interest in that topic. And [its] detractors notwithstanding, Wikipedia does work. As Clay Shirky pointed out in his now-famous speech on the concept of ‘cognitive surplus’ at Web 2.0 Expo, if we could only harness all the time that people waste watching, or even just voting on, American Idol then we could repeat the success of Wikipedia many times over.
But that raises a question about the underlying assumption of this approach. Not to put too fine a point on it, are the people who vote on American Idol the people we want writing Wikipedia? To see – or should I say hear – the limits of the crowdsourcing model you only have to listen – if you can bear it – to the toe-curling quasi-torture that is the late-night radio phone-in show. People with an opinion and time on their hands aren’t always the people with valuable views.
As with most things in life, we need to strike a balance. In this case it’s a balance between (1) closing yourself down to potentially useful contributions and (2) opening yourself up to chaos. The monumental success of sites like Wikipedia shouldn’t blind us to the fact that this is hard. And in this sense, I think we should welcome ongoing experiments like Larry Sanger’s Citizendium and Google’s Knol that are exploring alternative approaches.
[Here I gave a brief description of Nature’s open peer-review experiment, which is an example of crowdsourcing in science publishing.]
The Long Tail
Another quintessentially Web 2.0 term is ‘the Long Tail’. Like ‘the wisdom of crowds’ and ‘Web 2.0’ itself, it dates from 2004 and was coined by Chris Anderson, editor of Wired magazine.
The core idea is that in an online world no longer constrained by physical storage space, we can make everything available. If any given item is only of interest to one or two people – or even nobody at all – then it doesn’t matter. On the contrary, it’s a waste of effort to separate the wheat from the chaff in advance.
I see a link here with the ‘open notebook science’ movement being spearheaded by people like chemist Jean-Claude Bradley, biologist Cameron Neylon and physicist Garrett Lisi. Not many people will be interested in the details of an experiment that was attempting to synthesize one obscure organic compound only to produce another obscure organic compound. But some might, and if the cost of putting it online is close to zero then why not share it?
Well, there are at least two big reasons why not:
- First, more doesn’t necessarily mean better. How is anyone ever going to make sense of, or even navigate, this blizzard of semi-structured information? The answer, of course, is that humans can’t do this unaided – they need software tools to help them. As Clay Shirky puts it: “It’s not information overload, it’s filter failure.” Google and other search engines help a lot, but they are only a beginning.
- Second, even if the direct financial cost of sharing this information is low, the cost in terms of scooped findings, rejected papers and grant applications, and perhaps even diminished reputation could be very high. It’s a complicated topic, and there are no quick or easy solutions, so I defer this discussion till later – and to the wisdom of this particular crowd. But it’s also an important topic. In my opinion, it’s the single biggest reason why many researchers are indifferent to – or even resistant to – the idea of open collaboration and sharing in general.
Blogs and wikis
This idea of open sharing of information and ideas – and resistance to it – also applies to blogs.
“Science blogging is growing” I confidently wrote in an essay a few months ago. Then, like any good scientist, I went in search of evidence to support my prejudice. But I couldn’t find any beyond the anecdotal. For a year or more, estimates of the number of blogs by scientists about science seem to have been stuck at about 1,500 (give or take). Services such as Alexa and Compete.com (if they can be believed) show traffic to sites like ScienceBlogs.com to have been flat for the last few months. If anyone has good evidence that scientific blogging is growing then I’d love to hear about it. But for now I’ve had to conclude that it isn’t.
Why not? A constant refrain I hear from scientists, including my editorial colleagues, is “Where will I ever find time to blog?”. My usual response is to ask where they ever found time to email. Far from taking up time that people don’t have, any new approach, in order to be successful, has to make our work more productive – and ideally more pleasurable too. Blogging (or whatever activity might fill a similar niche in future) can enable much more effective many-to-many communication than email will ever allow. And it ought to mean that, for example, face-to-face meetings become more productive because everyone arrives better informed.
But it’s not up to the doubters to ‘get it’; it is up to those of us who support these developments to demonstrate their value. And if we can’t, then they don’t deserve to be adopted and we don’t deserve to be heard.
Blogging is a great example of one of the very best things about the web: that our every utterance can be indexed for anyone to find, distributed to every corner of the world, and archived for eternity – all at virtually zero cost. The problem is, to many people these are also the very worst things about the web.
There is often a conflict between personal incentives and the collective good, and here once again we come face to face with the incentive problem. It’s sad, but most scientists don’t publish in order to share results with their peers; they do so in order to secure grant funding and promotions. We know this because when we provide ways of sharing information that do not affect their likelihood of getting funding or promotions – such as preprint servers for biologists – most don’t use them.
[Here I mentioned the recent story about slides at a physics conference that were photographed and the data in them reused. Among other things, I think this demonstrates that scientists show results at conferences not to share the data with their peers but to tell their peers that they have the data, which is very different.]
As Michael Nielsen has pointed out, a system designed to encourage scientists to share their results through scholarly journals is now serving to discourage them from pursuing the same goal by other means.
One thing that might help would be if more senior scientists were to take blogging seriously, and even to blog themselves.
The web means that for all practical purposes the minimum publishable unit has fallen to literally one bit of information (in the formal sense of that term). But the culture and organisation of science haven’t caught up. We need ways to track these contributions, apportioning credit accordingly – and, as far as possible, eliminating spammers and shills.
This is difficult to do, but there is progress. One of the most difficult areas is wikis because for collaborative prose it is intrinsically hard to determine who contributed which piece of text, never mind how valuable it was. But even here we can see some progress. WikiGenes is a recent effort to address this: it captures authorship information for every word, and includes the ability for authors to view and rate each other’s contributions.
I find chemistry [a] very interesting case study, though I’m not a chemist myself. It’s often said that modern science was born when alchemy became chemistry through the open exchange of ideas between its participants. But in many ways it’s a very conservative discipline, with strongly entrenched ways of doing things and special interests.
But a small number of chemists are leading not only their own discipline but all of science with initiatives in areas such as open lab notebooks (I’ve already mentioned Jean-Claude Bradley and Cameron Neylon), data and identifier standards (Peter Murray-Rust in Cambridge), community annotation (Tony Williams at ChemSpider), and even use of Second Life (Jean-Claude again).
But can these few ultimately influence the many?
Personally, I’m optimistic about the potential of the web to greatly improve the productivity – and joy – of doing science. I also think it can help to break down barriers between disciplines, and between science and the rest of society. That’s why I’ve devoted my recent professional life to the pursuit of turning this into a reality.
But I’m less optimistic about the inevitability of this potential being fully realised, at least in anything less than a generational timescale. For every scientist who sees it as self-evident that they should be using these tools, or promoting open information-sharing, there are dozens who just don’t see the point. For every publisher or librarian who ‘gets it’ there are many who don’t – at least not fully and not yet.
Changing behaviours and expectations is difficult at the best of times – it is too easy to overlook the hundreds of companies that fail for every one, like Facebook or Google, that changes the landscape. In a conservative establishment like science, it’s harder still. In some ways science – as a continual, collaborative, global endeavour – is the ultimate wiki. But this analogy misleads people into assuming that adoption of new tools and approaches by scientists is a foregone conclusion. It’s not.
So will ‘early adopter’ scientists lead the rest along with them? Or will they remain as outliers? Or will they be forced back into the mainstream?
All of the above. Progress is hard, almost by definition. Often the reason is that many things have to happen to enable it: technological innovation; legal, bureaucratic and fiscal change; and behavioural adaptation. There will be many false starts, wrong turns and dead ends. As in scientific research, most ‘experiments’ in new modes of scientific communication will fail. But fortunately there will always be people testing the limits. And gradually – sometimes very gradually – the scientific mainstream will evolve.
Sometimes we get a more acute sense of how rapidly we’re moving by looking backwards rather than forwards – like looking out of the back window of a car. And just look how far we have come in less than two decades since the emergence of the first website, laughably basic by today’s standards. Inevitably, inexorably that progress is continuing. To steal a famous closing sentence:
“[F]rom so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved.”
I just hope that all this happens at a rate much quicker than the kind of evolution that Charles Darwin was talking about.
Preview image by JL2003