
October 31, 2007

Tim O'Reilly visits Nature

IMG_4451_300dpi.jpg

Last Friday Tim O'Reilly dropped in on his way through London and gave a web seminar at Nature. We have been running these web seminars at Nature for about two years. They kicked off back then with talks from Jimmy Wales and David Weinberger, and it was great to have Tim come in at almost the two-year mark to hear his take on 2.0 and beyond.


Instead of giving a presentation Tim just opened the floor to questions, and I've cobbled together an account here.


Everyone agrees that the internet is a transformative technology, and as the cost of finding information drops, businesses that rely on information must change in response. It's also pretty clear that there are many competing business models out there. Tim is a great technologist and diviner of trends in the geek space, but the core of his business is publishing, principally book publishing, so the anticipation was that his insights might be particularly relevant to NPG.

I'll try to cover the guts of the talk below, but in deference to a promise that I recently made to my flat-mate I am going to avoid using the term 'paradigm shift' (exercise for the reader: identify the paradigm shifts described in this blog post; for bonus points, predict upcoming paradigm shifts and let me know). For clarity, when referring to things that Tim O'Reilly said or did personally I'll refer to him as Tim, and likewise to his company as O'Reilly.

Timo opened the questions by saying that the theme of the recent Web 2.0 conference was 'the edge of the network', yet Web 2.0, it seems, is becoming corporate and mainstream. He asked Tim if he would like to talk about what interesting things are on the edge.

Before answering the question directly Tim decided to give some background to Web 2.0. It's true that 2.0 is becoming more mainstream (Steve Ballmer was at this year's conference, for instance), but Tim pointed out that he has been talking about these ideas for about 10 years. Back then he gave a talk on hardware, software, and what he called at the time infoware. The PC represented a commoditisation of hardware, and when that happened the value in computing, which previously resided with the makers of mainframes, didn't disappear; it migrated from hardware to software. Famously IBM missed this shift and Microsoft capitalized. In the same way that the value moved from hardware to the operating system with the advent of the PC, the value moved to networked applications in the 1990s with the advent of the internet. The web represented the new layer on top of which new applications were built. This really hit home to Tim when someone told him that they had just gone out and bought a computer in order to use Amazon.


Excitement about the internet drove a bubble that burst in about 2000, but in about 2003 it was noticed that there was a resurgence of interest in the internet, with some common features amongst businesses that had survived the bubble. There seemed to be a renaissance of the web, and O'Reilly decided to host a conference highlighting this phenomenon. Dale Dougherty of O'Reilly coined the term Web 2.0 for the name of the conference. You can see a bit of history here.
It seemed that one of the key features of these sites was that they gained their value from network effects. The more people that used a site, the more valuable the site became. They were using the network as a platform. The canonical examples of this are eBay and Google. With eBay the users are the value. What Google realised is that the behavior of users on the internet could tell you about the value of sites. PageRank takes the behavior of people using the internet to tell you things about the structure of the internet. Many people linking to a site indicates that the site might be more interesting than a site without so many links. Google's ad auction is another example of using collective intelligence. Google doesn't just sell an ad slot to the highest bidder; they maximize revenue based on click rates and costs. If a lower value ad is going to get more clicks, the total return is going to be higher. This requires real time analysis of what users are going to do.
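
The ad auction point is easy to see with a bit of arithmetic. Here is a minimal Python sketch of the idea, and only of the idea: it is not Google's actual algorithm, and the ads, bids and click rates are numbers I have invented for illustration. It ranks ads by expected revenue per showing (bid per click multiplied by the estimated click rate) rather than by bid alone.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    bid_per_click: float  # what the advertiser pays for one click (invented)
    click_rate: float     # estimated fraction of showings that get clicked (invented)


def expected_revenue(ad: Ad) -> float:
    """Expected revenue from showing the ad once: bid times chance of a click."""
    return ad.bid_per_click * ad.click_rate


def pick_ad(ads: list[Ad]) -> Ad:
    """Choose the ad with the highest expected revenue, not the highest bid."""
    return max(ads, key=expected_revenue)


ads = [
    Ad("high bid, rarely clicked", bid_per_click=2.00, click_rate=0.01),
    Ad("low bid, often clicked", bid_per_click=0.50, click_rate=0.08),
]
winner = pick_ad(ads)
print(winner.name)
```

With these made-up numbers the cheaper ad wins the slot, because 0.50 x 0.08 is larger than 2.00 x 0.01. The hard part in practice is the click rate estimate, which is where the real time analysis of user behavior comes in.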


At the back end of these sites are massive data centres with large amounts of information about user behavior. This is not new. Credit card companies and telephone operators have been doing this since long before the internet was mainstream, but they have failed to convert these assets into user facing services. At the moment there is no smart phone address book. Phone bills are almost useless lists of calls with no smart filtering of behavior. Credit card companies and phone companies have the data but don't seem to have the cluefulness to extract extra value from it.


If you are running a business a question to ask is what assets you have that are generated by users, and how do you turn those assets into real time services for users? This is the heart of user collaborative intelligence.


If we look at Nature what is the data inventory that we have? We should know as much about who the good scientists are as anyone, and this is very valuable.


Addressing the question of what is on the edge, i.e. what's the next value creating platform that will result in changes to the current web eco-system, Tim pointed strongly at applications that will integrate with sensors in the real world and not just be created by people typing on keyboards. Microsoft's Photosynth does this by harnessing the huge number of cameras that are out in the wild. Norwich Union has a GPS-enabled pay-as-you-drive insurance. Mobile phones could be a great platform on which to build, but at the moment mobile phone companies, like credit card companies, don't seem to have a clue.


Wesabe and Mint are two start-ups in the States that are mining the spending behavior of people who use their services, and are pointing the way towards connecting real world behavior with collective intelligence.


Immi with Nielsen are introducing a new way of generating TV ratings. They have a special mobile phone that takes a sound sample every 30 seconds or so of the ambient noise in its vicinity and matches what it hears with cues that it knows about from TV adverts.


Genetic data becoming available is another example of where uses for collective intelligence will feature. (This made the cover of Nature last week.)

Timo asked about publishing. The open access movement is seen as part of Web 2.0. At point of use most web apps are free, but how does this fit with publishers' business models? How does O'Reilly walk the tightrope between being a publishing house and its vision of trying to change the world?

Tim said that the market will figure out a way to pay for the things that are truly valuable. You start with the faith that these things will get paid for, you try things, watch what other people are doing and try to catch the wave.


The O'Reilly core book business remains challenged. It dropped to half between 2000 and 2003. Tim takes the view that whether we (O'Reilly and Nature) survive doesn't matter, as long as what we are doing survives. When the crunch came, O'Reilly retrenched to their core business. They asked what it was that they were really trying to do; their mission statement is "changing the world by spreading the knowledge of innovators". They moved into conferences, which have been very successful for them, and new business models such as Safari Books Online, which is a core subscription service.


Tim says that you have to look and ask what is the core function that you perform. For Nature it's curation, a seal of approval, a conferring of status.


O'Reilly is seriously looking at how they can do that for their authors. Asking questions like what are the benefits of being an O'Reilly author? What can we do for our authors, so that it is better for them to publish there?

Q James: how do you add value for readers?

It's still an untold story. O'Reilly tends to listen to readers; a lot of what O'Reilly do is watching the earliest adopters, and looking for things that are about to be adopted by a lot of people. In some sense the service is to bring things to people that they would not otherwise have found. The service is a kind of storytelling. Take Make magazine, for example. One of the patterns they saw was a new engagement with hardware, so they tried some books on hardware hacking, and launched the magazine. They also launched the Maker Faire, a community gathering for people who like to make things. It's a very broad tent. The Maker Faire was like a county fair: a normal fair has pigs, this one has robots. There are many people in their back yards making things, like the golden age of mechanics but now with robots and sensors. They are engaged with a new craft. In one pavilion there is a Swap-O-Rama-Rama, where they have a bunch of people with sewing machines and silkscreening. In the next tent is a bunch of PC geeks cobbling together a supercomputer. These are all people who don't want to buy things, but to build things. Computing is infusing the physical world. Part of being a curator is storytelling by bringing things together. A great book to read is Cory Doctorow's Down and Out in the Magic Kingdom. The core idea is how do you create a vision that other people will want to follow?

Timo asks Tim to talk about Foo Camp.

Tim gives the background to Foo Camp. The original name of the get together was going to be 'foo bar', which is a geek joke. (I guess I got my geek props for getting that joke; most of the people attending the talk were left a little blank. Then Tim mentioned another geek joke that went totally over my head, so I guess I've got a bit further to go.)


Tim says that in 1998 they organized the open source summit, which was about people meeting people. He realised that at that time many people in the Open Source community were working with very much the same goals, but that they had never met each other. Convening is part of telling a story, and the spirit behind Foo Camp was just to get a whole bunch of really great and interesting people together and see what happened. The initial contact between Tim and Timo happened when O'Reilly was doing a bioinformatics conference, and later Sci-Foo resulted from this contact. The idea was to recreate the Foo Camp experience with scientists.

Chris asked about sensor driven software and the ethos of the web. When 'the man' gets involved, there is always tension. On the web where companies get involved in community building what are the responsibilities of these, specifically with the data that they are collecting?

Tim said that if you look at the history of this, it goes in cycles. Every industry goes through this. There is a creative anarchy, then some groups start to dominate. They move to capturing more value than they create. The computer industry was a very exciting place, and then it became boring, because it became consolidated.


Bill Gates was a visionary, one of the greatest visionaries of the century. His vision was the idea of a personal computer, a computer in every home. Then Microsoft started eating their children. They were saved because a bunch of people in the wild were doing the internet. Then this got stale in its first incarnation, and then there was a new crop of people with a fresh approach.


There is going to be a lot of consolidation. The man will take over (he might be idealistic, like Google), and it is going to get a lot more boring. The interesting question is what will happen when Google's growth slows down?


But you have to have a belief in people's ability to find new things. There are going to be a lot of new areas coming out of science, and one thing we can do is to help to birth the future.

Timo asks: should we worry about Google and the privacy of information?

Tim does worry, but sees that people are adapting. An example is the Facebook feed: there was initial horror, but then people adapted. There will be a trade off; you can't rely on security by obscurity any more, as demonstrated by Spock person search. He sees this in the UK, where there are more surveillance cameras than almost anywhere else.


He watched one blog storm about a council dealing with people putting up cancelled stamps on posters. People on the blog were saying that they should use the surveillance cameras to find out who was doing this.


In another example a woman found her house had been covered in toilet paper. She went around to the local stores and demanded that they show her their cctv footage until she found one that showed a bunch of kids buying loads of toilet paper ...


The folks at Google gather a lot of data, but so do credit card companies, phone companies and governments. There will need to be some amount of regulation, and adaptation, but more adaptation than regulation. Society adapts.

Timo: will publishers need to adapt to Google indexing print content as well as web content?

Tim supports what Google is doing with book search, but right now it looks like Google has not figured out an economic model that works for publishers. Book search should work like web search, for example the Open Content Alliance.


It's a good thing for books to be searchable. As in all markets, changes happen, and we can and will adapt. Tim doesn't think of himself as a publisher, but more as someone who helps make interesting futures happen.

Data storage vs bandwidth, science is very data intensive, is the new fundamental limit now bandwidth rather than data storage?

Tim says that he is not a network operator: 'I do think peer to peer architectures are under utilized, there is a lot of unused bandwidth out there, we may hit storage limits. A few years ago I was at IBM, and they said that massive storage is coming.' Google already do this with a data centre in a box; there is an analogy between TCP/IP and container shipping.


He is sure that the problem will get solved, but it will probably get solved badly. If bandwidth becomes more scarce it will become more valuable, and more resources will go into fixing the problem. He thinks markets do eventually work.

From Dominic: is there any danger, for example, that a book with good information will not sell because people turn to free information that is not as good, leading to a drive towards mediocrity?

Tim says there is a lot of good information out there, and it's not necessarily mediocre (except searching for hotels on Google, which might be the first indication that they are turning evil; joke, I think?). Tim doesn't buy that published stuff is better than online stuff. There is great stuff online and great stuff in print. He does see that things that were powerful and fashionable do change. For example, take the top 100 blogs: how many of them are publishing companies? This is a new way of publishing.

Timo: you have done some innovation around online publishing, e.g. Safari, and Rough Cuts is another example. Can you talk about some of your experiments with online publishing?

Tim says that O'Reilly have been working on this since the mid-80s, when they published a version of Unix in a Nutshell for HyperCard, and then DocBook. They didn't want to maintain a lot of documents and wanted a free reader; this led them to the web, and they have been thinking about online books for many years. The open web is a better model than a restricted access model. Most of the online models they didn't like, because they didn't bring any of the benefits of a print book, or of the web.


Search can begin to make things interesting. This led to the decision to build Safari as a channel, and it is now the 3rd largest channel, but now they are looking at what the impact of Google book search is going to be. The challenge is to think: what is it about online that is really better?


Publishing on the internet is fast, cheap and out of control.


Blogs are fast, cheap and out of control. O'Reilly use their blog to drive traffic to some high price, short print run books, e.g. $500 for the Facebook report. You have to ask what are the synergies in moving people from one space to another. You need to experiment with a lot of models and the synergies between the models. This has allowed them to experiment with different types of books, such as short print run timely objects.


... and that was it, we ran out of time. (p.s. captions for the picture are welcome)


October 29, 2007

Second Nature Events: New Taxonomy and the Origin of Species

Leopard.JPG

Apologies for the short notice, but today we welcome Dr Shai Meiri from the NERC Centre for Population Biology to Second Nature.

Shai has a broad range of interests, and today will be talking to us about his recent publications on vertebrate taxonomy.

How many species are there in the world? How do we know if we’ve found a new one, or if it’s just a sun-kissed version of the same old leopard? And why is conservation so dependent on accurate classification?

Join us with Shai for a discussion on all these questions, an introduction to the species of the world and a broader look at conservation policy in general. No specialist knowledge required – all welcome!

Voice will be used – come along ten minutes early if you need help setting it up.

---------------------------------------------------------------------------------
Title: Are we artificially inflating number of species, and why does this matter?

Speaker: Dr Shai Meiri, Imperial College

Date: 29th October – TODAY!

Time: Midday SLT/PDT, 3pm EDT (East Coast time) 7pm GMT (London time) *

Location: Second Nature Island

Contact: Joanna Wombat

* The clocks have gone back in London, and this has left me slightly confused about time, so I hope this is right – it is definitely 7pm GMT, everything else was worked from there using the World Clock

October 26, 2007

Howtoons: Tools of Mass Construction!

booklaunch.jpg

Scifooers Nick Dragotta and Saul Griffith published their awesomely brilliant Howtoons book yesterday (US only I think). It's a comic style book aimed at kids with instructions on how to do kitchen science and make brilliant things such as a zoetrope or soda bottle submarine. Theirs was my favourite talk at Scifoo, and not just because I got to seriously geek out with an artist who actually draws ol' webhead himself. This is a must for anyone who:
1) Has children between the ages of 7 and 15
2) Is a child between the ages of 7 and 15
3) Still feels like a child between the ages of 7 and 15, despite physical evidence to the contrary
4) Reads comics
5) Likes making stuff
6) Cares about science.

That's everyone, isn't it?

October 24, 2007

His Daughter's DNA

Nature cover

I'm nearly a week late with this news, but the 18 October issue of Nature has a great news feature about the amazing story of Hugh Rienhoff, his daughter and her DNA.

To my knowledge, this is the first SciFoo talk to reach the cover of Nature. I mentioned it briefly in a previous post, but it's well worth reading the professional journalistic coverage. It's in the Nature Podcast too (about half-way through, but don't miss lots of other genomics-related material in the same show). Also check out Hugh's website, MyDaughtersDNA.org, where he's trying to help others in similar situations.

I find this story simultaneously fascinating, heartrending and inspiring.

October 18, 2007

Second Nature Events: Creating an Artificial Ecosystem in Second Life

Ecosystem - CanonPlant.JPG

Moving swiftly on from last week's trials, next week in the Second Nature events series, our attention turns to Second Life itself as we welcome Luciftias Neurocam, the head of the Ecosystem Working Group.

Luciftias (Dr Corey Hart, a neurobiologist from Drexel University in his spare time) is the founder of the EWG, who created the content of the popular sim Terminus. On Terminus, a group of scientists and programmers created a series of “living” creatures which survive, reproduce and interact according to defined rules. The result is a fully functioning ecosystem, which has recently moved to a new home on Second Nature.

Luciftias will be telling us about the ecosystem, what it does, results so far, and what their future ambitions are. He will take questions and then lead a tour of the ecosystem.

All welcome for what promises to be a fascinating look at one of the most interesting scientific uses of Second Life to date. Please note: the time is 2 hours earlier than our regular time.

Title: Creating an Artificial Ecosystem in Second Life
Date: Monday 22nd October
Time: 9am SLT/PDT, Midday EST, 4pm GMT, 5pm BST
Location: Second Nature Island
Contact: Joanna Wombat

Second Nature Patents talk: Rescheduled

For anyone who wasn't aware: unfortunately this week's Second Nature event, "The Importance of Patents to Scientists" was cancelled at the last minute when Second Life underwent a slight meltdown and none of us, including the speaker, Sue Scott, were able to reliably connect to Second Life. I would like to apologise to everyone who came and waited with us so patiently, and most of all Sue herself.

On a more pleasing note, Sue has kindly agreed to reschedule and a new date has been set.

The event will now be held on Monday 5th November at 11am SLT/PDT, 2pm EST, 7pm GMT. A formal reminder will be sent out nearer the time. For more information and the gory details of last week's technical nightmares see the report on Nature Network.

October 12, 2007

Second Nature Event: The Importance of Patents to Scientists

The next event in the Second Nature Lecture series is next Tuesday, 16th October. Note: new day – Tuesday, not Thursday.

Our guest is Sue Scott, a patent attorney working as a consultant to Abel & Imray in London. Before that, she acted as an advisor on patent matters to the UK government on a number of occasions, was Head of Patents at BTG and originally began life with a chemistry degree from Oxford.

Sue will talk to us all about patents in science, why patents exist and are controversial, explain the basic things all scientists need to know about patents, and attempt to dispel some of the most common misconceptions about patents.

Judging by the interest in patent matters at last night’s event about affordable drugs, I’m certain this is going to be a fascinating event, and Sue is prepared for rigorous questioning!

All welcome – please do come along! Voice will be used, so if you need any help setting up, come along a few minutes early.

---------------------------------------------------------
Title: The Importance of Patents to Scientists
Date: Tuesday 16th October
Time: 11am SLT/PDT, 2pm EST, 6pm GMT, 7pm BST
Location: Second Nature Island
Contact: Joanna Wombat

October 11, 2007

Halo 3, Will Wright and science in video games

Evolve.jpg

This week I have mostly been playing Halo 3 (not at work, though Google have consoles in their offices so maybe we could too; haven't suggested this to Timo yet). Anyway, it's pretty good and has ex-Nature staffers in it.

The main problem with video games has always been (IMHO) the deep rooted shame you feel coming out of four hour frag sessions. Four hours gone forever.... hours that could have been spent curing cancer, bonding with your children (yes, really, children - the average gamer is 33 years old) or you know, just doing something, like, constructive.

Nowadays, though, you can cure cancer with your PS3; when you go into hospital for an operation you should pick the surgeon with the best high score on Sonic; and psychiatrists use first person shooters to treat traumatized war veterans. Games are a force for good in science and medicine on many different levels. Edutainment is cool.

Will Wright thinks so too. Wright is the guy behind the Sims series - SimCity, The Sims etc.

Back in the early nineties SimCity - which you can now play online for free - let you plan and build virtual cities from scratch and then watch them tick along. When you got bored you could unleash terrible disasters on them (nuclear reactor meltdowns were best).

SimEarth went further. You played a godlike entity with power over an entire planet. You could tweak atmospheric composition, raise mountains, seed ecosystems and watch as your subjects formed nascent virtual civilizations in low resolution, 16 colour graphics. Plate tectonics, evolution, ecology, astronomy, the Gaia hypothesis... it was all there and it was fun (sort of). Also when you got bored you could unleash terrible disasters (meteor storms were best).

Wright's latest pair of games also have strong science elements. The next version of SimCity has an environmental impact component 'developed in conjunction with BP':

The low-carbon electricity choices and monitoring of SimCity's carbon emissions provide an entertaining, fully-integrated and accurate look at some of the causes and some of the major solutions available to combat rising levels of carbon and to help address the threat of global warming.
(via Alice Taylor)

This is a great idea - get a feel for the problem (tackle climate change in a realistic, sustainable way) by trying out different solutions for yourself.

The long awaited Spore is effectively SimEarth with the boring bits taken out:

[Spore is...] an epic journey that takes you from the origin and evolution of life through the development of civilization and technology and eventually all the way into the deepest reaches of outer space.

It looks awesome. You start off with life at a microscopic level (the 'tide pool phase') that evolves - with a little bit of help from you, we'll ignore the intelligent design connotations for now - into multicellular organisms and finally sentient creatures capable of forming a tribal society and taking over the world. Evolution, anthropology... they're the gameplay fundamentals.

Let's hope that when you get bored you can unleash terrible disasters (bubonic plague would be best).

Lunch with Egon Willighagen

We had lunch yesterday with Egon Willighagen who in his spare time runs the Chemical Blog space, now situated at http://cb.openmolecules.net/ (running on postgenomic code).

The chat over lunch was pretty good; it turns out that Egon's favorite molecule might be ascorbic acid. One of the topics that really animated Egon was how to link molecules to academic papers. By this I mean, for example, if you do a search in Google, or in some dedicated search engine, for a molecule, how does your search engine know which papers deal with this molecule? There are a couple of problems with solving this. One is that many different fields use different terminology for molecules, especially as the molecules become large, so a plain text search for the name will not get all of the papers that you might be interested in. Another is that papers don't have semantic markup of molecules.

One solution to marking up molecules is to use an InChI (an IUPAC International Chemical Identifier). These have been championed by Peter Murray-Rust and there is an extensive InChI FAQ available. The short story is that an InChI is a character string which uniquely describes a chemical substance. From any chemical structure you can generate an InChI.

Peter has a writeup on using InChI in blogs, and if every chemical that appeared anywhere was somehow marked up with its InChI, or the article referring to it was tagged with them, then the findability problem would be solved by simple string searching.
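
To make the "simple string searching" point concrete, here is a small Python sketch. The paper identifiers and their tags are entirely made up, and the two identifiers are written in the standard InChI form for ethanol and water; nothing here reflects how any real search engine or journal stores this information.

```python
from collections import defaultdict

# InChI strings for two simple molecules.
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
WATER = "InChI=1S/H2O/h1H2"

# Hypothetical papers, each tagged with the InChIs of the molecules it discusses.
papers = {
    "paper-a": {ETHANOL, WATER},
    "paper-b": {WATER},
    "paper-c": {ETHANOL},
}


def build_index(tagged_papers):
    """Map each InChI string to the set of papers tagged with it."""
    index = defaultdict(set)
    for paper_id, inchis in tagged_papers.items():
        for inchi in inchis:
            index[inchi].add(paper_id)
    return index


index = build_index(papers)
print(sorted(index[ETHANOL]))
```

The lookup is an exact string match, so it does not matter whether one paper says "ethanol" and another says "ethyl alcohol" in its text: as long as both carry the same identifier, both are found.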

OK great, well what's the problem? For a start there is an alternative system, SMILES (which is a Simplified Molecular Input Line Entry System), a markdown for molecules if you like. There is a very good description of the syntax here and the KinasePro blog has a short comment on how many people use SMILES vs InChI. The bottom line is that more people use SMILES, but it seems easier to search Google with InChI. I'm not a chemist, but from my naive standpoint the SMILES syntax seems closer to the text description of chemistry that we know from school, whereas the InChI system is more rigorous and requires one further step of abstraction. It reminds me of the difference between LaTeX for math and MathML. MathML is a hell of a lot easier to write a parser for than LaTeX, as LaTeX can be quite expressive; however, no one writes raw MathML. Scientists are lazy, and that extra step of abstraction might be the reason why SMILES seems to be used more frequently at the moment.

Egon suggested as a solution that journals should require papers dealing with chemicals to include InChIs. He said that every tool for drawing chemicals (standard issue for anyone writing a paper on the subject) can now output the InChI with the click of a button. Sounds reasonable, seems easy, but there are problems with this approach. A few times I have heard people say: you are Nature, you can make authors do anything in order to get a paper published, so why not get them to do x? Well, for a start, that's an editorial decision, but even so, making more demands on scientists may not be the best decision when the process of publication is already pretty fraught and stressful. Even if we did this, what would that gain? A small selection of the literature would be marked up, but the vast majority of journals in the area would need to follow suit in order to gain full coverage. Of course an argument that we should not do x because other people are not doing x is not what I am getting at here, but rather that this cannot be seen to be a final solution to the problem. Journals are naturally shy of any step that can delay the publication time of an article, and so I am also skeptical that we would see such obligatory requirements. Better, I think, to have this step as a voluntary one. Practically all journals allow supplementary information and I am sure all of them would accept InChI as supplementary information.

Even then one is still left with the vast existing corpus of papers that are already published. Egon points out that no one uses the literature in this area from 50 years ago, as modern techniques have advanced so far that this literature is functionally of little use. The implication here is that 50 years in the future we will only need to go back as far as today's papers. Even so there has to be a value in seeing the evolution of an idea from its insertion into the literature right through to where it has led today, and Egon agreed with this.

So what can we do now to help make connections between papers and molecules? Peter Corbett, who works with Peter Murray-Rust, is working on automated methods of getting computers to read chemistry papers and output semantic markup of them. Tools like this can begin to fill in the semantic blanks, both for papers from the past and for the current literature. Egon has now created RDF pages for molecules on openmolecules.net. These pages use the InChI in their structure, and now each molecule has its own web page. Egon's pages check Connotea, and pull from Connotea co-tags of InChI tags (here is a short description of this). If we work on this a bit more we should be able to set up a system where if you tag a paper with an InChI, that paper could appear on Egon's pages. We got quite excited about this idea yesterday and are certainly going to discuss this further. It's a small start, but a start nonetheless.

October 10, 2007

NeuroPod: the brain podcast

This week, we've launched our brain research podcast series called NeuroPod. It's presented by our very own Kerri Smith, my co-host on the Nature Podcast, and is produced in association with the Dana Foundation.

It's monthly, and the first show can be found here (but do sign up to the feed):
http://www.nature.com/neuro/podcast/rss/neuro.xml

We've got items on cognitive enhancements for making supersoldiers, what fMRI actually tells you, and the hippocampus, stress and learning.

While I'm here, this week's Nature Podcast is a real humdinger, even if I do say so myself. We've got jets on one of Saturn's moons, a first hand report on being an IAEA inspector, a plea for journalists to avoid geological metaphors, gene duplication in yeast evolution and how words evolve or become fixed. Where else can you hear Chaucer on a science podcast? Plus chat on the very important Nobels and the not-quite-as-serious Igs too.
Here's the feed:
http://www.nature.com/nature/podcast/rss/nature.xml


A History of Nature

In a wonderful marriage of antiquity and technology, Nature has just released a special website about its history. Browse the timeline and watch the videos, then nominate and vote for your favourite Nature papers. (Mine are here and here, but I urge you to vote for this one. ;-)

Thomas Vander Wal Visits Nature

Last Friday we welcomed Thomas "Folksonomy" Vander Wal to Nature. We spoke about various areas of joint interest, including of course tagging and social software. Thomas was also kind enough to give a talk to assembled staff. Here are my impressionistic and rather inadequate notes from that session.

What is a tag? A simple piece of metadata externally applied to an object. Used for sorting and as a hook for aggregating. Acts as a personal marker.

History of tagging: Lotus Magellan indexed content and allowed annotations to be added. Also Bitzi, but this wasn't collaborative. Del.icio.us got this bit right.

Folksonomy: the result of personal free tagging of pages and objects for one's own retrieval. Usually done in a social context, but done by the person consuming the information not a specialist. The value is that it derives from people's own understanding. Not really categorizing but rather connecting items and providing hooks so that they can be aggregated. Rashmi Sinha: Tagging taps into an existing cognitive process without adding much cognitive cost.

Folksonomy triad: Object, metadata (tag) and identity (user). Sometimes combined with community where different users settle on similar tags.

Folksonomies allow new objects to be discovered -- things I haven't tagged that have been tagged by other people who use the same terms as me.
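
To make the discovery point concrete, here is a minimal sketch in Python built on the object/tag/user triad from the notes. The function name discover and the sample triples are hypothetical and not something from the talk.

```python
def discover(triples, me):
    """Return objects I have not tagged that others tagged with terms I use.

    Each triple is (user, object, tag), following the folksonomy triad.
    """
    my_tags = {tag for user, obj, tag in triples if user == me}
    my_objects = {obj for user, obj, tag in triples if user == me}
    return sorted(
        {
            obj
            for user, obj, tag in triples
            if user != me and tag in my_tags and obj not in my_objects
        }
    )


# Hypothetical tagging data.
triples = [
    ("ann", "paper-a", "tagging"),
    ("bob", "paper-a", "tagging"),
    ("bob", "paper-b", "tagging"),
    ("bob", "paper-c", "genomics"),
]

suggestions = discover(triples, "ann")
```

Here ann is offered paper-b, because bob filed it under a term she already uses. paper-c is left out because she has never used the tag "genomics".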

Folksonomy versus taxonomy: Organisations create taxonomies to provide structure and make information easier to find. Users create folksonomies. Taxonomies and folksonomies overlap, but folksonomies go further -- in one [presumably representative] case 70% of tags were not in the taxonomy. Folksonomies allow us to validate taxonomies, identify gaps and provide new terms to put into the taxonomy. Taxonomies provide formal structures and efficiency, and also the foundations to apply new understanding.

Taxonomy is structured, efficient, and provides a solid foundation. But it's resource-intensive, non-emergent, and difficult to validate. Folksonomy is messy, provides poor 'findability' (but good 're-findability'), and is slow to emerge. But it's relatively inexpensive, emergent, and enables continual validation.

Who tags? Each day, 7% of Americans on the web use tags. Following the same trend as RSS -- not yet mainstream but growing.

Social vectors: Different groups refer to the same thing by different terms.

Business value of tagging: Tension between control, in-house, known (usually high) value, and consistency versus decentralisation, outsourcing, unknown value and emergence. Compared to approaches like abstracting, annotation, ratings, etc., tagging delivers relatively large value compared to the work involved.

Cold-start problem: Social software starts out as just software with no 'social' because there are no other users. Del.icio.us got around this by providing an incentive to individual users while at the same time enabling social behaviour.

Phases of interaction: Tagging, re-finding, exploring, searching, interacting.

Spheres of sociality: Personal, selective, collective.

The future: Portability and interoperability between services and devices.

[There followed a Q&A session during which I wasn't able to take notes.]

October 09, 2007

Second Nature Event: Prospecting for "Ethical Pharmaceuticals"

Firstly, apologies for taking so long to report on last week's event: Professor Philip Mellor gave a very interesting talk all about Bluetongue disease, how the spread has changed in recent years and how it corresponds to the changes in climate.

You can see the full write-up, including his slides, over at Nature Network.

Meanwhile, this week, we move to a different field altogether. Professor Sunil Shaunak is Professor of Infectious Diseases at Imperial College London and a co-founder of PolyTherics. He is dedicated to showing how academics can produce medicines for a fraction of the existing cost and proving that this approach represents an exciting new opportunity for making patented cost-affordable medicines available to patients who currently receive no treatment.

Join us for a discussion with Professor Shaunak who will explain how his method works, what success he has had so far and his hopes for the future. The event is free of charge, everyone is welcome and no specialist knowledge is required. To attend the event, you will need a Second Life account. To get one, go to Second Life. Second Life can sometimes be tricky for beginners: if you run into problems, feel free to contact me.

For experienced users: this event will use voice. For help activating voice, come along fifteen minutes early and we will help you then.

Title: Prospecting for "Ethical Pharmaceuticals"
Date: Thursday 11th October
Time: 11am SLT/PDT, 2pm EST, 6pm GMT, 7pm BST
Location: Second Nature Island
Contact: Joanna Wombat

More information:

Nature article about Professor Shaunak's initiative
CNN Feature
Professor Shaunak's homepage

October 08, 2007

Aggregating scientific activity

Here's my not-particularly-insightful prediction for the web next year: activity aggregation is going to be hot. It started with the Facebook news feed, piggybacked on the Twitter-driven lifestreaming bandwagon and will finally reach maturity on sites like Noserub, Friendfeed and Google's new social networking platform.

For those unfamiliar with the concept here's the idea:

1. You sign up to an aggregator service

2. You tell it what your other usernames are (on Digg, Delicious, Wikipedia, Last.fm, wherever)

3. A page that aggregates all of your activity is created. When you post something to Flickr or edit a Wikipedia entry etc. etc. it gets listed on this page automagically.

4. Your friends and contacts from the sites you listed in (2) are imported into the system and matched to other users of the aggregator.

5. You get access to their aggregated activity pages.

The point of all this is that you can track what your friends and colleagues are doing across multiple sites. Neat idea if you've got lots of web savvy contacts.
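
The aggregation in step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feed structure, the field names "when" and "what", and the aggregate function are all hypothetical, and no real site API is involved.

```python
from datetime import datetime


def aggregate(feeds):
    """Merge per-site activity lists into a single newest-first list."""
    merged = [
        {"site": site, **item}
        for site, items in feeds.items()
        for item in items
    ]
    return sorted(merged, key=lambda item: item["when"], reverse=True)


# Hypothetical activity pulled from two of a user's accounts.
feeds = {
    "Flickr": [
        {"when": datetime(2007, 10, 7, 9, 30), "what": "posted a photo"},
    ],
    "Wikipedia": [
        {"when": datetime(2007, 10, 8, 14, 0), "what": "edited an entry"},
        {"when": datetime(2007, 10, 6, 18, 15), "what": "edited an entry"},
    ],
}

activity_page = aggregate(feeds)
```

The resulting activity_page interleaves items from every site by time, which is what lets a contact follow one stream instead of several.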

'Web savvy contacts' aren't usually something that researchers have in great abundance but I think that this idea has tremendous potential for scientists too. One of the problems with user generated content in science is that there's no established reward system for contributing to something like OpenWetWare, editing a UniProt entry, depositing a gene sequence or writing a good blog post.

The concept of microattribution - credit for contributions that aren't publications - is something that we've thought about before at NPG; Nature Genetics carried an editorial on the subject back in August (see blog coverage here and here).

The way funding agencies and tenure panels assess scientists isn't going to change any time soon, realistically, but activity aggregation can help in a small, immediate way by increasing the exposure that these potentially valuable but often overlooked contributions get.

It'd do this by collecting all of those contributions in one, easy to find place (the activity page) and describing how to credit the author for them.

Imagine that every scientist had a profile page which contained a filterable record of their scientific activity - submitted this gene sequence, renamed this Gene Ontology term, edited this protocol. Might this be a step towards making these contributions more valuable to somebody's career? It'd be a bit like having a detailed interactive resume.

We would also index contribution metadata, making the whole thing searchable so that it'd be easy to find new collaborators - you could get a list of everybody who has previously worked on a gene that you're interested in, for example, ordered by the number of relevant contributions that they've made.
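
As a sketch of what that search might look like, here is a small Python example. The contribution records, the field names and the rank_contributors function are hypothetical, invented purely for illustration.

```python
from collections import Counter


def rank_contributors(contributions, gene):
    """List (author, count) pairs for one gene, most contributions first."""
    counts = Counter(
        record["author"] for record in contributions if record["gene"] == gene
    )
    return counts.most_common()


# Hypothetical indexed contribution metadata.
contributions = [
    {"author": "alice", "gene": "BRCA2", "kind": "deposited a sequence"},
    {"author": "bob", "gene": "BRCA2", "kind": "edited a protocol"},
    {"author": "alice", "gene": "BRCA2", "kind": "renamed an ontology term"},
    {"author": "carol", "gene": "TP53", "kind": "edited an entry"},
]

ranking = rank_contributors(contributions, "BRCA2")
```

Counting is the simplest possible ordering; a real system would presumably want to weight different kinds of contribution differently.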

I linked to a Nature Network profile page before because hey, they'd be a good place to put this kind of thing. Network profile pages are pretty high up in Google rankings, usually (they have good Gooju?). Of course any system that could materially affect people's careers should be open and distributed - what if Nature went evil and started charging recruiters for access to your contributions? Or started charging you to remove the record of that embarrassing, drunken Wikipedia edit? It'd have to be possible to move the record of your contributions from one place to another.

Any thoughts?

October 05, 2007

DataNet, a call for proposals.

Peter Brantley wrote to us to let us know that the National Science Foundation Office of Cyberinfrastructure has just released a call for proposals for the creation of A Sustainable Digital Data Preservation and Access Network (DataNet). This is a 100 million dollar round of funding to be distributed to five groups over five years, with a possible extension to the funding following on from that. You can view the proposal here. Peter has created a DataNet group on Nature Network "to share interests, proffered expertise, desires to collaborate, and solicit Q&A regarding the recent NSF solicitation."

After reading through the call for proposals it becomes clear how ambitious the NSF are in their thinking on this. The data under the remit of this call are any entities that can be represented digitally, from media files, to large databases to working code and tools. I've taken the following snippets from the proposal to illustrate where they are heading with this:

'... presents a vision in which “science and engineering digital data are routinely deposited in well-documented form, are regularly and easily consulted and analyzed by specialists and non-specialists alike, are openly accessible while suitably protected, and are reliably preserved.” The goal of this solicitation is to catalyze the development of a system of science and engineering data collections that is open, extensible and evolvable.' and '... to support analyses of data sets whose size or protection needs prevent their being moved to another site for analysis.'

They are concerned that the proposals presented provide solutions for managing digital resources over decades of time, and be able to adapt as technology changes. This proposal touches on so many issues that are of key interest. It asks questions of what the future of our digital heritage should look like, of how publishers should be streaming information into the collective pool of knowledge, how librarians should manage their collections, and what role super-computing centres have to play in all of this. It is increasingly no longer sufficient for information to lie unconnected in one format, be that a journal article, database or a piece of code. Insight comes from the combination of data from diverse sources, and as more of these sources become digitised a unified system for accessing and combining these resources becomes more valuable, indeed, possibly necessary.

I'm really interested to see what proposals will emerge from this, so watch this space. The NSF are going to hold an informational meeting on the 6th of November and preliminary proposals are due in on the 8th of January 08. If you are interested in joining a discussion about this then head over to Peter's Nature Network DataNet group.

I also noticed that Jon Udell has some interesting thoughts about this here in which he suggests that something akin to his idea for lifebits may form part of the solution.

October 01, 2007

Second Nature Event: Bluetongue disease special


This week, the Second Nature events series brings a topical special event on Bluetongue disease.

Hot on the heels of another Foot-and-Mouth disease outbreak, a Bluetongue Disease outbreak was declared in the UK last week. This summer, there have been 3,000 reported cases of Bluetongue in Northern Europe alone. What is Bluetongue? How does it spread, why is it here now and where will it go next? And is it all because of climate change?

Join us for a special session with Professor Philip Mellor from the Institute of Animal Health at Pirbright for a discussion of all the details of Bluetongue, what we can expect from the outbreak and whether global warming is going to result in Bluetongue and other animal diseases becoming the norm.




Title: Bluetongue Disease special
Speaker: Professor Philip Mellor
Location: Second Nature Island
Date: Thursday 4th October
Time: 7am SLT, 10am EST, 2pm GMT, 3pm BST
Contact: Joanna Wombat

More info:
Bluetongue disease and Professor Mellor’s research group

"Nascent Web publishing efforts have their genesis in a burning need to say something, but their ultimate success comes from people wanting to listen, needing to hear each other’s voices, and answering in kind."
Rick Levine
The Cluetrain Manifesto
