Long Tails: just how we roll
(interested in this kind of thing? Also check out Christina's forthcoming talk, via Bora)
If we assume that people blogroll sites that they think (a) are good and (b) are relevant to their audience, then it seems fair to also assume that, by aggregating and analyzing blogrolls from blogs tagged in Nature.com Blogs (did I mention Nature.com Blogs already? Did you sign up?), we can come up with some sort of 'top ten blogs' in each area.
I wrote some scripts to do the relevant scraping. Here are the results. Note that ranks can be tied - The Loom and Aetiology share fourth place, for example:
Across all subject areas
| Blog | Rank |
| Pharyngula | 1 |
| The Panda's Thumb | 2 |
| RealClimate | 3 |
| The Loom | 4 |
| Aetiology | 4 |
| A Blog Around The Clock | 5 |
| Cosmic Variance | 6 |
| Adventures in Ethics and Science | 7 |
| Respectful Insolence | 8 |
| The Intersection | 8 |
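The tied ranks in the table above (two blogs at 4, then the next at 5) are consistent with what is usually called dense ranking. Here is a minimal sketch of that idea, assuming each blogroll has already been reduced to a set of blog names. The function name and the sample data are made up for illustration, and this is not the actual script used for the tables.

```python
from collections import Counter


def dense_rank(blogrolls):
    """Rank blogs by how many blogrolls they appear on, allowing ties.

    `blogrolls` is an iterable of collections of blog names (hypothetical
    input format). Blogs with the same count share a rank, and the next
    distinct count gets the next integer (1, 2, 2, 3, ...).
    """
    # Count each blog at most once per blogroll.
    counts = Counter(blog for roll in blogrolls for blog in set(roll))
    # Map each distinct count to a rank, highest count first.
    distinct_counts = sorted(set(counts.values()), reverse=True)
    rank_of = {count: i + 1 for i, count in enumerate(distinct_counts)}
    # Sort by rank, then alphabetically so tied blogs have a stable order.
    return sorted(
        ((blog, rank_of[n]) for blog, n in counts.items()),
        key=lambda pair: (pair[1], pair[0]),
    )


if __name__ == "__main__":
    # Made-up blogrolls, purely for illustration.
    sample = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"D"}]
    for blog, rank in dense_rank(sample):
        print(f"| {blog} | {rank} |")
```

Counting each blog once per blogroll means a blog listed twice on the same page does not get extra credit.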
I left out the raw numbers for now (at this stage it's just an experiment) but can tell you that Pharyngula and The Panda's Thumb are way ahead of the competition... they're on twice as many blogrolls as RealClimate (also a very popular choice).
It all looks very Long Tail-ish...

... that is to say that there are a very small number of blogs on lots of blogrolls and a very large number of blogs on few blogrolls.
We've been tracking links between science blogs since Nature.com Blogs launched but there isn't enough data to do a proper comparison between blogroll popularity and incoming link popularity yet. I'd be interested to see what kind of correlation there is: do you link more often to the blogs on your blogroll? Are there some blogs that you add because you feel you should? Does much reciprocal blogrolling go on (almost certainly, I'd guess)? Is there a blogrolling equivalent to prominently displaying an unread copy of Ulysses on your bookshelf even though all you read is Stephen King?
(tables for individual subject areas are below the fold)
Chemistry
| Blog | Rank |
| Molecule of the Day | 1 |
| Useful Chemistry | 1 |
| Developing Intelligence | 2 |
| The Sceptical Chymist | 2 |
| Genomicron | 2 |
| easternblot.net | 2 |
| Stoat | 3 |
| petermr's blog | 4 |
| The Culture of Chemistry | 4 |
| chem-bla-ics | 4 |
Life Sciences
Physics
| Blog | Rank |
| Cosmic Variance | 1 |
| Bad Astronomy | 2 |
| Uncertain Principles | 3 |
| Biocurious | 3 |
| Good Math, Bad Math | 4 |
| Cocktail Party Physics | 4 |
| Angry Physics | 5 |
| Shtetl-Optimized | 5 |
| Backreaction | 6 |
| A Photon In The Darkness | 7 |
Bioinformatics

Comments
Hi, Euan. What do you mean when you say, "blogrolls from blogs tagged in Nature.com Blogs"? You mean you went to each Nature.com Blog, looked at the blogs in each one's blogroll, went to each of *those* blogs, and then tabulated the blogs listed in *their* blogrolls?
Posted by: Amos Kenigsberg | December 5, 2008 11:37 AM
Hey Amos,
Sorry, should have been a bit clearer about the methodology, which was basically:
1) download list of blogs (and their subject areas) from Nature Blogs
2) visit each one, pull out its blogroll (by downloading half a dozen posts published on different dates, working out which part of the pages never changed and then pulling out links from there)
3) filter out non-blogs and blogs not listed in the Nature Blogs index
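Step 2 can be approximated roughly like this. It is a simplified sketch, not the real script: instead of working out which region of the page never changes, it just keeps the links that show up on every downloaded post page. The class and function names are hypothetical, and the pages are assumed to be already fetched as HTML strings.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def links_in(html):
    """Return the set of link targets found in one HTML page."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def stable_links(pages):
    """Return links present on every page (sidebar candidates).

    Links that appear in only some pages are assumed to belong to the
    post body rather than to a fixed blogroll.
    """
    link_sets = [links_in(page) for page in pages]
    if not link_sets:
        return set()
    return set.intersection(*link_sets)
```

The result still includes navigation and archive links, which is why a filtering pass like step 3 is needed afterwards. A blogroll whose contents change between page loads would be missed by this approach.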
There was quite a lot of munging of URIs involved (so that scienceblogs.com/ matched www.scienceblogs.com , for example).
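To give a flavour of that munging, here is a minimal sketch. The normalization rules are assumptions chosen for illustration (drop the scheme, lowercase the host, strip a leading "www." and any trailing slash), the function name is hypothetical, and the real scripts may well have done more or less than this.

```python
from urllib.parse import urlsplit


def normalize_blog_uri(uri):
    """Reduce a blog URI to a comparable key such as 'scienceblogs.com/foo'."""
    # urlsplit only recognises the host if a scheme is present,
    # so add one for bare entries like "scienceblogs.com/".
    if "://" not in uri:
        uri = "http://" + uri
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Ignore trailing slashes so "/blog/" and "/blog" compare equal.
    path = parts.path.rstrip("/")
    return host + path


if __name__ == "__main__":
    # The two forms from the example above should produce the same key.
    print(normalize_blog_uri("scienceblogs.com/") == normalize_blog_uri("www.scienceblogs.com"))
```

Two URIs are then treated as the same blog when their normalized keys are equal.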
Posted by: Euan | December 5, 2008 11:44 AM
Nice.
But Neurotopia and Of Two Minds are not physics blogs, they are neuroscience blogs (more life science than anything else).
Posted by: Coturnix | December 5, 2008 06:13 PM
There may be a serious flaw in your methodology. If you search for parts of the page that do not change you will never include any rotating blogroll. My blogroll includes about 300 sites, but displays only 50 at a time.
Posted by: Greg Laden | December 6, 2008 02:05 PM
Can I claim my blogs yet?
Is this affected by the fall in popularity of the blogroll (static) in favour of imported widgets for RSS readers (now standard on Blogger (Google) blogs but not, sadly, on Typepad as yet)? I think bloggers and others are tending nowadays to subscribe and unsubscribe to blogs in their RSS readers and not bothering to update their static blogrolls. In the "old days of blogging", blogrolls were ways to get bloggers to refer to each other, etc -- but now, many conversations go on separately at FriendFeed, Twitter, Nature Network, etc, as well as at the blog itself.
Must be hard to capture a lot of this ephemeral activity for analytical purposes.
Posted by: Maxine | December 6, 2008 04:24 PM
Another point: Nature Network blogs don't even have blogrolls. Will be interesting to see what happens to your analysis when they do. But for the moment, they are presumably disproportionately excluded from a blogroll-based analysis such as this?
Posted by: Maxine | December 6, 2008 04:40 PM
Maxine: there's nothing to stop Network blogs being blogrolled by others which would make them show up here - OTOH I bet you're right in that you're far less likely to be blogrolled if you don't maintain a blogroll yourself.
Greg: you're right, the methodology definitely can't capture rotating blogrolls. I don't think it's too serious a flaw if the numbers of non-rotating blogrolls are big enough but it's a good point and something that would definitely be worth tackling properly if we wanted to use ranks like these for anything formal. It'd be dumb to not use all the data available.
What Bora mentioned is definitely a serious flaw: those aren't physics blogs! Must have a tag gremlin... will correct tomorrow.
Posted by: Euan | December 7, 2008 01:09 PM