IMHO the way we present genomic and proteomic data on the web sucks.
Part of this is down to the fact that our basic visualizations are a bit lame. Not the one-offs you see rendered for magazine covers or science documentaries on TV, but the everyday ones you get in journal articles and the big genome browsers. The ones scientists actually use. Pages like this. Seriously, how much of that page is actually useful to any one researcher? Does anybody wonder why people need training courses to use genome browsers? Does it really make full use of the web as an interactive medium?
This isn’t a dig at Ensembl, incidentally. Ensembl is an excellent, progressive resource and their last application note hinted at cool things to come. It’s more of a general complaint: why are we stuck representing genomic regions with flat images of lines and boxes (with the occasional flag or lollipop)? Can’t anybody think of a better metaphor? Try zooming out on the Ensembl page above. The features that make sense as individual elements when you’re looking at a single gene don’t scale as lines and boxes, and the page becomes a mess.
Speaking of zooming out, why is everything static? Ensembl have already brought in dynamic image loading to some extent, and the GBrowse prototype from Ian Holmes’ lab shows a lot of promise, but we’re still a long way from any Google Maps-style scrollable, zoomable genomes. It should be painless to jump a megabase to the right or left of wherever you’re looking in a genome browser, but right now it isn’t.
When was the last shift in the way that we represented sequence alignments? NCBI’s BLAST just relaunched with a spiffy new interface, but the results still look like they’d be at home in a web browser circa 1990, which coincidentally was when sequence logos were first introduced.
Progress in visualizing biological networks is just as poor. Is there ever a good reason to produce static images of large-scale protein-protein interaction networks, where the text is too small, everything is a mess of connecting lines, and attempts at grouping proteins together are, by necessity, one-dimensional? Some authors think so. Network visualization software is already out there; somebody just needs to spend some time adapting it.
Ben Fry at MIT’s Media Lab – now well known for Processing – produced some great stuff while a grad student, but that was back in 2004. Does anybody know of any other groups or individuals who are currently exploring better ways of looking at life science data?
Perhaps issues of aesthetics and design are a low priority for the scientific community online and it’s more important to just get the data out there for people to use? As time goes on and we accumulate more and more data – while stuck with systems not designed to handle it – that could prove to be a problem.