[Research highlight] Multicellular computers

Elaborate computation tasks can be performed by distributing the work across interconnected elementary information processing units. This principle underlies not only the operation of integrated electronic circuits, but also of many biological processes including development and, of course, the activity of the brain.

In two reports recently published in Nature, Chris Voigt’s lab and the team led by Ricard Solé and Francesc Posas report the construction of synthetic biological circuits performing distributed multicellular computation (Tamsir et al, 2010, Regot et al 2010). In the implementation presented by Tamsir et al, individual E. coli colonies carrying a simple genetic cascade (NOR gate) are interconnected via quorum sensing signaling molecules to perform complex operations (XOR, or EQUAL). Similarly, Regot et al build multicellular circuits (e.g. multiplexer or 1-bit adder with carry) using mating pheromones to chemically ‘wire’ together engineered yeast cells that perform a variety of basic 2-input logical functions.
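To make the idea of composing NOR gates concrete, here is a minimal Python sketch in which each call to `nor` stands in for one colony. The wiring is a textbook composition and is only an assumption here; it is not necessarily the exact layout used by Tamsir et al, and the function names are purely illustrative.

```python
def nor(a: bool, b: bool) -> bool:
    """NOR gate: the output is on only when both inputs are off."""
    return not (a or b)


def intermediate_signals(a: bool, b: bool) -> tuple[bool, bool]:
    """Three NOR 'colonies' wired in two layers (assumed layout)."""
    n1 = nor(a, b)   # on only when neither input is present
    n2 = nor(a, n1)  # on only when b is present without a
    n3 = nor(b, n1)  # on only when a is present without b
    return n2, n3


def xor(a: bool, b: bool) -> bool:
    """XOR: a final OR (buffer) stage merges the two intermediate signals."""
    n2, n3 = intermediate_signals(a, b)
    return n2 or n3


def equal(a: bool, b: bool) -> bool:
    """EQUAL (XNOR): a fourth NOR gate replaces the final OR stage."""
    n2, n3 = intermediate_signals(a, b)
    return nor(n2, n3)


if __name__ == "__main__":
    # Print the full truth table for both composite circuits.
    for a in (False, True):
        for b in (False, True):
            print(int(a), int(b), "XOR:", int(xor(a, b)), "EQUAL:", int(equal(a, b)))
```

The point of the sketch is that swapping a single output stage turns one circuit into the other, which is the kind of combinatorial reuse that compartmentalized gates allow.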

These works show that compartmentalizing elementary synthetic circuits enables combinatorial and flexible assembly of complex circuits and can improve the robustness of the resulting computation. What could be the next step? Looking at the operation of the brain, arguably the most powerful living computing device known, one is tempted to suggest that narrowing the diffusion range of the chemical cell-to-cell transmitter within ‘synthetic synapses’ would facilitate the miniaturization of multicellular computing networks and potentially open the door to scalable designs of arbitrary complexity.


Tamsir A, Tabor JJ, Voigt CA (2010). Robust multicellular computing using genetically encoded NOR gates and chemical ‘wires’. Nature doi: 10.1038/nature09565

Regot S, Macia J, Conde N, Furukawa K, Kjellén J, Peeters T, Hohmann S, de Nadal E, Posas F, Solé R (2010). Distributed biological computation with multicellular engineered networks. Nature doi: 10.1038/nature09679

[Research highlight] Mycoplasma rebooted

As the upshot of a series of four papers published over the last few years (Gibson et al, 2010, Lartigue et al, 2009, Gibson et al, 2008, Lartigue et al, 2007), J. Craig Venter’s team now reports the successful transplantation of a chemically synthesized genome into a host bacterial cell (Gibson et al, 2010). As proof of principle, a slightly altered Mycoplasma mycoides genome (JCVI-syn1.0) was synthesized, assembled and transplanted into M. capricolum recipient cells.

This achievement results from the integration of several techniques developed in previous works: 1) a hierarchical strategy to assemble, via homologous recombination in yeast, a full genome from chemically synthesized overlapping fragments (Gibson et al, 2008); 2) a method to transform a full genome into a host cell and replace the recipient genome by the donor genome (‘transplantation’, Lartigue et al 2007); 3) a method to transplant DNA engineered in yeast into bacteria without being inactivated by the host restriction system (Lartigue et al, 2009). Finally, in the last work, systematic debugging methods were needed to identify a single base pair deletion that prevented productive transplantation (Gibson et al 2010).

The experiment certainly represents a highly symbolic milestone. A fascinating potential of this technology, if generalized and automated, is to enable the introduction of many genomic alterations simultaneously and, thus, to reprogram cellular phenotypes with non-trivial genetic combinations that would have been impossible to identify with a sequential gene-by-gene approach. In this sense, while technically and ‘philosophically’ distinct, Venter’s approach appears complementary to multiplexed mutagenesis technologies that simultaneously introduce multiple modifications in a target genome (Wang et al, 2009).


Gibson DG, Glass JI, Lartigue C, Noskov VN, Chuang RY, Algire MA, Benders GA, Montague MG, Ma L, Moodie MM, Merryman C, Vashee S, Krishnakumar R, Assad-Garcia N, Andrews-Pfannkoch C, Denisova EA, Young L, Qi ZQ, Segall-Shapiro TH, Calvey CH, Parmar PP, Hutchison CA, Smith HO, Venter JC (2010). Creation of a Bacterial Cell Controlled by a Chemically Synthesized Genome. Science advance online publication doi: 10.1126/science.1190719

Lartigue C, Vashee S, Algire MA, Chuang RY, Benders GA, Ma L, Noskov VN, Denisova EA, Gibson DG, Assad-Garcia N, Alperovich N, Thomas DW, Merryman C, Hutchison CA 3rd, Smith HO, Venter JC, Glass JI (2009). Creating bacterial strains from genomes that have been cloned and engineered in yeast. Science 325:1693

Gibson DG, Benders GA, Andrews-Pfannkoch C, Denisova EA, Baden-Tillson H, Zaveri J, Stockwell TB, Brownley A, Thomas DW, Algire MA, Merryman C, Young L, Noskov VN, Glass JI, Venter JC, Hutchison CA 3rd, Smith HO (2008). Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 319:1215

Lartigue C, Glass JI, Alperovich N, Pieper R, Parmar PP, Hutchison CA 3rd, Smith HO, Venter JC (2007). Genome transplantation in bacteria: changing one species to another. Science 317:632

Wang HH, Isaacs FJ, Carr PA, Sun ZZ, Xu G, Forest CR, Church GM (2009). Programming cells by multiplex genome engineering and accelerated evolution. Nature 460:894

Impact Factors 2008

The new Impact Factors 2008 were just released by Thomson Reuters (2008 Journal Citation Reports). We are delighted to announce that Molecular Systems Biology continues its progression, with an Impact Factor 2008 of 12.243.

We address a warm thank you to all our authors and reviewers for this wonderful success, which reflects the current extraordinary dynamism and enthusiasm in the fields of systems biology, synthetic biology and systems medicine!

The limitations of Impact Factors (IF) have been widely discussed. In particular, it might be questionable to use IFs to rank journals with highly variable scopes, audiences and citation patterns. Moreover, article-centered metrics (such as individual citations, number of downloads, highlights in N&V, etc…) might be more appropriate to evaluate the contributions of individual researchers, rather than solely relying on the proxy provided by journal-based citation indexes. Nevertheless, when considering the variation of IF over time for a given journal, the impact of some of the confounding factors mentioned above might be reduced, at least to some extent. To facilitate exploration of the progression of IFs over the last five years, I include at the end of this post a Google Motion Chart to visualize IFs of a (rather subjective) selection of journals related to the fields of molecular and cell biology.

One observation that becomes apparent when toying around with this visualization is that relatively few journals–in this selection!–see their IF rising over a period of 5 years, whereas many seem to be subject to a progressive erosion. This is also visible if one clusters the normalized time profiles, showing that the downward profile (in red) is frequent, at least within the selection used for the Motion Chart below (each curve is the cluster’s center with a thickness proportional to the number of journals in this cluster):

Why is that? It is hard to know. Perhaps, it might reflect some global effects affecting many journals at the same time: proliferation of new journals, changes in the pattern of citations directed to reviews rather than primary research, shift to citations of medically and clinically-oriented journals to highlight the medical relevance of the citing paper, etc… On the more positive side, those journals with upward progression (green curve above) may provide pointers to particularly dynamic fields.

In any case, given the above global trends, we are even happier to open a bottle of Champagne to celebrate and enjoy the moment… 🙂

For an easy start with the exploration of the data, select ‘Impact Factor’ for the Y axis, ‘Time’ for the X axis, color by ‘up vs down’, ‘same size’ in the ‘size’ menu, check a few of your favorite journals (don’t forget to click on Mol Syst Biol!) and check the ‘Trail’ box. Press the ‘play’ button to start the animation. Interesting visualizations are also possible with the bar chart option (Click on second tab on top). See also instructions on the relevant Google Docs help page. Have fun!

Legend:

  • ‘IF’: impact factor
  • ‘IF-IF2004’: the Impact Factor 2004 (or the first available) was subtracted from all the others, to facilitate visualization of the progression
  • ‘up vs do’: +1 if IF2008>IF2004, -1 otherwise
  • ‘cluster #’ & ‘profile type’: 0=undefined because of missing values, 1=profile goes up then down, 2=down then up, 3=down, 4=up
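For those who prefer to recompute the two derived columns themselves, here is a minimal Python sketch. The journal names and numbers are invented for illustration, and using the first available year as baseline for the up/down flag when no IF 2004 exists is my assumption (the legend only states it for the subtraction).

```python
# Hypothetical impact factors by year; names and values are invented.
impact_factors = {
    "Journal A": {2004: 5.0, 2005: 5.5, 2006: 6.25, 2007: 7.0, 2008: 8.5},
    "Journal B": {2004: 10.0, 2005: 9.5, 2006: 9.0, 2007: 8.75, 2008: 8.0},
    "Journal C": {2006: 3.0, 2007: 3.5, 2008: 3.25},  # no IF before 2006
}


def derived_columns(series: dict[int, float]) -> tuple[dict[int, float], int]:
    """Return the baseline-subtracted profile and the up/down flag."""
    first_year = min(series)      # 2004, or the first year available
    baseline = series[first_year]
    # 'IF-IF2004': subtract the baseline from every yearly value.
    shifted = {year: value - baseline for year, value in series.items()}
    # 'up vs down': +1 if the 2008 value exceeds the baseline, -1 otherwise.
    up_vs_down = 1 if series[2008] > baseline else -1
    return shifted, up_vs_down


if __name__ == "__main__":
    for journal, series in impact_factors.items():
        shifted, flag = derived_columns(series)
        print(journal, shifted, flag)
```

Subtracting the baseline puts all journals on a common starting point of zero, which is what makes the progression easy to compare in the chart.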

The end of news, the end of reason

Guest post by Holger Breithaupt, Science & Society Editor, EMBO reports, Heidelberg

Aside from what Waldorf & Statler make of the internet, it is the greatest source of information humanity has ever created; larger than the Vatican Archives, the Library of Congress and all public and university libraries combined. And it’s fast. I don’t have to wait for the news on TV or the daily newspaper to tell me about the US government’s latest reaction to AIG’s bonus payments: the internet, in particular the blogosphere or that latest spawn of it, twittering, gives me real-time news, 24 hours a day. Why then, would we still need news on paper, on TV or on the radio?

Given the power of the internet, there are actually not a few who think that it heralds the demise of the newspaper (Newspapers and Thinking the Unthinkable, Shirky, 2009) and even of journalism (Filling the Void, Nature editorial, 2009). Sure, why bother trying to unfold the New York Times during rush hour in the subway to read a 3500-word feature, if I can download 140-character information tidbits on my iPhone? I don’t even have to buy a newspaper or wait for the 8 pm news in the first place: RSS feeds, search engines, ToC alerts or whatever technology spoon-feed me the newsbits that I’m interested in from the sources that I like.

And that’s exactly the problem. As Nicholas Kristof pointed out, we mainly use the internet to reinforce our prejudices and opinions while it makes it easier for us to ignore contradictory arguments (The Daily Me, Kristof, 2009). I myself plead guilty to such behaviour: while I read and enjoy Frank Rich’s column each week, I shunned William Kristol. On the other hand, while I was reading the newspaper the other day, I stumbled upon an article that explained why paying big bonuses to AIG managers who helped run the company aground is not such a bad idea (The Case for Paying Out Bonuses at A.I.G., Sorkin, 2009); I still disagree, but at least I feel I have a better understanding of the issue.

What is at stake here is our ability to reason, which, as I understand it, means forming your own opinion on a given topic–or maybe even changing it–after listening to the diverse pros and cons. Instead, as Kristof noted, the way we use the internet largely serves to harden our pre-formed beliefs unless we deliberately make the effort of searching and reading the arguments we don’t like to hear. Newspapers, TV and radio and good journalism are the antidote: they provide–if they live up to the task–an overview of arguments and they expose us to topics and opinions that we would just ignore or not even become aware of and thus broaden our horizon. Claiming that they are no longer needed in the brave new world of blogs, social networks and twittering means that we give up an important opportunity to make up our mind.

Keystone Symposium – Omics Meets Cell Biology (II)


Before I carry on with a summary of the second part of the Keystone Symposium ‘Omics Meets Cell Biology’, I should clarify that this post and the previous one dedicated to this conference are not intended to provide a comprehensive account of all the talks but rather to communicate some general (and subjective) impressions of the meeting. To keep these posts reasonably short (and sometimes due to a lack of memory…), I had to omit several of the excellent presentations given at this meeting. The full program and complete list of speakers are available at the Keystone Symposium website.

Many of the presentations given during the second part of the meeting reported findings derived from cell-based high- or medium-throughput functional screens, most of them relying on RNAi-mediated knock-down. Here is an overview of the screens presented during this meeting, illustrating by their diversity in scope and scale the versatility of this method:

Focus | # genes tested | Type | Speaker
autophagy | 21’000? | RNAi | M Lipinski
sensory organ dev. | 20’000 | RNAi | J Mummery-Widmer
cell polarity | 16’000 | RNAi | J Ahringer
imatinib modifiers | 9500 (pooled) | RNAi | D Sabatini
viral entry | 4000 | RNAi | L Pelkmans
cell-cell contacts | 2000 | RNAi | T Pawson
cell migration | 1000 | RNAi | J Brugge
centrosome | 113 | RNAi | L Pelletier
bipolar spindle | 45 | RNAi | R Medema
DNA repair | | RNAi | D Durocher
neuronal differentiation | 700 | TF overexpression | M Snyder
gene-centered TF location | | yeast 1-hybrid library | M Walhout
protein degradation | | reporter library | S Elledge

Perhaps not surprisingly, many speakers emphasized that RNAi screens invariably need to be followed up by time-consuming and tedious validations. The off-target problem in mammalian cell-based RNAi screens also appears to be taken very seriously, and it was reported that 4 to 7 siRNAs directed against the same gene were necessary to reach a good level of confidence. In view of the increasing number of RNAi-based functional screens, standards for the description of such experiments (eg. MIARE, MIACA) are likely to become increasingly useful.

In systems biology, network models are often central to the interpretation of omics data related to molecular interactions, and they make it possible to generate biological insights that differ from those derived from the more classical screening-mechanistic-dissection paradigm. In this regard, Uwe Sauer presented exciting work on the relationship between transcriptional regulatory networks, protein expression and the state of the yeast metabolic network. Using a combination of genetic approaches and drug perturbations, a series of parallel ‘fluxomic’ and metabolomic measurements revealed that metabolic fluxes, in contrast to metabolite concentrations, remain robust to perturbations and are apparently affected only by a handful of transcription factors in a given condition at steady state. At the computational level, the integration of different types of data poses significant challenges. For example, it is far from trivial to find ways to exploit the information contained in interaction networks and integrate it with other types of large-scale molecular measurements. Trey Ideker presented an efficient solution to this problem within the context of microarray profiling of breast cancers and showed that expression data can be combined with information on protein physical interactions to define improved and biologically meaningful pathway-based biomarkers for the classification of metastatic vs non-metastatic tumors.

While superposing parallel datasets leads to a ‘vertical’ integration of networks, Marian Walhout presented an approach to integrate ‘horizontally’ transcriptional and miRNA-dependent regulatory links and map a composite transcription factor/miRNA regulatory network in Caenorhabditis elegans. In this elegant work, the yeast one-hybrid assay was used as a gene-centric screening method to identify regulatory links between hundreds of transcription factors and promoters of both miRNA genes and genes encoding transcription factors. Closing the loop, the network was completed by computationally predicting the transcription factors potentially targeted by miRNAs. Interestingly, the resulting network showed numerous composite motifs including negative feedback loops (TF → miR –| TF), which are otherwise under-represented in pure transcriptional regulatory networks.
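As a rough illustration of what searching for such a composite motif involves, here is a minimal Python sketch that looks for TF → miR –| TF loops in two edge lists. The node names, the edges and the function name are all hypothetical; this is not the analysis pipeline used in the work presented.

```python
# Hypothetical edges: a TF binding the promoter of a miRNA gene (TF -> miR).
tf_binds_mirna_promoter = {
    ("TF1", "miR-a"),
    ("TF2", "miR-b"),
    ("TF3", "miR-a"),
}

# Hypothetical edges: a miRNA predicted to target a TF transcript (miR -| TF).
mirna_targets_tf = {
    ("miR-a", "TF1"),
    ("miR-b", "TF3"),
}


def feedback_loops(tf_to_mir, mir_to_tf):
    """Return (TF, miRNA) pairs where the miRNA targets the TF that regulates it."""
    return sorted(
        (tf, mir) for tf, mir in tf_to_mir if (mir, tf) in mir_to_tf
    )


if __name__ == "__main__":
    print(feedback_loops(tf_binds_mirna_promoter, mirna_targets_tf))
```

In this toy network only TF1 and miR-a close a loop; TF2 → miR-b –| TF3 is a chain rather than a feedback, which is why both edge types need to be mapped on the same set of nodes before such motifs can be counted.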

Completion of network models may require tedious and repetitive work. To the question “who will fill the gaps?”, Steve Oliver replied: “a Robot Scientist”. He showed that an actual implementation of such a robot is able to iteratively use a computational model of the yeast metabolic network to automatically design informative experiments, perform them and use the results to extend the model. In an effort to provide a genome-scale overview of the molecular interactions that underlie regulation of gene expression, Tim Hughes presented a variety of microarray-based technologies to systematically map transcription factor-DNA, nucleosome-DNA and protein-RNA interactions. The latter results were particularly intriguing given that the high-throughput identification of targets of RNA-binding proteins remains a relatively unexplored route and may reveal novel insights into the complexity of post-transcriptional regulation.

To conclude on a somewhat different note, it was also interesting to observe that an increasing number of studies were accompanied by extensive web resources providing access to the respective datasets:

Resource | Lab
PhosphoPep | R Aebersold
Human Protein Atlas | M Uhlen
3Dcomplexes.org | S Teichmann
Nature Cell Migration Gateway | J Brugge
EDGEdb.org | M Walhout
CellCircuits | T Ideker
STRING | C von Mering

This situation underscores the need for a proper infrastructure to host and share (or publish?) large datasets in biology and the central role of web technologies in this regard. In view of the proliferation of biological databases, I wonder whether it might be helpful to have general recommendations on some minimal requirements for this type of database (eg. type of searching, visualization, data integration functionalities, existence of (web) APIs, download of datasets, possibility to integrate external datasets, etc…)? Or would perhaps something like a ‘Minimum Information About a Biological Database’ be useful to specify the capabilities of databases? One may also dream that these databases will become progressively interoperable and eventually include web-based APIs facilitating programmatic access to the information stored, ultimately sending Omics in the Cloud.


And, oh yes, the slopes were very nice, even though, I have to admit, the air was thin and a little fresh…

Keystone Symposium – Omics Meets Cell Biology (I)

At the Keystone Symposium ‘OMICS Meets Cell Biology’, held this week in Breckenridge, Colorado, attendees initially had to face two major challenges: the first was to survive the cocktail mixing jet lag and altitude sickness and the second one (oh, it hurts!) was to resist the temptation to just forget all about science and focus exclusively on the concepts revolving around snow, slopes and fun sports…

In any case, those who survived this harsh test were highly rewarded by attending an extremely exciting meeting, organized by Ruedi Aebersold and Tony Pawson, showcasing the impact of genome-wide and high-throughput technologies, the so-called ‘omics’, in cell biology.

After the first two days of the meeting, dedicated to ‘cell signaling’ and ‘sub-cellular organization’, a series of impressive talks had already delivered a clear and strong message: beyond generating comprehensive ‘part lists’, omics data lead to important and novel biological insights when integrated with functional and phenotypic data and when applied in experiments addressing well defined aspects of the biology of the system under study. This was particularly well illustrated in the talks dedicated to signaling, which all reported on analyses of well defined systems: ephrin-Eph receptor bidirectional signaling in cell-cell contact (T. Pawson), insulin signaling and growth regulation (E. Hafen), Notch signaling and sensory organ development (J. Mummery-Widmer), cytokines and hepatotoxicity (B. Cosgrove), Rho signaling & cell migration (C. Bakal).

I have the feeling that this transition from descriptive catalogs to functional and mechanistic insights can be envisioned as the result, at least in part, of two series of developments:

First, experimental design is evolving and an increasing number of projects combine and integrate functional readouts with genetic approaches and high-throughput molecular measurements. For example, Tony Pawson illustrated how the integration of quantitative (SILAC) proteomics, phenotypic siRNA screens and protein complex identification could shed light on the components and mechanisms involved in ephrin-Eph receptor bidirectional signaling and their impact on cell-cell contacts. A combination of quantitative proteomics and genetic approaches was illustrated by Ruedi Aebersold, whose lab is charting a comprehensive kinase-substrate network in yeast by systematically performing quantitative proteomics on deletion mutants of all kinases and phosphatases. Other experiments link systematic perturbations and molecular measurements even more intimately, by design, to phenotypic outcome. Ben Cosgrove presented such work in the context of the study of drug hepatotoxicity. Systematic measurements of the phosphorylation status of 17 signaling proteins and monitoring of cell death rates were performed in HepG2 cells under a variety of cytokine stimulation conditions. Multivariate statistical analysis then made it possible to construct correlative models, which not only have predictive power but also reveal key players in the process and provide insight into how signaling components contribute to the phenotypic outcome. The power of data integration was also beautifully demonstrated in the work of Jennifer Mummery-Widmer, who performed genome-wide and tissue-specific RNAi screens in Drosophila to identify modifiers of the Notch signaling pathway. Integration of the genes identified in the screen with a map of known genetic and physical interactions resulted in a network model whose predictive power was exploited to identify and validate in vivo novel regulators of Notch signaling.

Second, the technological platforms are maturing, data quality is increasing and protocols are streamlined, making these technologies progressively more accessible. This might be particularly relevant for mass spectrometry proteomic approaches, which were omnipresent in the signaling talks. One of the consequences of a relative and progressive ‘democratization’ of MS proteomics platforms is that their application is no longer necessarily restricted to an initial exploratory phase traditionally aimed at providing an unbiased view of a particular system, but can now also be engaged in follow-up, often more focused, investigations to gain deeper mechanistic insights. An example of this was provided by Ernst Hafen, who presented his work on growth regulation in Drosophila and showed data on a genome-wide and tissue-specific mutagenesis screen aimed at the identification of modifiers of growth regulation. Selected hits of the screen were then analyzed further in time course experiments upon insulin stimulation, and mass spectrometry identification of TAP co-immunoprecipitated protein complexes could reveal the nature and dynamics of signaling complex assembly. One can thus predict that further development of optimized omics technologies for targeted follow-up experimentation will have a profound impact in molecular and cell biology.

Mass spectrometry based proteomics was clearly one of the predominant platforms in many of the studies presented during the sessions devoted to signaling. It was therefore particularly fascinating to listen to Mathias Uhlen, who emphasized the need for complementary approaches based on affinity probes and presented foundational work towards antibody-based proteomics. The scale of this work is such that it is hardly possible to summarize it in just a few sentences. Fortunately, the resource resulting from this enormous effort can be consulted directly online at the Human Protein Atlas portal. I will only add that Mathias Uhlen estimated that this resource will be able to provide quality controlled antibodies for 50% of human proteins within the coming years and that a first draft of the complete human proteome might be ready around 2014!

Beyond omics based on high-throughput measurements at the molecular level, one very exciting development is the application of imaging techniques for automated measurements of cellular and cytological parameters. Lucas Pelkmans showed that measurements of local cellular features (eg nucleus size, local density, mitotic stage, cell edges etc…) at the single cell level could be correlated to various cellular activities such as viral entry, clathrin distribution etc… He insisted that accounting for such local population parameters may have considerable implications for the interpretation of siRNA screens given the unavoidable heterogeneity of cellular populations. This strategy was then applied in the context of a large-scale siRNA screen for modifiers of viral entry performed on 8 different viruses. Cluster analysis of the resulting hits beautifully revealed a hierarchical ‘functional phylogenetic’ tree of the various viruses according to the subset of cellular activities required for their entry. This information could in turn be used for the identification of a novel regulatory mechanism of viral entry essential for most of the viruses tested.

The role of neutral mutations in the evolution of phenotypes

Research highlight by Pedro Beltrao, University of California, San Francisco

In a recent opinion piece, Andreas Wagner tries to reconcile the tension between proponents of neutral evolution and selectionism (Wagner 2008). He argues that “neutral mutations prepare the ground for later evolutionary innovation”. Wagner illustrates this point using a network model of genotype-phenotype relationships (Wagner 2005). In a so-called ‘neutral network’, nodes correspond to distinct genotypes associated with the same phenotype and are connected by an edge if the respective genotypes differ only by a single mutation event (eg point mutation). Examples of neutral networks include different genotypes coding for RNA or protein structures. In this representation, highly connected networks correspond to robust phenotypes that are not very sensitive to changes in genotype. Wagner notes the zinc finger fold as an impressive example of a highly connected neutral network as its structure remains essentially the same even after mutating all but seven of its 26 residues to alanine.
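The definition of a neutral network can be made concrete with a toy Python sketch. The two-letter alphabet, the genotype length and above all the `phenotype` rule are invented for illustration; real genotype-phenotype maps (eg RNA folding) are far richer.

```python
from itertools import combinations, product


def phenotype(genotype: str) -> str:
    """Toy genotype-phenotype map (an assumption, not a biological model)."""
    return "folded" if genotype.count("A") >= 2 else "unfolded"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def neutral_network(target: str, alphabet: str = "AB", length: int = 3):
    """Nodes: genotypes with the target phenotype. Edges: single point mutations."""
    nodes = [
        "".join(letters)
        for letters in product(alphabet, repeat=length)
        if phenotype("".join(letters)) == target
    ]
    edges = [(a, b) for a, b in combinations(nodes, 2) if hamming(a, b) == 1]
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = neutral_network("folded")
    print("genotypes:", nodes)
    print("single-mutation links:", edges)
```

In this toy case the ‘folded’ network has four genotypes, all reachable from AAA by a single mutation; the more such links a phenotype has, the more robust it is to mutation in the sense described above.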

Using this model, Wagner describes how highly robust phenotypes can lead to faster exploration of the genotype space. He further proposes that evolution of innovation occurs via cycles of exploration of nearly neutral spaces (dubbed neutralist regime) followed by a reduction in diversity once a new phenotype of higher fitness is discovered (selectionist regime).

Although these ideas were mostly developed using models of sequence-to-structure relationships, Wagner cites several examples suggesting that these concepts are equally valid for cellular phenotypes that depend on molecular interactions (e.g. gene expression patterns).

As Wagner points out, in order to understand the evolution of innovation we must fully understand the mapping from genotypes to phenotypes. This is why it is important to continue to develop richer evolutionary models to link changes at the DNA level with changes in molecular structures, interactions and ultimately phenotypes with a quantifiable impact on fitness. This is an area where systems biology should play an important role.

Models of RNA and protein structure stability upon mutation have existed for some time now (Hofacker et al. 1994, Guerois et al. 2002). More recently, the study of large amounts of genomic information and/or systematic interaction studies have been providing us with accurate models for different types of molecular interactions (Berger et al. 2008, Burger & van Nimwegen 2008, Chen et al. 2008). In parallel, theoretical analysis has been used to aid in the understanding of cellular phenotypes (e.g. cell cycle, signaling pathways etc) (Tyson et al. 2003). Connecting these different layers of abstraction is an important challenge that will allow us to better understand the origins of biological innovation.


Berger MF et al. (2008). Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences. Cell 133:1266-76

Burger L & van Nimwegen E (2008). Accurate prediction of protein-protein interactions from sequence alignments using a Bayesian method. Mol Syst Biol 4:165

Chen JR et al. (2008). Predicting PDZ domain-peptide interactions from primary sequences. Nat Biotechnol 26:1041-5

Guerois R, Nielsen JE & Serrano L (2002). Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369-87

Hofacker IL et al. (1994). Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie / Chemical Monthly 125:167-188

Tyson JJ, Chen KC & Novak B (2003). Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr Opin Cell Biol 15:221-31

Wagner A (2005). Robustness and Evolvability in Living Systems. Princeton University Press

Wagner A (2008). Neutralism and selectionism: a network-based reconciliation. Nat Rev Genet 9:965-974

SciFoo: scientific fireworks

In his list of eight ‘generative’ values (Better Than Free), Kevin Kelly includes ‘embodiment’–the actual physical realization of an item or event which could be otherwise freely distributed over the web. While we are all ‘hyperlinked’ on the Internet, the value of those unique qualities that cannot be generated or “copied” on the web is dramatically increased. The type of intense emulation and shared excitement sparked at the recent Science Foo Camp (SciFoo 2008), organized by Nature, Google and O’Reilly, gave a wonderful example of the unique value of direct human exchange during an exclusive event bringing together roughly 200 top scientists, ‘geeks’ and other technologists at the Googleplex in Mountain View, California.

SciFoo is a so-called ‘unconference’: there is no program or more precisely, as Timo Hannay explained during the opening of the conference, the attendees are the ‘program’. The actual schedule was defined only on the first evening in a purposefully chaotic process by anyone who wished to organize a session on any topic. For the next two days, in a festival of parallel sessions, astrophysicists, ‘googlers’, technologists, molecular biologists, taxonomists, game designers, flying car constructors, publishers, thinkers and (some) dreamers discussed and exchanged ideas with great enthusiasm and a rare intensity and openness.

Needless to say, deciding which session to attend was close to impossible… In any case, I ended up following three types of talks: a series on systems biology related topics (data integration, machine learning, personal genomics, baroque structure of the transcribed genome), several (of many) sessions focused on the theme of open data/science and finally some more eclectic sessions (only from my standpoint, of course) on topics as diverse as the foundations of the concept of time in physics, a demonstration of very simple yet powerful Python scripting exercises to analyze text and the potential of game design to harness our ‘cognitive surplus’. I cannot possibly summarize all the talks, interactions and impressions gathered at this meeting, but here are a few subjective excerpts:

  • There were quite a few sessions on open science and open data. Ernst Hafen made a strong case for the need of a unique AuthorID that would help in tracking the multiple aspects of researchers’ scientific activities. With regard to data, Google announced that a new service will soon be launched, Google Research Datasets, offering to host, for free, large datasets of any type. The service will allow inclusion of some minimal meta-data about the submitted datasets and will provide a mechanism to define a delay before the dataset is made publicly visible. This will probably become a very simple and convenient way for storing data (in particular if a useful API is developed), so convenient in fact, that we may have to be a little careful that it will not turn into a temptation to bypass the ‘minimal information…’ standards usually required by traditional public databases.
  • George Church provided an overview of the Personal Genome Project (PGP) and described the types of biological data that will be integrated with the genomic and genetic information collected from consenting PGP volunteers: analysis of the transcriptome of pluripotent stem cells derived from the subjects; sequence of the repertoire of recombined V-D-J regions in immune cells (‘VDJome’) to exploit correlations between given V-D-J sequences and antigen-specific stimulations; characterization of the microbiome used as a tracer of the environmental and physiological conditions; record of phenotypic traits and disease conditions using controlled vocabularies. Finally, George also emphasized the exponentially decreasing cost of sequencing, which will not only make large-scale sequencing of full personal genomes feasible but will also potentially open entirely new fields of applications based on massive DNA sequencing.
  • Lee Smolin talked about the nature of the concept of time in physics and investigated the question of whether our perception of time as the ‘experience of successive present moments’ is ‘real’ or, alternatively, an emergent property of the laws of physics. I cannot pretend I followed the entire argument, but I learned that the mathematical representation of physical reality involves the geometrization of time (as one of the state space’s dimensions), leading in fact to a representation devoid of temporal flow (somehow the clock has to be outside the system). Physical laws are then associated with this geometrical representation and applied to initial conditions. If I did not misunderstand it, this approach used in physics might have to be considered approximate, because it may be valid only for subsystems of the universe and might not be appropriate for a true cosmological theory of the entire universe, with possibly disturbing consequences for the nature of physical laws…
  • Believe it or not, music can be ‘geekified’ as well: Chris DiBona, later in the evening, brought his tenori-on for a fun demonstration. I want one of those!

The meeting ended with some final scientific fireworks, when some of the speakers gave a series of brilliant 2-minute summary talks, providing a colorful overview of the many sessions we had inevitably missed. I have to admit that I like fireworks, and I would certainly have enjoyed a little more of this final kaleidoscopic view of science. Clearly, the authentic value of this conference lies in the unique and direct human interactions, but I nevertheless wish there were some way, perhaps by using this last session in some form of outreach action, to disseminate this pure joy of scientific diversity and curiosity to a broader audience.

Credits: illustrations from Bob Lee, Flickr, some rights reserved

Soon Sci Foo!

A last very quick post before going on vacation (Swiss Alps…). In two weeks I will have the great privilege of attending the mythic SciFoo ‘un-conference’ at the Googleplex in Mountain View, California. Many ideas for exciting sessions are already circulating. I would just like to add my support to Cameron Neylon’s proposal for a discussion around the issue of building a ‘Science Data Commons’. The availability and ‘integrability’ of scientific data are probably among the major challenges in scientific communication and, obviously, I would be excited to see whether the discussions at Sci Foo produce some ideas on how scientific journals can take concrete and pragmatic steps to help make scientific data readily available in a useful form.