HUGO: The language of our genes
Ask people what they associate with Finland and the one thing everyone mentions is they’ve heard the language is fiendishly difficult. It certainly is--especially if, like me, you are trying to get an uncooperative (and non-English-speaking) ticket machine to sell you a train ticket back to your hotel after a long day at the Human Genome Meeting in Helsinki. Still, the experience does make me sympathise with the scientists who are trying to decode the information encrypted in our DNA. Because if my Berlitz pocket phrasebook is anything to go by, it seems as though the human genome is written in Finnish.
I can normally get by in most parts of the EU even if I don’t speak the local language thanks to my collection of English, French and German vocabulary--I can often recognise similar words here or there. This, Berlitz helpfully explains, is because the languages in most of these places are part of the large Indo-European group of languages. So the fact that the ticket machine also spoke Swedish meant I could work out that the wretched thing was trying to sell me some sort of monthly season ticket, even though none of its Finnish offerings were listed in my book.
Finnish, on the other hand, belongs to the small Finno-Ugrian group, which includes Estonian. Unlike the Indo-European languages, which emphasise word order to indicate grammatical relationships, Finnish relies on a rich collection of word suffixes to indicate meaning. This means its grammar is incredibly complex. There are at least a dozen different cases (Berlitz flatly refuses to tell me about the other 3, perhaps in case I take fright) and rules about vowel harmony and inflexion mean that the suffixes are not just tacked on mechanically to words. For example, adding the suffix “mme” to “auto” gives “automme”, meaning “our car”.
Something similar seems to be happening in the genome too. The order of genes on a chromosome seems to matter less than their context and the DNA suffixes that surround them: elements that flag the start of a gene and help control its activity, enhancers and repressors that likewise influence expression, and not forgetting the epigenetic codes that add meaning and depth to the sequence of genetic letters. And given that, unlike Finnish, there is no helpful Berlitz guide to help us read it, understanding the full range of expression and shades of meaning in the human genome is going to take some doing.
Which kind of puts my little troubles back into perspective. And anyway, it turns out I was shouting at the wrong machine--just across the hall was one that spoke English and dispensed my ticket and sent me on my way--to get ready for a new day of talks on how the great translation effort is going.

Comments
Interesting parallel. Actually, this behaviour of genes does not surprise me. It is a very good way to compress information in a small packet. As language does.
Just one thing, though: many Indo-European languages, like Lithuanian and German, also rely on word suffixes to indicate meaning, but to a lesser degree. And if you think Finnish is complex with its dozen suffixes, you should try Magyar (Hungarian), which has more than twenty.
Posted by: Marc Andre Belanger | June 2, 2006 01:30 PM
"One gene, one name" is an important concept for accurate communication in genetic research. The HUGO Gene Nomenclature Committee (HGNC) has so far been responsible for naming one third of the genes estimated from the draft of the human genome sequence.
Posted by: Scott Brison | December 14, 2006 08:17 AM
Very good and useful article. Thank you very much!
Posted by: Anna Kowalska | July 7, 2007 10:52 PM
I know what you mean but unfortunately it is not useful for me
Posted by: Luxury | December 11, 2007 10:50 AM